Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

[Unreleased]

[0.1.0] - 2024-12-XX

Added

  • Initial release of rouge-torch
  • Fully vectorized PyTorch implementation of ROUGE scores
  • Support for ROUGE-1, ROUGE-2, and ROUGE-L metrics
  • Differentiable loss function for neural network training
  • GPU acceleration and batch processing
  • Comprehensive test suite with 14 test cases
  • Overfit validation test demonstrating convergence to zero loss
  • Memory-efficient algorithms for long sequences
  • Approximation methods for very long sequences (>100 tokens)
  • Multiple reference support
  • Configurable reduction modes (mean, sum, none)
  • Type hints and comprehensive documentation
  • Example utilities for tokenization and text-to-tensor conversion
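
For orientation, the ROUGE-N metrics listed above can be sketched in plain Python as a clipped n-gram overlap F1 over token IDs. This is a reference sketch only, not the vectorized PyTorch code shipped in rouge-torch; the names `ngram_counts` and `rouge_n_f1` are hypothetical and chosen for illustration.

```python
from collections import Counter
from typing import Sequence


def ngram_counts(tokens: Sequence[int], n: int) -> Counter:
    """Count the n-grams of a token-ID sequence (empty if it is shorter than n)."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(candidate: Sequence[int], reference: Sequence[int], n: int) -> float:
    """ROUGE-N F1 computed from clipped n-gram overlap."""
    cand = ngram_counts(candidate, n)
    ref = ngram_counts(reference, n)
    # Multiset intersection keeps the minimum count of each shared n-gram,
    # which is the usual "clipping" step.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(rouge_n_f1([1, 2, 3, 4], [1, 2, 5, 6], n=1))  # ROUGE-1
    print(rouge_n_f1([1, 2, 3, 4], [1, 2, 5, 6], n=2))  # ROUGE-2
```

A vectorized implementation would typically replace the `Counter` arithmetic with batched tensor operations; the sketch only pins down what is being computed.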

Key Features

  • Performance: Fully vectorized operations with no Python loops
  • Loss Bounds: Loss is bounded in [0, N], where N is the number of ROUGE types being computed
  • Flexibility: Works with any tokenizer and vocabulary size
  • Validation: Extensively tested including overfit convergence validation
  • Documentation: Comprehensive README with usage examples
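
One way to read the [0, N] loss bound is as a sum of (1 - F1) over the selected ROUGE types, since each F1 score lies in [0, 1]. The sketch below assumes that formulation, which this changelog does not spell out; `rouge_loss` and its `reduction` argument are hypothetical stand-ins for illustration, not the library's API.

```python
from typing import Dict, List, Union


def rouge_loss(
    scores: List[Dict[str, float]], reduction: str = "mean"
) -> Union[float, List[float]]:
    """Turn per-sample F1 scores into a loss (assumed form: sum of 1 - F1).

    `scores` holds one dict per sample, mapping a ROUGE type to its F1 in [0, 1].
    With N types, each per-sample loss therefore lies in [0, N].
    """
    per_sample = [sum(1.0 - f1 for f1 in sample.values()) for sample in scores]
    if reduction == "none":
        return per_sample
    if reduction == "sum":
        return sum(per_sample)
    if reduction == "mean":
        return sum(per_sample) / len(per_sample)
    raise ValueError(f"unknown reduction: {reduction!r}")


if __name__ == "__main__":
    perfect = {"rouge1": 1.0, "rouge2": 1.0, "rougeL": 1.0}
    disjoint = {"rouge1": 0.0, "rouge2": 0.0, "rougeL": 0.0}
    print(rouge_loss([perfect, disjoint], reduction="none"))
```

Under this assumption, a perfect match gives a loss of 0 and a candidate with no overlap gives a loss of N, which is consistent with the overfit test converging to zero loss.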

Technical Details

  • Requires Python 3.8+
  • Requires PyTorch 1.9+
  • Memory complexity: O(batch_size × seq_len²) for ROUGE-L
  • Time complexity: O(batch_size × seq_len) for ROUGE-N, O(batch_size × seq_len²) for ROUGE-L
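
The quadratic terms for ROUGE-L come from the longest common subsequence (LCS) table, which has one cell per pair of candidate and reference positions. The following plain-Python sketch of the textbook dynamic program shows where that cost arises; it is not the batched PyTorch implementation in rouge-torch, and `lcs_length` is a hypothetical name.

```python
from typing import Sequence


def lcs_length(a: Sequence[int], b: Sequence[int]) -> int:
    """Length of the longest common subsequence of two token-ID sequences.

    The table has (len(a) + 1) x (len(b) + 1) cells, so both time and memory
    grow with the product of the two sequence lengths.
    """
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, start=1):
        for j, y in enumerate(b, start=1):
            if x == y:
                # Extend the best subsequence ending before both tokens.
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                # Otherwise carry over the better of dropping one token.
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[len(a)][len(b)]


if __name__ == "__main__":
    print(lcs_length([1, 2, 3, 4, 5], [2, 4, 5, 6]))
```

Keeping only two rows of the table would reduce memory for a single pair to linear in the sequence length, at the cost of no longer having the full table available.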