Commit 9e3c75c
feat: implement production-ready Informer and Chronos transformer models
InformerModel (Zhou et al., AAAI 2021):
- ProbSparse self-attention with O(L log L) complexity via top-k query selection
- Self-attention distilling with 1D convolution + ELU + max pooling
- Multi-head attention with Q/K/V projections and scaled dot-product
- Sinusoidal positional encoding for temporal awareness
- Pre-norm layer normalization architecture with GELU activation
- Generative decoder with learnable start tokens for single-pass forecasting
- Feed-forward networks with 4x expansion ratio
- Comprehensive educational XML documentation explaining concepts
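As a rough illustration of the ProbSparse idea listed above, the following Python sketch ranks queries by a max-minus-mean sparsity score, runs full softmax attention only for the top-u queries, and gives every other query the mean of the values. It is not the code from this commit: the names `probsparse_attention` and `factor` are hypothetical, and the sketch scores every query against every key, so it does not reach O(L log L) on its own (the paper samples keys for that step).

```python
import math


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def probsparse_attention(queries, keys, values, factor=1):
    """Full attention for the top-u 'active' queries, mean(V) for the rest.

    `factor` is a hypothetical knob for u = factor * ceil(ln L_q).
    """
    n_q, d = len(queries), len(queries[0])
    n_v, d_v = len(values), len(values[0])
    scale = 1.0 / math.sqrt(d)

    # Scaled dot-product scores for every query/key pair.
    # Simplification: all keys are used here, which costs O(L^2).
    scores = [[dot(q, k) * scale for k in keys] for q in queries]

    # Sparsity measure per query: max score minus mean score.
    sparsity = [max(row) - sum(row) / len(row) for row in scores]

    # Number of active queries grows logarithmically with sequence length.
    u = min(n_q, max(1, factor * math.ceil(math.log(n_q))))
    ranked = sorted(range(n_q), key=lambda i: sparsity[i], reverse=True)
    active = set(ranked[:u])

    # "Lazy" queries fall back to the average of all value vectors.
    mean_v = [sum(v[j] for v in values) / n_v for j in range(d_v)]

    out = []
    for i in range(n_q):
        if i in active:
            w = softmax(scores[i])
            out.append([sum(w[t] * values[t][j] for t in range(n_v))
                        for j in range(d_v)])
        else:
            out.append(list(mean_v))
    return out, sorted(active)
```

A query whose scores are nearly uniform has a sparsity score close to zero, so replacing its output with the mean of the values loses little information. That is the intuition behind skipping it.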
ChronosFoundationModel (Ansari et al., 2024):
- Mean-scaling tokenization for scale-invariant value representation
- Causal multi-head self-attention (each position is masked from attending to later positions)
- Pre-norm transformer layers with Q/K/V projections
- Feed-forward networks with GELU activation
- Sinusoidal positional encoding
- Temperature-based sampling for probabilistic forecasts
- Comprehensive educational XML documentation
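The mean-scaling tokenization and temperature-based sampling items above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code in this commit: the function names, the bin count (11), and the bin range ([-5, 5]) are hypothetical choices made to keep the example small.

```python
import math
import random


def mean_scale(context):
    """Divide a series by its mean absolute value; return (scaled, scale)."""
    scale = sum(abs(x) for x in context) / len(context)
    if scale == 0:
        scale = 1.0  # avoid division by zero for an all-zero context
    return [x / scale for x in context], scale


def quantize(scaled, n_bins=11, low=-5.0, high=5.0):
    """Map scaled values to integer token ids on a uniform grid (assumed)."""
    width = (high - low) / (n_bins - 1)
    tokens = []
    for x in scaled:
        clipped = min(max(x, low), high)
        tokens.append(round((clipped - low) / width))
    return tokens


def dequantize(tokens, scale, n_bins=11, low=-5.0, high=5.0):
    """Map token ids back to bin centres and undo the mean scaling."""
    width = (high - low) / (n_bins - 1)
    return [(low + t * width) * scale for t in tokens]


def sample_token(logits, temperature=1.0, rng=random):
    """Sample a token id from softmax(logits / temperature)."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    weights = [math.exp(s - m) for s in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]
```

Because tokens are computed after dividing by the mean absolute value, the series `[1, 2, 3]` and `[100, 200, 300]` map to the same token ids. Lower temperatures concentrate sampling on the highest-logit token, and higher temperatures spread the forecast samples more widely.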
Both models include stochastic coordinate descent training with numerical
gradient estimation for framework-agnostic operation.
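One way to read "stochastic coordinate descent with numerical gradient estimation" is sketched below in Python. It is an assumption about the general technique rather than a copy of the training loop in this commit: each step picks one parameter at random, estimates its partial derivative with a central finite difference, and updates only that parameter. The names `numerical_partial`, `lr`, and `eps` are hypothetical.

```python
import random


def numerical_partial(loss, params, i, eps=1e-5):
    """Central-difference estimate of d(loss)/d(params[i])."""
    original = params[i]
    params[i] = original + eps
    up = loss(params)
    params[i] = original - eps
    down = loss(params)
    params[i] = original  # restore the parameter before returning
    return (up - down) / (2 * eps)


def stochastic_coordinate_descent(loss, params, lr=0.1, steps=500, seed=0):
    """Minimise `loss` by updating one randomly chosen coordinate per step."""
    rng = random.Random(seed)
    params = list(params)  # work on a copy so the caller's list is untouched
    for _ in range(steps):
        i = rng.randrange(len(params))
        params[i] -= lr * numerical_partial(loss, params, i)
    return params


def quadratic(w):
    """Toy loss with its minimum at w = [3, -1]."""
    return (w[0] - 3.0) ** 2 + (w[1] + 1.0) ** 2


fitted = stochastic_coordinate_descent(quadratic, [0.0, 0.0])
```

This approach needs only a callable loss, with no automatic differentiation library, which matches the framework-agnostic goal stated above. The trade-off is two loss evaluations per parameter update, which becomes expensive for large models.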
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
1 parent 98b7105
2 files changed
Lines changed: 2589 additions & 640 deletions