
Commit 99ff61f

### Added
- Added compact direct-input fast path for non-vocab float tensors where the trailing feature dimension matches `len(input_ids)`, avoiding dense neuron-space expansion.
- Added regression tests for compact input paths and single-step output scaling consistency.

### Changed
- Optimized recurrent forward internals in `OdyssNet`:
  - lazy input-scale lookup in sparse legacy injection,
  - precomputed Hebbian row/diagonal factors,
  - correlation matrix computation switched from `einsum` to `matmul`,
  - skipped unnecessary Hebbian buffer clones in no-grad inference,
  - selective scaling applied only to output neurons.
- Optimized `OdyssNetTrainer` data movement and AMP plumbing by caching device-type flags, using non-blocking tensor transfers, and reusing compact direct-input routing in both `train_batch()` and `predict()`.
- Optimized `examples/advanced/experiment_financial_oracle.py` with vectorized split sampling, vectorized BTC overlay window normalization, pinned-memory GPU transfer path, reusable best-state snapshots, and downsampled workbench heatmap rendering.

### Fixed
- Fixed single-step output scaling aliasing where scaled outputs could share storage with the returned final state.
- Fixed trainer single-step extraction to consistently use scaled last-timestep outputs.
1 parent fe07106 commit 99ff61f

10 files changed

Lines changed: 463 additions & 105 deletions


CHANGELOG.md

Lines changed: 20 additions & 0 deletions
@@ -4,6 +4,26 @@ All notable changes to OdyssNet will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).

## [2.4.1] — 2026-04-17
### Added
- Added compact direct-input fast path for non-vocab float tensors where the trailing feature dimension matches `len(input_ids)`, avoiding dense neuron-space expansion.
- Added regression tests for compact input paths and single-step output scaling consistency.
### Changed
- Optimized recurrent forward internals in `OdyssNet`:
  - lazy input-scale lookup in sparse legacy injection,
  - precomputed Hebbian row/diagonal factors,
  - correlation matrix computation switched from `einsum` to `matmul`,
  - skipped unnecessary Hebbian buffer clones in no-grad inference,
  - selective scaling applied only to output neurons.
- Optimized `OdyssNetTrainer` data movement and AMP plumbing by caching device-type flags, using non-blocking tensor transfers, and reusing compact direct-input routing in both `train_batch()` and `predict()`.
- Optimized `examples/advanced/experiment_financial_oracle.py` with vectorized split sampling, vectorized BTC overlay window normalization, pinned-memory GPU transfer path, reusable best-state snapshots, and downsampled workbench heatmap rendering.
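The `einsum`-to-`matmul` switch in the forward internals rests on a standard identity: for a batch of activations `x`, `einsum('bi,bj->ij', x, x)` equals `x.T @ x`, and routing it through `matmul` lets the backend dispatch a single optimized GEMM. A minimal sketch with illustrative shapes (not OdyssNet's actual buffers):

```python
import torch

# Illustrative activations of shape (batch, neurons).
x = torch.randn(8, 16)

corr_einsum = torch.einsum('bi,bj->ij', x, x)  # correlation via einsum
corr_matmul = x.T @ x                          # identical result via one GEMM call

assert torch.allclose(corr_einsum, corr_matmul, atol=1e-5)
```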
### Fixed
- Fixed single-step output scaling aliasing where scaled outputs could share storage with returned final state.
- Fixed trainer single-step extraction to consistently use scaled last-timestep outputs.
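The aliasing fix amounts to never returning a scaled view that shares storage with the state tensor. A hedged sketch of the hazard, with purely illustrative shapes and names (the last 3 columns stand in for output neurons):

```python
import torch

# Illustrative (batch, neurons) final state; not OdyssNet's actual code.
state = torch.randn(4, 10)

# Safe pattern: out-of-place scaling allocates fresh storage.
out = state[:, -3:] * 2.0
out[0, 0] = 123.0                    # mutate the scaled output...
assert state[0, -3].item() != 123.0  # ...the returned state is unaffected

# The hazard: in-place scaling of a slice writes through into `state`.
view = state[:, -3:]
view *= 2.0                          # silently doubles those columns of `state` too
assert view.data_ptr() == state[:, -3:].data_ptr()  # shared storage
```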
## [2.4.0] — 2026-04-10

### Added

CITATION.cff

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ authors:
 given-names: "Cahit"
 email: "cksoftwaresystems@gmail.com"
 title: "OdyssNet: The Trainable Dynamic System & Zero-Hidden Architecture"
-version: 2.1.0
+version: 2.4.1
 date-released: 2025-12-12
 url: "https://github.com/theomgdev/OdyssNet"
 abstract: "OdyssNet is a chaotic, fully connected neural network architecture that proves temporal depth (thinking steps) can replace spatial depth (hidden layers). It solves non-linear problems like MNIST with Zero Hidden Layers by utilizing Trainable Chaos."

docs/LIBRARY.md

Lines changed: 12 additions & 0 deletions
@@ -159,6 +159,18 @@ tokens = torch.randint(0, 50257, (batch, 128))
output = model(tokens, steps=640)
```

### 4. Compact Direct Feature Mode (Fast Path)
* **Use case**: Non-vocab float inputs where the feature width equals `len(input_ids)`.
* **Behavior**: For 2D `(Batch, Features)` and 3D `(Batch, Steps, Features)` tensors, `OdyssNetTrainer` can bypass dense neuron-space expansion and feed compact features directly.
* **Compatibility**: Active only in non-vocab mode and only for floating-point feature tensors.
* **Benefit**: Lower memory traffic and faster training/inference on feature-driven workloads.
```python
# Compact 3D feature input (len(input_ids)=4)
x = torch.randn(batch, steps, 4)
pred = trainer.predict(x, thinking_steps=steps)
```
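For intuition, the dispatch condition described above can be pictured as a shape-and-dtype guard. `is_compact_feature_input` below is a hypothetical helper written for illustration, not part of the OdyssNet API:

```python
import torch

def is_compact_feature_input(x: torch.Tensor, num_inputs: int) -> bool:
    """Hypothetical guard mirroring the documented condition: a floating-point
    (Batch, Features) or (Batch, Steps, Features) tensor whose trailing
    dimension equals len(input_ids)."""
    return (
        torch.is_floating_point(x)
        and x.dim() in (2, 3)
        and x.shape[-1] == num_inputs
    )

assert is_compact_feature_input(torch.randn(2, 7, 4), num_inputs=4)   # 3D fast path
assert is_compact_feature_input(torch.randn(2, 4), num_inputs=4)      # 2D fast path
assert not is_compact_feature_input(torch.randint(0, 9, (2, 4)), 4)   # int tensor: vocab path
```

Inputs failing the guard would fall back to the dense neuron-space expansion path.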
#### Comparison of Sequential Input Formats
| Input Type | Format | Modality | Recommended Use Case |
| :--- | :--- | :--- | :--- |
