theomgdev
diff --git a/CHANGELOG.md b/CHANGELOG.md
Lines changed: 10 additions & 0 deletions
diff --git a/CITATION.cff b/CITATION.cff
Lines changed: 1 addition & 1 deletion
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
Lines changed: 10 additions & 8 deletions
diff --git a/README.md b/README.md
Lines changed: 2 additions & 2 deletions
diff --git a/README_TR.md b/README_TR.md
Lines changed: 2 additions & 2 deletions
diff --git a/docs/LIBRARY.md b/docs/LIBRARY.md
Lines changed: 14 additions & 11 deletions
diff --git a/odyssnet/__init__.py b/odyssnet/__init__.py
Lines changed: 1 addition & 1 deletion
@@ -4,6 +4,16 @@ All notable changes to OdyssNet will be documented in this file.
 
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
 
+## [2.5.0] — 2026-04-30
+
+### Added
+- **Spatial Hebbian Plasticity**: Introduced a co-activation learning mechanism (classic Hebbian) alongside the existing STDP-style learning.
+- **`hebb_type` as mechanism toggle**: `hebb_type` now selects the active plasticity mechanism: `None` (disabled), `"temporal"`, `"spatial"`, or `"both"`. It no longer selects the resolution.
+- **`hebb_res`**: Controls the structural resolution (`"global"`, `"neuron"`, `"synapse"`). Defaults to `"neuron"`.
+
+### Changed
+- **BREAKING**: Replaced the single-path Hebbian parameters with path-prefixed ones (`t_hebb_factor`, `s_hebb_factor`, etc.). Checkpoints that contain `hebb_factor` must be loaded with `strict=False` and re-trained, or patched manually; legacy support was dropped in favor of a cleaner architecture.
+
 ## [2.4.0] — 2026-04-10
 
 ### Added
 
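The "manually patched" route mentioned in the changelog entry could look roughly like the sketch below. It is a minimal illustration, not the project's migration tool: the helper name `patch_legacy_hebb_keys` is hypothetical, and the mapping from each legacy key to its `t_`-prefixed counterpart is an assumption based on the old behaviour being temporal-only. Tensor shapes may still differ if the old `hebb_type` resolution does not match the new `hebb_res`.

```python
# Hypothetical migration helper. The old-to-new key mapping is an assumption,
# not something the changelog specifies.
LEGACY_HEBB_KEYS = ("hebb_factor", "hebb_decay", "hebb_state_W", "hebb_state_mem")


def patch_legacy_hebb_keys(state_dict, prefix="t_"):
    """Return a copy of state_dict with legacy Hebbian keys renamed.

    Keys may be nested (e.g. "core.hebb_factor"); only the final
    dot-separated component is matched and renamed.
    """
    patched = {}
    for key, value in state_dict.items():
        head, _, leaf = key.rpartition(".")
        if leaf in LEGACY_HEBB_KEYS:
            leaf = prefix + leaf
        patched[f"{head}.{leaf}" if head else leaf] = value
    return patched


# Placeholder values stand in for real tensors.
old = {"W": [[0.0]], "hebb_factor": -3.0, "hebb_decay": 2.2}
new = patch_legacy_hebb_keys(old)
print(sorted(new))
```

The patched dictionary would then be passed to the model's `load_state_dict`, with `strict=False` if the spatial-path keys are absent from the checkpoint.
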
@@ -5,7 +5,7 @@ authors:
   given-names: "Cahit"
   email: "cksoftwaresystems@gmail.com"
 title: "OdyssNet: The Trainable Dynamic System & Zero-Hidden Architecture"
-version: 2.4.0
+version: 2.5.0
 date-released: 2025-12-12
 url: "https://github.com/theomgdev/OdyssNet"
 abstract: "OdyssNet is a chaotic, fully connected neural network architecture that proves temporal depth (thinking steps) can replace spatial depth (hidden layers). It solves non-linear problems like MNIST with Zero Hidden Layers by utilizing Trainable Chaos."
 
@@ -155,17 +155,17 @@ For tasks requiring high input/output dimensionality (like vision or LLMs) witho
 model = OdyssNet(num_neurons=10, ..., vocab_size=(784, 10))
 ```
 
-### F. Heterogeneous Synaptic Plasticity (`hebb_type`)
-For tasks where **online synaptic plasticity** may help — e.g., fast-adaptation, continual learning, or tasks with shifting statistics — enable one of the three resolution levels:
+### F. Heterogeneous Synaptic Plasticity (`hebb_type` & `hebb_res`)
+For tasks where **online synaptic plasticity** may help (e.g., fast adaptation, continual learning, or tasks with shifting statistics), set `hebb_type` (`"temporal"`, `"spatial"`, or `"both"`) and choose one of the three `hebb_res` resolution levels:
 
-| `hebb_type` | Extra Params | When to use |
+| `hebb_res` | Extra Params per path | When to use |
 |---|---|---|
 | `"global"` | +2 | Quick experiments; uniform plasticity across all synapses. |
 | `"neuron"` | +2N | RL and reactive environments; per-neuron "caste" differentiation. |
 | `"synapse"` | +2N² | Logic, NLP, and reasoning tasks requiring **dynamic variable binding**. |
 
-*   **What it does:** At each step the network accumulates temporal cross-neuron correlations $C_t = h_t \otimes h_{t-1}$ and applies them as $W_\text{eff} = W + (f_h \odot C_t)$ (where $f_h$ is `hebb_factor`). Both `hebb_factor` and `hebb_decay` are **learnable** — the network discovers how plastic each synapse should be.
-*   **State:** Correlations are persisted via buffers (`hebb_state_W`, `hebb_state_mem`) across intra-sequence forward calls and are explicitly cleared on `reset_state()` between sequences.
+*   **What it does:** At each step the network accumulates correlations (temporal $C_t = h_t \otimes h_{t-1}$ and/or spatial $C_s = h_t \otimes h_t$), depending on `hebb_type`, and applies them to the effective weights. The factors and decays (`t_hebb_factor`, `s_hebb_factor`, etc.) are **learnable**, so the network discovers how plastic each synapse should be.
+*   **State:** Correlations are persisted via buffers (`t_hebb_state_W`, `s_hebb_state_mem`, etc.) across intra-sequence forward calls and are explicitly cleared on `reset_state()` between sequences.
 *   **Best Use Case (Generation / Sequential Building):** Hebbian shines in tasks where step T relies heavily on expanding or completing a pattern from step T-1. It provides a powerful **short-term working memory** between steps, acting as a dynamic shortcut that fast-tracks sequence generation.
 *   **When *not* to use it (Classification / Independent Features):** Avoid Hebbian in classification tasks where each step processes distinct, independent chunks of information (e.g. sequential MNIST classification). In these tasks, inter-step short-term memory acts as "overfit noise".
 *   **Compatibility:** Fully compatible with `gradient_checkpointing=True`.
@@ -178,7 +178,8 @@ model = OdyssNet(
     input_ids=[0, 1],
     output_ids=[31],
     activation='tanh',
-    hebb_type='synapse',   # Per-synapse plasticity
+    hebb_type='both',
+    hebb_res='synapse',    # Per-synapse plasticity
     device='cuda',
 )
 
@@ -188,12 +189,13 @@ model = OdyssNet(
     input_ids=list(range(8)),
     output_ids=list(range(56, 64)),
     activation='tanh',
-    hebb_type='neuron',    # Per-neuron plasticity
+    hebb_type='temporal',
+    hebb_res='neuron',     # Per-neuron plasticity
     device='cuda',
 )
 
 # Quick experiment — global plasticity
-model = OdyssNet(..., hebb_type='global')
+model = OdyssNet(..., hebb_type='spatial', hebb_res='global')
 
 # Default: Prodigy optimizer — auto-calibrates LR, no tuning needed
 trainer = OdyssNetTrainer(model)
 
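The "Extra Params per path" column in the CONTRIBUTING.md table can be turned into a quick estimate. The function below is a hypothetical helper, not part of OdyssNet; it assumes that `"both"` creates one independent factor/decay pair per path, which matches the LIBRARY.md wording "for each active path". The neuron counts in the usage lines are illustrative.

```python
def extra_hebb_params(num_neurons, hebb_type, hebb_res="neuron"):
    """Extra learnable parameters implied by the table.

    Each active path adds two tensors (factor and decay) whose size
    depends on the chosen resolution.
    """
    paths = {None: 0, "temporal": 1, "spatial": 1, "both": 2}[hebb_type]
    per_tensor = {
        "global": 1,                   # one scalar
        "neuron": num_neurons,         # vector of shape (N,)
        "synapse": num_neurons ** 2,   # matrix of shape (N, N)
    }[hebb_res]
    return paths * 2 * per_tensor


print(extra_hebb_params(32, "both", "synapse"))
print(extra_hebb_params(64, "temporal", "neuron"))
print(extra_hebb_params(64, "spatial", "global"))
```
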
@@ -35,7 +35,7 @@ OdyssNet achieves its efficiency through **Space-Time Trade-off**. Instead of ad
 *   **Space-Time Conversion:** Replaces millions of parameters with a few "Thinking Steps".
 *   **Layerless Architecture:** A single $N \times N$ matrix. No hidden layers.
 *   **Trainable Chaos:** Uses **StepNorm** and **Tanh** to tame chaotic signals.
-*   **Heterogeneous Synaptic Plasticity:** Optional online Hebbian learning (`hebb_type='synapse'|'neuron'|'global'`) — the network accumulates temporal neuron correlations and learns *how fast to learn* at global, per-neuron, or per-synapse resolution via fully differentiable logit parameters (`hebb_factor`, `hebb_decay`).
+*   **Heterogeneous Synaptic Plasticity:** Optional online Hebbian learning (`hebb_type='temporal'|'spatial'|'both'`, `hebb_res='synapse'|'neuron'|'global'`) — the network accumulates correlations and learns *how fast to learn* via fully differentiable logit parameters (`t_hebb_factor`, `s_hebb_decay`, etc.).
 *   **Skill Transfer via Transplantation:** Learned temporal skills can be transplanted across model sizes and re-used in new tasks.
 *   **Living Dynamics:** Demonstrates **Willpower** (Latch), **Rhythm** (Stopwatch), and **Resonance** (Sine Wave).
 
@@ -158,7 +158,7 @@ Uncontrolled feedback loops lead to explosion. OdyssNet engineers the chaos to f
 *   **StepNorm** acts as gravity, keeping energy bounded.
 *   **Tanh** filters meaningful signals while maintaining signal symmetry.
 *   **Prodigy Optimizer (default)**: Auto-calibrates the learning rate continuously — no manual tuning required. Pass an explicit `lr` to use AdamW instead.
-*   **Heterogeneous Synaptic Plasticity**: When `hebb_type` is set, temporal correlations $h_t \otimes h_{t-1}$ are accumulated each step and injected as $W_\text{eff} = W + (f_h \odot C_t)$ — where `hebb_factor` can be a global scalar, a per-neuron vector, or a full per-synapse matrix. All variants are learnable, letting the network discover how plastic each pathway should be.
+*   **Heterogeneous Synaptic Plasticity**: When `hebb_type` is set, correlations (temporal $h_t \otimes h_{t-1}$ and/or spatial $h_t \otimes h_t$) are accumulated each step and injected into the effective weights. Factors such as `t_hebb_factor` can be a global scalar, a per-neuron vector, or a full per-synapse matrix. All variants are learnable, letting the network discover how plastic each pathway should be.
 *   **The Latch Experiment** proved OdyssNet can create a stable attractor to hold a decision forever against noise.
 
 ### 5. Why Not RNN or LSTM?
 
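The two correlation types described in the README change can be compared in a small pure-Python sketch. It is illustrative only: the accumulation rule `state <- decay * state + correlation`, the index orientation of the outer product, and the logit values `-3.0` and `2.2` are assumptions, and only the global (scalar) factor is shown. The injection form $W_\text{eff} = W + f \odot C$ follows the formula in the removed README line.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def outer(a, b):
    """Outer product as a nested list: result[i][j] = a[i] * b[j]."""
    return [[ai * bj for bj in b] for ai in a]


def hebb_step(state, h_t, h_prev, decay_logit, mode):
    """One assumed accumulation step: state <- decay * state + correlation."""
    decay = sigmoid(decay_logit)
    corr = outer(h_t, h_prev) if mode == "temporal" else outer(h_t, h_t)
    return [[decay * s + c for s, c in zip(s_row, c_row)]
            for s_row, c_row in zip(state, corr)]


def effective_weights(W, state, factor_logit):
    """W_eff = W + f * C with a global (scalar) factor."""
    f = sigmoid(factor_logit)
    return [[w + f * c for w, c in zip(w_row, c_row)]
            for w_row, c_row in zip(W, state)]


n = 3
zeros = [[0.0] * n for _ in range(n)]
h_prev, h_t = [0.5, -0.2, 0.1], [0.3, 0.4, -0.6]

C_t = hebb_step(zeros, h_t, h_prev, decay_logit=2.2, mode="temporal")
C_s = hebb_step(zeros, h_t, h_prev, decay_logit=2.2, mode="spatial")
W_eff = effective_weights(zeros, C_s, factor_logit=-3.0)
```

Under these assumptions the spatial state is symmetric (`C_s[i][j] == C_s[j][i]`), while the temporal state generally is not, which is one way to see why the two paths carry separate parameters and buffers.
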
@@ -36,7 +36,7 @@ OdyssNet verimliliğini **Uzay-Zaman Takası** (Space-Time Trade-off) ile sağla
 *   **Uzay-Zaman Dönüşümü:** Milyonlarca parametrenin yerini birkaç "Düşünme Adımı" alıyor.
 *   **Katmansız Mimari:** Tek bir $N \times N$ matris. Gizli katman yok.
 *   **Eğitilebilir Kaos:** Kaotik sinyalleri dizginlemek için **StepNorm** ve **Tanh** kullanır.
-*   **Heterojen Sinaptik Plastisitesi:** İsteğe bağlı çevrimiçi Hebbian öğrenmesi (`hebb_type='synapse'|'neuron'|'global'`) — ağ zamansal nöron korelasyonlarını biriktirir ve global, nöron başına veya sinaps başına çözünürlükte tamamen türevlenebilir logit parametreleri (`hebb_factor`, `hebb_decay`) aracılığıyla *ne kadar hızlı öğreneceğini* öğrenir.
+*   **Heterojen Sinaptik Plastisitesi:** İsteğe bağlı çevrimiçi Hebbian öğrenmesi (`hebb_type='temporal'|'spatial'|'both'`, `hebb_res='synapse'|'neuron'|'global'`) — ağ korelasyonları biriktirir ve global, nöron başına veya sinaps başına çözünürlükte tamamen türevlenebilir logit parametreleri (`t_hebb_factor`, `s_hebb_decay`, vb.) aracılığıyla *ne kadar hızlı öğreneceğini* öğrenir.
 *   **Transplant ile Beceri Transferi:** Öğrenilmiş zamansal beceriler model boyutları arasında taşınabilir ve yeni görevlerde yeniden kullanılabilir.
 *   **Canlı Dinamikler:** **İrade** (Mandal), **Ritim** (Kronometre) ve **Rezonans** (Sinüs Dalgası) gösterir.
 
@@ -159,7 +159,7 @@ Kontrolsüz geri besleme döngüleri patlamaya yol açar. OdyssNet kaosun mühen
 *   **StepNorm** yerçekimi gibi davranır, enerjiyi sınırlı tutar.
 *   **Tanh** anlamlı sinyalleri filtreler ve sinyal simetrisini korur.
 *   **Prodigy Optimizer (varsayılan):** Öğrenme hızını sürekli olarak otomatik kalibre eder — manuel ayar gerekmez. Açık bir `lr` değeri geçildiğinde AdamW kullanılır.
-*   **Heterojen Sinaptik Plastisitesi:** `hebb_type` ayarlandığında her adımda zamansal korelasyonlar $h_t \otimes h_{t-1}$ biriktirilir ve $W_\text{eff} = W + (f_h \odot C_t)$ olarak enjekte edilir — `hebb_factor` global bir skaler, nöron başına vektör veya tam sinaps başına matris olabilir. Tüm çeşitler öğrenilebilir olduğundan ağ, her sinaptik yolun ne kadar plastik olması gerektiğini keşfeder.
+*   **Heterojen Sinaptik Plastisitesi:** `hebb_type` ayarlandığında her adımda korelasyonlar (zamansal $h_t \otimes h_{t-1}$ veya uzamsal $h_t \otimes h_t$) biriktirilir ve enjekte edilir — `t_hebb_factor` gibi faktörler global bir skaler, nöron başına vektör veya tam sinaps başına matris olabilir. Tüm çeşitler öğrenilebilir olduğundan ağ, her sinaptik yolun ne kadar plastik olması gerektiğini keşfeder.
 *   **Mandal Deneyi** OdyssNet'in gürültüye karşı bir kararı sonsuza kadar tutmak için kararlı bir çekici oluşturabileceğini kanıtladı.
 
 ### 5. Neden RNN veya LSTM Değil?
 
@@ -32,7 +32,8 @@ model = OdyssNet(
     gate=None,           # Default resolves to ['none', 'none', 'identity']
     vocab_size=None,     # Optional: Decouples input/output size from neurons
     vocab_mode='hybrid', # 'hybrid', 'discrete', or 'continuous'
-    hebb_type=None,      # Optional: Plasticity resolution — None, 'global', 'neuron', or 'synapse'
+    hebb_type=None,      # Toggle: None, 'temporal', 'spatial', or 'both'
+    hebb_res='neuron',   # Plasticity resolution: 'global', 'neuron', or 'synapse'
     debug=False,         # NaN/Inf diagnosis — raises RuntimeError at the first offending operation
 )
 ```
@@ -66,22 +67,24 @@ model = OdyssNet(
     *   `'continuous'`: Initializes only Linear Projection. Use for float-only inputs (e.g., vision, audio). Saves VRAM.
 *   `tie_embeddings` (bool): 
     *   If `True`, ties the input embedding weights to the output decoder weights, saving significant VRAM and parameter count (Symmetric `vocab_size` only). Default is `False`.
-*   `hebb_type` (str or None): Controls the structural resolution of **Heterogeneous Synaptic Plasticity**. Default is `None` (plasticity disabled).
+*   `hebb_type` (str or None): Controls the active mechanism for **Heterogeneous Synaptic Plasticity**. Default is `None` (plasticity disabled).
+    *   `'temporal'`: STDP-style learning; correlates the current state $h_t$ with the previous state $h_{t-1}$.
+    *   `'spatial'`: Co-activation learning (classic Hebbian); correlates the current state $h_t$ with itself (neurons firing simultaneously).
+    *   `'both'`: Enables the temporal and spatial mechanisms together.
+*   `hebb_res` (str): Controls the structural resolution of plasticity. Default is `'neuron'`.
 
-    | `hebb_type` | Parameter Shape | Extra Params | Mechanics |
+    | `hebb_res` | Parameter Shape | Extra Params per Path | Mechanics |
     |---|---|---|---|
-    | `None` | — | 0 | Disabled. |
     | `"global"` | scalar `()` | +2 | Uniform plasticity — the whole network is equally plastic. |
     | `"neuron"` | vector `(N,)` | +2N | Per-neuron plasticity — each neuron learns its own adaptation rate. |
     | `"synapse"` | matrix `(N, N)` | +2N² | Per-synapse plasticity — each connection has its own factor and decay. |
 
-    *   Two learnable logit parameters are created according to the resolution:
-        *   `hebb_factor` (raw logit → `sigmoid` → learning rate ≈ 0.047 initially)
-        *   `hebb_decay` (raw logit → `sigmoid` → retention ≈ 0.90 initially)
-    *   During each forward pass the model accumulates temporal cross-neuron correlations $C_t = h_t \otimes h_{t-1}$ and applies them as $W_\text{eff} = W + (f_h \odot C_t)$ (where $f_h$ is `hebb_factor`), with $\odot$ element-wise multiplication broadcast to the chosen resolution.
-    *   The Hebbian state is persisted across forward calls via registered buffers (`hebb_state_W`, `hebb_state_mem`) and is cleared by `reset_state()`.
-    *   Both `hebb_factor` and `hebb_decay` are fully differentiable — gradients flow into them via the recurrent computation so the network **learns how to learn** online.
-    *   **Memory cost**: `"global"` adds negligible overhead; `"neuron"` adds $O(N)$; `"synapse"` triples total parameter count to $3N^2$.
+    *   For each active path (`t_` for temporal, `s_` for spatial), two learnable logit parameters are created according to the resolution:
+        *   `t_hebb_factor` / `s_hebb_factor` (raw logit → `sigmoid` → learning rate ≈ 0.047 initially)
+        *   `t_hebb_decay` / `s_hebb_decay` (raw logit → `sigmoid` → retention ≈ 0.90 initially)
+    *   During each forward pass the model accumulates correlations (temporal $h_t \otimes h_{t-1}$ and/or spatial $h_t \otimes h_t$) and applies them to the effective weights.
+    *   The Hebbian states are persisted across forward calls via registered buffers (`t_hebb_state_W`, `s_hebb_state_W`, etc.) and are cleared by `reset_state()`.
+    *   Both factors and decays are fully differentiable — gradients flow into them via the recurrent computation so the network **learns how to learn** online.
 *   `gate` (None, str, or list[str]): Optional parametric gating mechanism. Default is `None`, which resolves to `['none', 'none', 'identity']`.
     *   `None`: Default configuration with memory identity gate enabled, others disabled.
     *   `str` (e.g., `'sigmoid'`): Applies the same gate activation to all three branches `[encoder_decoder, core, memory]`.
 
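The initial values quoted in the LIBRARY.md change (learning rate ≈ 0.047, retention ≈ 0.90) are sigmoid outputs of raw logits. The sketch below inverts the sigmoid to show which raw logits would produce them. The source does not state the actual initial logits used by OdyssNet, so the round numbers `-3.0` and `2.2` are an inference, not a documented fact.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def logit(p):
    """Inverse of sigmoid for 0 < p < 1."""
    return math.log(p / (1.0 - p))


# Raw logits that would reproduce the documented initial values.
print(logit(0.047))   # close to -3.0
print(logit(0.90))    # close to 2.2

# Forward check with the rounded logits.
print(sigmoid(-3.0), sigmoid(2.2))
```
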
@@ -1,4 +1,4 @@
-__version__ = "2.4.0"
+__version__ = "2.5.0"
 
 from .core.network import OdyssNet
 from .training.trainer import OdyssNetTrainer