OptimumAF
diff --git a/CHANGELOG.md b/CHANGELOG.md
Lines changed: 1 addition & 0 deletions
diff --git a/README.md b/README.md
Lines changed: 2 additions & 2 deletions
diff --git a/docs/benchmarks/benchmark_continual_shift_hardest_case_2026-02-28.txt b/docs/benchmarks/benchmark_continual_shift_hardest_case_2026-02-28.txt
Lines changed: 4 additions & 4 deletions
diff --git a/docs/figures/hardest_mode_dynamics.gif b/docs/figures/hardest_mode_dynamics.gif
-219 KB
diff --git a/docs/figures/interactive_hardest_mode_dynamics.html b/docs/figures/interactive_hardest_mode_dynamics.html
Lines changed: 26 additions & 10 deletions
diff --git a/scripts/generate_hardest_mode_dynamics.py b/scripts/generate_hardest_mode_dynamics.py
Lines changed: 44 additions & 12 deletions
diff --git a/scripts/run_continual_shift_benchmark.py b/scripts/run_continual_shift_benchmark.py
Lines changed: 25 additions & 0 deletions
diff --git a/src/adapters/cli.py b/src/adapters/cli.py
Lines changed: 10 additions & 0 deletions
diff --git a/src/app/continual_shift_benchmark.py b/src/app/continual_shift_benchmark.py
Lines changed: 18 additions & 1 deletion
diff --git a/src/app/experiment_runner.py b/src/app/experiment_runner.py
Lines changed: 13 additions & 2 deletions
@@ -62,6 +62,7 @@ for versioning even while in research-stage development.
 - Added hardest-case dynamics GIF (training progression + inference decision-map evolution) and surfaced it near the top of README and docs dashboard.
 - Added an interactive Plotly hardest-case dynamics page with playback controls and circadian internals visualization (node/edge weights, chemical/plasticity state) on the docs dashboard.
 - Increased hardest-case difficulty substantially (higher drift/noise, lower phase-B train fraction, longer training horizon) and raised hidden-layer width in hardest-case runs for all three models.
+- Added multi-hidden-layer support across the NumPy baseline models: backprop, predictive coding, and circadian (which keeps an adaptive top hidden layer fed by a trainable pre-hidden stack).
 - Refreshed README benchmark section with a latest master verification run on 2026-02-28 and added raw output artifact under `docs/benchmarks/`.
 - Repositioned repository messaging to Circadian Predictive Coding as the primary focus.
 - Updated `README.md` with:
 
@@ -128,13 +128,13 @@ Strengths:
 - Sources:
   - [`docs/benchmarks/benchmark_continual_shift_strength_case_2026-02-28.txt`](docs/benchmarks/benchmark_continual_shift_strength_case_2026-02-28.txt)
   - [`docs/benchmarks/benchmark_continual_shift_hardest_case_2026-02-28.txt`](docs/benchmarks/benchmark_continual_shift_hardest_case_2026-02-28.txt)
-- Dynamic capacity adaptation is observable and measurable (updated hardest-case: mean splits `27.57`, hidden size `24 -> 51.57`).
+- Dynamic capacity adaptation is observable and measurable (updated hardest-case: mean splits `48.57`, hidden size `24 -> 72.57`).
 - Competitive behavior in moderate continual-shift stress tests with stable multi-seed performance.
 
 Weaknesses:
 
 - Not best on every benchmark; on the latest CIFAR-100 subset master check, predictive coding accuracy (`0.692`) was higher than circadian (`0.685`).
-- In the updated ultra-hard hardest-case setting, predictive coding currently leads circadian on balanced score (`0.844` vs `0.831`), even though circadian still outperforms backprop (`0.785`).
+- In the updated ultra-hard hardest-case setting, circadian's balanced-score lead over predictive coding is small (`0.812` vs `0.808`) relative to the reported seed-to-seed spread (`+/-0.073` and `+/-0.063`), so the ranking can flip across seeds or configurations.
 - Extra algorithmic machinery (sleep scheduling, replay, split/prune controls) adds tuning burden and implementation complexity compared with fixed-width baselines.
 - Speed overhead can appear depending on configuration; in the latest CIFAR-100 subset master check, circadian train speed (`874.2` SPS) was lower than predictive coding (`965.2` SPS).
 - Results are regime-dependent; claims should be tied to specific benchmark settings and seeds instead of treated as universal.
 
@@ -2,10 +2,10 @@ Continual Shift Benchmark
 -------------------------
 Phase A trains on base distribution; phase B trains on shifted/rotated distribution.
 Seeds: [3, 7, 11, 19, 23, 31, 37]
-Setup: hidden_dim=24, phaseA_epochs=120, phaseB_epochs=180, phaseA_noise=0.80, phaseB_noise=1.45
+Setup: hidden_dim=24, hidden_dims=[24, 24, 24], phaseA_epochs=120, phaseB_epochs=180, phaseA_noise=0.80, phaseB_noise=1.45
 Phase B transform: rotation=68.0 deg, translation=(1.60, -1.30)
 Phase B train fraction: 0.05
 
-Backprop: A_pre=0.975+/-0.009, A_post=0.714+/-0.145, B_post=0.856+/-0.025, retention=0.732+/-0.145, balanced=0.785+/-0.064
-Predictive coding: A_pre=0.971+/-0.010, A_post=0.852+/-0.130, B_post=0.836+/-0.037, retention=0.878+/-0.131, balanced=0.844+/-0.058
-Circadian predictive coding: A_pre=0.972+/-0.009, A_post=0.829+/-0.143, B_post=0.833+/-0.029, retention=0.852+/-0.148, balanced=0.831+/-0.064, sleep_events=13.86, splits=27.57, prunes=0.00, hidden_end=51.57
+Backprop: A_pre=0.973+/-0.007, A_post=0.699+/-0.156, B_post=0.808+/-0.035, retention=0.718+/-0.158, balanced=0.753+/-0.089
+Predictive coding: A_pre=0.975+/-0.009, A_post=0.793+/-0.162, B_post=0.823+/-0.046, retention=0.815+/-0.170, balanced=0.808+/-0.063
+Circadian predictive coding: A_pre=0.975+/-0.008, A_post=0.784+/-0.164, B_post=0.841+/-0.026, retention=0.804+/-0.167, balanced=0.812+/-0.073, sleep_events=24.43, splits=48.57, prunes=0.00, hidden_end=72.57
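The balanced-score gap between the circadian and predictive coding rows above is much smaller than the reported spread. The sketch below puts a rough number on that. It assumes the `+/-` values are per-seed standard deviations over the seven listed seeds and treats the two models as independent samples. The benchmark output confirms neither assumption, and the seeds are shared between models, so this is an order-of-magnitude check only.

```python
import math

# Values copied from the hardest-case benchmark output above.
SEED_COUNT = 7  # Seeds: [3, 7, 11, 19, 23, 31, 37]
circadian_mean, circadian_spread = 0.812, 0.073
predictive_mean, predictive_spread = 0.808, 0.063


def standard_error_of_difference(spread_a: float, spread_b: float, n: int) -> float:
    """Standard error of (mean_a - mean_b) for two independent samples of size n.

    Assumes spread_a and spread_b are standard deviations (an assumption,
    not something the benchmark output states).
    """
    return math.sqrt(spread_a**2 / n + spread_b**2 / n)


gap = circadian_mean - predictive_mean
gap_se = standard_error_of_difference(circadian_spread, predictive_spread, SEED_COUNT)
print(f"gap={gap:.3f}, standard_error={gap_se:.3f}, ratio={gap / gap_se:.2f}")
```

Under these assumptions the gap is roughly a tenth of its standard error, which is consistent with the README wording that the ranking can flip across seeds.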
@@ -45,6 +45,7 @@ class HardestModeConfig:
     test_ratio: float = 0.25
     phase_b_train_fraction: float = 0.05
     hidden_dim: int = 24
+    hidden_dims: tuple[int, ...] = (24, 24, 24)
     phase_a_epochs: int = 120
     phase_b_epochs: int = 180
     phase_a_noise: float = 0.8
@@ -83,6 +84,7 @@ class HardestModeSnapshot:
     hidden_bias: FloatVector
     chemical_state: FloatVector
     plasticity_state: FloatVector
+    probe_adaptive_input: FloatVector
     probe_hidden_activation: FloatVector
     probe_output_probability: float
 
@@ -244,13 +246,24 @@ def collect_hardest_mode_snapshots(
     phase_a, phase_b = build_datasets(config)
     x_bounds, y_bounds = compute_bounds(phase_a, phase_b)
 
-    backprop = BackpropMLP(input_dim=2, hidden_dim=config.hidden_dim, seed=config.seed)
-    predictive = PredictiveCodingNetwork(input_dim=2, hidden_dim=config.hidden_dim, seed=config.seed + 1)
+    backprop = BackpropMLP(
+        input_dim=2,
+        hidden_dim=config.hidden_dim,
+        seed=config.seed,
+        hidden_dims=list(config.hidden_dims),
+    )
+    predictive = PredictiveCodingNetwork(
+        input_dim=2,
+        hidden_dim=config.hidden_dim,
+        seed=config.seed + 1,
+        hidden_dims=list(config.hidden_dims),
+    )
     circadian = CircadianPredictiveCodingNetwork(
         input_dim=2,
         hidden_dim=config.hidden_dim,
         seed=config.seed + 2,
         circadian_config=build_hardest_circadian_config(),
+        hidden_dims=list(config.hidden_dims),
     )
 
     snapshots: list[HardestModeSnapshot] = []
@@ -332,7 +345,8 @@ def collect_hardest_mode_snapshots(
             grid_size=config.decision_grid_size,
         )
         predictions = circadian.predict_label(phase_b.test_input).reshape(-1).astype(np.int8)
-        hidden_linear = probe_input @ circadian.weight_input_hidden + circadian.bias_hidden
+        _, _, probe_adaptive_input = circadian._forward_pre_hidden(probe_input)
+        hidden_linear = probe_adaptive_input @ circadian.weight_input_hidden + circadian.bias_hidden
         probe_hidden_activation = np.tanh(hidden_linear).reshape(-1).astype(np.float32)
         probe_output_probability = float(circadian.predict_proba(probe_input)[0, 0])
         input_hidden_weights = circadian.weight_input_hidden.astype(np.float32, copy=True)
@@ -364,6 +378,7 @@ def collect_hardest_mode_snapshots(
                 hidden_bias=hidden_bias,
                 chemical_state=chemical_state,
                 plasticity_state=plasticity_state,
+                probe_adaptive_input=probe_adaptive_input.reshape(-1).astype(np.float32),
                 probe_hidden_activation=probe_hidden_activation,
                 probe_output_probability=probe_output_probability,
             )
@@ -646,6 +661,7 @@ def build_interactive_payload(
                 "hidden_bias": np.round(snapshot.hidden_bias, 4).tolist(),
                 "chemical_state": np.round(snapshot.chemical_state, 4).tolist(),
                 "plasticity_state": np.round(snapshot.plasticity_state, 4).tolist(),
+                "probe_adaptive_input": np.round(snapshot.probe_adaptive_input, 4).tolist(),
                 "probe_hidden_activation": np.round(snapshot.probe_hidden_activation, 4).tolist(),
                 "probe_output_probability": round(snapshot.probe_output_probability, 4),
             }
@@ -937,18 +953,34 @@ def write_interactive_hardest_mode_html(
 
     function drawNetwork(frame) {{
       const hiddenCount = frame.circadian_hidden_dim;
+      const sourceCount = frame.probe_adaptive_input.length;
+      const sourceY = [];
       const hiddenY = [];
+      if (sourceCount <= 1) {{
+        sourceY.push(0.0);
+      }} else {{
+        for (let i = 0; i < sourceCount; i++) {{
+          sourceY.push(0.9 - (1.8 * i / (sourceCount - 1)));
+        }}
+      }}
       if (hiddenCount <= 1) {{
         hiddenY.push(0.0);
       }} else {{
         for (let i = 0; i < hiddenCount; i++) {{
-          hiddenY.push(0.85 - (1.7 * i / (hiddenCount - 1)));
+          hiddenY.push(0.9 - (1.8 * i / (hiddenCount - 1)));
         }}
       }}
-      const nodeX = [-1.0, -1.0];
-      const nodeY = [0.35, -0.35];
-      const nodeText = ["x1", "x2"];
-      const nodeValues = [payload.phase_b_test_points_x[0], payload.phase_b_test_points_y[0]];
+      const nodeX = [];
+      const nodeY = [];
+      const nodeText = [];
+      const nodeValues = [];
+
+      for (let i = 0; i < sourceCount; i++) {{
+        nodeX.push(-1.0);
+        nodeY.push(sourceY[i]);
+        nodeText.push("p" + (i + 1));
+        nodeValues.push(frame.probe_adaptive_input[i] ?? 0.0);
+      }}
 
       for (let i = 0; i < hiddenCount; i++) {{
         nodeX.push(0.0);
@@ -969,15 +1001,15 @@ def write_interactive_hardest_mode_html(
         ...inputHidden.flat().map((v) => Math.abs(v)),
         ...hiddenOutput.map((v) => Math.abs(v))
       );
-      for (let inputIndex = 0; inputIndex < 2; inputIndex++) {{
+      for (let inputIndex = 0; inputIndex < sourceCount; inputIndex++) {{
         for (let h = 0; h < hiddenCount; h++) {{
           const w = inputHidden[inputIndex][h];
           edgeTraces.push({{
             x: [-1.0, 0.0],
-            y: [inputIndex === 0 ? 0.35 : -0.35, hiddenY[h]],
+            y: [sourceY[inputIndex], hiddenY[h]],
             mode: "lines",
             hoverinfo: "text",
-            text: ["w_in[" + (inputIndex + 1) + "->h" + (h + 1) + "]=" + w.toFixed(4)],
+            text: ["w_pre[" + (inputIndex + 1) + "->h" + (h + 1) + "]=" + w.toFixed(4)],
             line: {{
               color: w >= 0 ? "#2874c8" : "#c45b58",
               width: 1 + 5 * Math.abs(w) / maxAbsWeight
@@ -1025,7 +1057,7 @@ def write_interactive_hardest_mode_html(
         title: {{text: "Circadian Internals Graph (nodes + edge weights)"}},
         margin: {{l: 20, r: 20, t: 42, b: 20}},
         xaxis: {{visible: false, range: [-1.25, 1.25]}},
-        yaxis: {{visible: false, range: [-1.0, 1.0]}},
+        yaxis: {{visible: false, range: [-1.1, 1.1]}},
         paper_bgcolor: "#ffffff",
         plot_bgcolor: "#ffffff"
       }};
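The updated `drawNetwork` code spaces both the pre-hidden source nodes and the hidden nodes evenly between `0.9` and `-0.9`, with a lone node placed at `0.0`. A minimal Python restatement of that layout rule can be handy for checking positions outside the browser. The function name `evenly_spaced_y` is hypothetical and not part of the script.

```python
def evenly_spaced_y(count: int, top: float = 0.9) -> list[float]:
    """Return vertical positions running from +top down to -top.

    Mirrors the JavaScript loop in the diff: node i is placed at
    top - (2 * top * i / (count - 1)).
    """
    if count <= 1:
        # Matches the JS branch, which pushes a single 0.0 when count <= 1.
        return [0.0]
    return [top - (2.0 * top * i / (count - 1)) for i in range(count)]


print(evenly_spaced_y(3))
```

With nodes confined to `[-0.9, 0.9]`, the widened y-axis range of `[-1.1, 1.1]` leaves a margin of `0.2` above and below the outermost nodes.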
 
@@ -29,6 +29,7 @@ class ProfileDefaults:
     phase_a_epochs: int
     phase_b_epochs: int
     hidden_dim: int
+    hidden_dims: tuple[int, ...] | None
     phase_a_noise_scale: float
     phase_b_noise_scale: float
     phase_b_rotation_degrees: float
@@ -60,6 +61,12 @@ def build_parser() -> argparse.ArgumentParser:
     parser.add_argument("--phase-a-epochs", type=int, default=None)
     parser.add_argument("--phase-b-epochs", type=int, default=None)
     parser.add_argument("--hidden-dim", type=int, default=None)
+    parser.add_argument(
+        "--hidden-dims",
+        type=str,
+        default="",
+        help="Optional comma-separated hidden-layer widths (e.g. 24,24,24).",
+    )
     parser.add_argument("--phase-a-noise-scale", type=float, default=None)
     parser.add_argument("--phase-b-noise-scale", type=float, default=None)
     parser.add_argument("--phase-b-rotation-degrees", type=float, default=None)
@@ -87,6 +94,12 @@ def main() -> None:
             else _build_hardest_case_circadian_config()
         )
     )
+    cli_hidden_dims = _parse_optional_hidden_dims(args.hidden_dims)
+    selected_hidden_dims = (
+        cli_hidden_dims
+        if cli_hidden_dims is not None
+        else profile_defaults.hidden_dims
+    )
     config = ContinualShiftConfig(
         sample_count_phase_a=_resolve_optional_int(
             args.sample_count_phase_a, profile_defaults.sample_count_phase_a
@@ -100,6 +113,7 @@ def main() -> None:
         phase_a_epochs=_resolve_optional_int(args.phase_a_epochs, profile_defaults.phase_a_epochs),
         phase_b_epochs=_resolve_optional_int(args.phase_b_epochs, profile_defaults.phase_b_epochs),
         hidden_dim=_resolve_optional_int(args.hidden_dim, profile_defaults.hidden_dim),
+        hidden_dims=selected_hidden_dims,
         phase_a_noise_scale=_resolve_optional_float(
             args.phase_a_noise_scale, profile_defaults.phase_a_noise_scale
         ),
@@ -179,6 +193,7 @@ def _build_profile_defaults(profile: str) -> ProfileDefaults:
             phase_a_epochs=120,
             phase_b_epochs=180,
             hidden_dim=24,
+            hidden_dims=(24, 24, 24),
             phase_a_noise_scale=0.8,
             phase_b_noise_scale=1.45,
             phase_b_rotation_degrees=68.0,
@@ -194,6 +209,7 @@ def _build_profile_defaults(profile: str) -> ProfileDefaults:
         phase_a_epochs=110,
         phase_b_epochs=80,
         hidden_dim=12,
+        hidden_dims=None,
         phase_a_noise_scale=0.8,
         phase_b_noise_scale=1.0,
         phase_b_rotation_degrees=40.0,
@@ -223,5 +239,14 @@ def _parse_int_list(raw_values: str) -> list[int]:
     return [int(item) for item in items]
 
 
+def _parse_optional_hidden_dims(raw_values: str) -> tuple[int, ...] | None:
+    if not raw_values.strip():
+        return None
+    values = _parse_int_list(raw_values)
+    if any(value <= 0 for value in values):
+        raise ValueError("hidden_dims values must be positive")
+    return tuple(values)
+
+
 if __name__ == "__main__":
     main()
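`_parse_optional_hidden_dims` relies on `_parse_int_list`, whose body is mostly outside this hunk. The following is a self-contained sketch of the same behaviour, assuming the list parser splits on commas and skips empty items. `parse_hidden_dims` is a hypothetical stand-in, not the script's function.

```python
from typing import Optional


def parse_hidden_dims(raw_values: str) -> Optional[tuple[int, ...]]:
    """Parse "24,24,24" into (24, 24, 24); return None for blank input."""
    if not raw_values.strip():
        return None
    # Assumption: items are comma-separated and surrounding spaces are ignored.
    items = [item.strip() for item in raw_values.split(",") if item.strip()]
    values = [int(item) for item in items]
    if any(value <= 0 for value in values):
        raise ValueError("hidden_dims values must be positive")
    return tuple(values)


print(parse_hidden_dims("24,24,24"))
```

Returning `None` for a blank string is what lets `main()` fall back to `profile_defaults.hidden_dims` when `--hidden-dims` is not passed.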
@@ -30,6 +30,12 @@ def build_argument_parser() -> argparse.ArgumentParser:
     parser.add_argument("--samples", type=int, default=settings.dataset_size, help="Number of samples.")
     parser.add_argument("--epochs", type=int, default=settings.epoch_count, help="Training epochs.")
     parser.add_argument("--hidden-dim", type=int, default=12, help="Hidden layer width.")
+    parser.add_argument(
+        "--hidden-dims",
+        type=str,
+        default="",
+        help="Optional comma-separated hidden-layer widths (for multi-hidden-layer models).",
+    )
     parser.add_argument("--noise", type=float, default=0.8, help="Dataset noise scale.")
     parser.add_argument(
         "--noise-levels",
@@ -236,11 +242,15 @@ def main() -> None:
         replay_inference_steps=arguments.replay_inference_steps,
         replay_inference_learning_rate=arguments.replay_inference_learning_rate,
     )
+    hidden_dims = None
+    if arguments.hidden_dims.strip():
+        hidden_dims = tuple(_parse_int_list(arguments.hidden_dims))
 
     config = ExperimentConfig(
         sample_count=arguments.samples,
         noise_scale=arguments.noise,
         hidden_dim=arguments.hidden_dim,
+        hidden_dims=hidden_dims,
         epoch_count=arguments.epochs,
         circadian_sleep_interval=arguments.sleep_interval,
         circadian_force_sleep=(not arguments.respect_adaptive_sleep_trigger),
 
@@ -32,6 +32,7 @@ class ContinualShiftConfig:
     phase_b_translation_y: float = -0.7
 
     hidden_dim: int = 12
+    hidden_dims: tuple[int, ...] | None = None
     phase_a_epochs: int = 110
     phase_b_epochs: int = 80
 
@@ -180,6 +181,7 @@ def format_continual_shift_benchmark(result: ContinualShiftBenchmarkResult) -> s
         (
             "Setup: "
             f"hidden_dim={config.hidden_dim}, "
+            f"hidden_dims={list(config.hidden_dims) if config.hidden_dims is not None else [config.hidden_dim]}, "
             f"phaseA_epochs={config.phase_a_epochs}, phaseB_epochs={config.phase_b_epochs}, "
             f"phaseA_noise={config.phase_a_noise_scale:.2f}, phaseB_noise={config.phase_b_noise_scale:.2f}"
         ),
@@ -256,17 +258,25 @@ def _run_single_seed(config: ContinualShiftConfig, seed: int) -> ContinualShiftS
     )
     phase_b = _build_phase_b_dataset(config=config, seed=seed + 101)
 
-    backprop_model = BackpropMLP(input_dim=2, hidden_dim=config.hidden_dim, seed=seed)
+    resolved_hidden_dims = list(config.hidden_dims) if config.hidden_dims is not None else None
+    backprop_model = BackpropMLP(
+        input_dim=2,
+        hidden_dim=config.hidden_dim,
+        seed=seed,
+        hidden_dims=resolved_hidden_dims,
+    )
     predictive_coding_model = PredictiveCodingNetwork(
         input_dim=2,
         hidden_dim=config.hidden_dim,
         seed=seed + 1,
+        hidden_dims=resolved_hidden_dims,
     )
     circadian_model = CircadianPredictiveCodingNetwork(
         input_dim=2,
         hidden_dim=config.hidden_dim,
         seed=seed + 2,
         circadian_config=config.circadian_config,
+        hidden_dims=resolved_hidden_dims,
     )
 
     sleep_event_count = 0
@@ -553,6 +563,13 @@ def _validate_config(config: ContinualShiftConfig) -> None:
         raise ValueError("phase noise scales must be positive")
     if config.hidden_dim <= 0:
         raise ValueError("hidden_dim must be positive")
+    if config.hidden_dims is not None:
+        if len(config.hidden_dims) == 0:
+            raise ValueError("hidden_dims cannot be empty")
+        if any(hidden <= 0 for hidden in config.hidden_dims):
+            raise ValueError("all hidden_dims values must be positive")
+        if config.hidden_dim != config.hidden_dims[-1]:
+            raise ValueError("hidden_dim must match the last value in hidden_dims")
     if config.phase_a_epochs <= 0 or config.phase_b_epochs <= 0:
         raise ValueError("phase epochs must be positive")
     if config.backprop_learning_rate <= 0.0:
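The new `_validate_config` checks tie `hidden_dim` to the last entry of `hidden_dims`, and the `Setup:` line falls back to `[hidden_dim]` when no stack is given. A minimal sketch of both rules together follows, using a hypothetical `LayerConfig` dataclass in place of `ContinualShiftConfig`. The helper names are illustrative only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LayerConfig:
    """Hypothetical stand-in holding only the two width fields."""

    hidden_dim: int = 12
    hidden_dims: Optional[tuple[int, ...]] = None


def validate_layers(config: LayerConfig) -> None:
    """Apply the same width checks, in the same order, as the diff."""
    if config.hidden_dim <= 0:
        raise ValueError("hidden_dim must be positive")
    if config.hidden_dims is not None:
        if len(config.hidden_dims) == 0:
            raise ValueError("hidden_dims cannot be empty")
        if any(hidden <= 0 for hidden in config.hidden_dims):
            raise ValueError("all hidden_dims values must be positive")
        if config.hidden_dim != config.hidden_dims[-1]:
            raise ValueError("hidden_dim must match the last value in hidden_dims")


def effective_hidden_dims(config: LayerConfig) -> list[int]:
    """Widths as reported in the Setup line: the stack, or [hidden_dim]."""
    if config.hidden_dims is not None:
        return list(config.hidden_dims)
    return [config.hidden_dim]


validate_layers(LayerConfig(hidden_dim=24, hidden_dims=(24, 24, 24)))
print(effective_hidden_dims(LayerConfig(hidden_dim=12)))
```

One consequence, assuming `_validate_config` runs on the final config: passing `--hidden-dims 32,16` on its own would be rejected unless `--hidden-dim 16` is also set, because the profile's default `hidden_dim` would not match the last width.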
 
@@ -22,6 +22,7 @@ class ExperimentConfig:
     sample_count: int = 400
     noise_scale: float = 0.8
     hidden_dim: int = 12
+    hidden_dims: tuple[int, ...] | None = None
     epoch_count: int = 160
     backprop_learning_rate: float = 0.12
     pc_learning_rate: float = 0.05
@@ -79,15 +80,25 @@ def run_experiment(
         seed=config.random_seed,
     )
 
-    backprop_model = BackpropMLP(input_dim=2, hidden_dim=config.hidden_dim, seed=config.random_seed)
+    resolved_hidden_dims = list(config.hidden_dims) if config.hidden_dims is not None else None
+    backprop_model = BackpropMLP(
+        input_dim=2,
+        hidden_dim=config.hidden_dim,
+        seed=config.random_seed,
+        hidden_dims=resolved_hidden_dims,
+    )
     predictive_coding_model = PredictiveCodingNetwork(
-        input_dim=2, hidden_dim=config.hidden_dim, seed=config.random_seed + 1
+        input_dim=2,
+        hidden_dim=config.hidden_dim,
+        seed=config.random_seed + 1,
+        hidden_dims=resolved_hidden_dims,
     )
     circadian_model = CircadianPredictiveCodingNetwork(
         input_dim=2,
         hidden_dim=config.hidden_dim,
         seed=config.random_seed + 2,
         circadian_config=config.circadian_config,
+        hidden_dims=resolved_hidden_dims,
     )
 
     backprop_losses: list[float] = []