Commit 8793c49

Add multi-hidden-layer support and harder deepest benchmark mode
1 parent 6af5cda commit 8793c49

18 files changed

Lines changed: 535 additions & 137 deletions

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -62,6 +62,7 @@ for versioning even while in research-stage development.
 - Added hardest-case dynamics GIF (training progression + inference decision-map evolution) and surfaced it near the top of README and docs dashboard.
 - Added an interactive Plotly hardest-case dynamics page with playback controls and circadian internals visualization (node/edge weights, chemical/plasticity state) on the docs dashboard.
 - Increased hardest-case difficulty substantially (higher drift/noise, lower phase-B train fraction, longer training horizon) and raised hidden-layer width in hardest-case runs for all three models.
+- Added multi-hidden-layer support across NumPy baseline models (backprop, predictive coding, and circadian with an adaptive top hidden layer plus trainable pre-hidden stack).
 - Refreshed README benchmark section with a latest master verification run on 2026-02-28 and added raw output artifact under `docs/benchmarks/`.
 - Repositioned repository messaging to Circadian Predictive Coding as the primary focus.
 - Updated `README.md` with:

README.md

Lines changed: 2 additions & 2 deletions
@@ -128,13 +128,13 @@ Strengths:
 - Sources:
 - [`docs/benchmarks/benchmark_continual_shift_strength_case_2026-02-28.txt`](docs/benchmarks/benchmark_continual_shift_strength_case_2026-02-28.txt)
 - [`docs/benchmarks/benchmark_continual_shift_hardest_case_2026-02-28.txt`](docs/benchmarks/benchmark_continual_shift_hardest_case_2026-02-28.txt)
-- Dynamic capacity adaptation is observable and measurable (updated hardest-case: mean splits `27.57`, hidden size `24 -> 51.57`).
+- Dynamic capacity adaptation is observable and measurable (updated hardest-case: mean splits `48.57`, hidden size `24 -> 72.57`).
 - Competitive behavior in moderate continual-shift stress tests with stable multi-seed performance.

 Weaknesses:

 - Not best on every benchmark; on the latest CIFAR-100 subset master check, predictive coding accuracy (`0.692`) was higher than circadian (`0.685`).
-- In the updated ultra-hard hardest-case setting, predictive coding currently leads circadian on balanced score (`0.844` vs `0.831`), even though circadian still outperforms backprop (`0.785`).
+- In the updated ultra-hard hardest-case setting, the margin between circadian and predictive coding is small (`0.812` vs `0.808`) with high variance, so ranking can flip across seeds/configurations.
 - Extra algorithmic machinery (sleep scheduling, replay, split/prune controls) adds tuning burden and implementation complexity compared with fixed-width baselines.
 - Speed overhead can appear depending on configuration; in the latest CIFAR-100 subset master check, circadian train speed (`874.2` SPS) was lower than predictive coding (`965.2` SPS).
 - Results are regime-dependent; claims should be tied to specific benchmark settings and seeds instead of treated as universal.

docs/benchmarks/benchmark_continual_shift_hardest_case_2026-02-28.txt

Lines changed: 4 additions & 4 deletions
@@ -2,10 +2,10 @@ Continual Shift Benchmark
 -------------------------
 Phase A trains on base distribution; phase B trains on shifted/rotated distribution.
 Seeds: [3, 7, 11, 19, 23, 31, 37]
-Setup: hidden_dim=24, phaseA_epochs=120, phaseB_epochs=180, phaseA_noise=0.80, phaseB_noise=1.45
+Setup: hidden_dim=24, hidden_dims=[24, 24, 24], phaseA_epochs=120, phaseB_epochs=180, phaseA_noise=0.80, phaseB_noise=1.45
 Phase B transform: rotation=68.0 deg, translation=(1.60, -1.30)
 Phase B train fraction: 0.05

-Backprop: A_pre=0.975+/-0.009, A_post=0.714+/-0.145, B_post=0.856+/-0.025, retention=0.732+/-0.145, balanced=0.785+/-0.064
-Predictive coding: A_pre=0.971+/-0.010, A_post=0.852+/-0.130, B_post=0.836+/-0.037, retention=0.878+/-0.131, balanced=0.844+/-0.058
-Circadian predictive coding: A_pre=0.972+/-0.009, A_post=0.829+/-0.143, B_post=0.833+/-0.029, retention=0.852+/-0.148, balanced=0.831+/-0.064, sleep_events=13.86, splits=27.57, prunes=0.00, hidden_end=51.57
+Backprop: A_pre=0.973+/-0.007, A_post=0.699+/-0.156, B_post=0.808+/-0.035, retention=0.718+/-0.158, balanced=0.753+/-0.089
+Predictive coding: A_pre=0.975+/-0.009, A_post=0.793+/-0.162, B_post=0.823+/-0.046, retention=0.815+/-0.170, balanced=0.808+/-0.063
+Circadian predictive coding: A_pre=0.975+/-0.008, A_post=0.784+/-0.164, B_post=0.841+/-0.026, retention=0.804+/-0.167, balanced=0.812+/-0.073, sleep_events=24.43, splits=48.57, prunes=0.00, hidden_end=72.57
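The updated balanced scores can be read against their spread to see why the README now calls the circadian vs. predictive coding ranking unstable. The sketch below is only an illustration: it assumes the `+/-` figures are per-seed standard deviations (the benchmark output does not say so), and `BalancedScore` and `gap_in_spread_units` are hypothetical names, not part of the repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BalancedScore:
    name: str
    mean: float
    spread: float  # the "+/-" value; ASSUMED to be a standard deviation across seeds


def gap_in_spread_units(first: BalancedScore, second: BalancedScore) -> float:
    """Return the absolute mean gap divided by the root-mean-square of the two spreads."""
    gap = abs(first.mean - second.mean)
    pooled = ((first.spread**2 + second.spread**2) / 2) ** 0.5
    return gap / pooled


# Values copied from the updated hardest-case benchmark output.
circadian = BalancedScore("circadian predictive coding", 0.812, 0.073)
predictive = BalancedScore("predictive coding", 0.808, 0.063)
backprop = BalancedScore("backprop", 0.753, 0.089)

for other in (predictive, backprop):
    ratio = gap_in_spread_units(circadian, other)
    print(f"circadian vs {other.name}: gap is {ratio:.2f} spread units")
```

A ratio well below 1 means the gap is smaller than the typical seed-to-seed variation, which is the situation the revised README bullet describes.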

docs/figures/interactive_hardest_mode_dynamics.html

Lines changed: 26 additions & 10 deletions
Large diffs are not rendered by default.

scripts/generate_hardest_mode_dynamics.py

Lines changed: 44 additions & 12 deletions
@@ -45,6 +45,7 @@ class HardestModeConfig:
     test_ratio: float = 0.25
     phase_b_train_fraction: float = 0.05
     hidden_dim: int = 24
+    hidden_dims: tuple[int, ...] = (24, 24, 24)
     phase_a_epochs: int = 120
     phase_b_epochs: int = 180
     phase_a_noise: float = 0.8
@@ -83,6 +84,7 @@ class HardestModeSnapshot:
     hidden_bias: FloatVector
     chemical_state: FloatVector
     plasticity_state: FloatVector
+    probe_adaptive_input: FloatVector
     probe_hidden_activation: FloatVector
     probe_output_probability: float

@@ -244,13 +246,24 @@ def collect_hardest_mode_snapshots(
     phase_a, phase_b = build_datasets(config)
     x_bounds, y_bounds = compute_bounds(phase_a, phase_b)

-    backprop = BackpropMLP(input_dim=2, hidden_dim=config.hidden_dim, seed=config.seed)
-    predictive = PredictiveCodingNetwork(input_dim=2, hidden_dim=config.hidden_dim, seed=config.seed + 1)
+    backprop = BackpropMLP(
+        input_dim=2,
+        hidden_dim=config.hidden_dim,
+        seed=config.seed,
+        hidden_dims=list(config.hidden_dims),
+    )
+    predictive = PredictiveCodingNetwork(
+        input_dim=2,
+        hidden_dim=config.hidden_dim,
+        seed=config.seed + 1,
+        hidden_dims=list(config.hidden_dims),
+    )
     circadian = CircadianPredictiveCodingNetwork(
         input_dim=2,
         hidden_dim=config.hidden_dim,
         seed=config.seed + 2,
         circadian_config=build_hardest_circadian_config(),
+        hidden_dims=list(config.hidden_dims),
     )

     snapshots: list[HardestModeSnapshot] = []
@@ -332,7 +345,8 @@ def collect_hardest_mode_snapshots(
             grid_size=config.decision_grid_size,
         )
         predictions = circadian.predict_label(phase_b.test_input).reshape(-1).astype(np.int8)
-        hidden_linear = probe_input @ circadian.weight_input_hidden + circadian.bias_hidden
+        _, _, probe_adaptive_input = circadian._forward_pre_hidden(probe_input)
+        hidden_linear = probe_adaptive_input @ circadian.weight_input_hidden + circadian.bias_hidden
         probe_hidden_activation = np.tanh(hidden_linear).reshape(-1).astype(np.float32)
         probe_output_probability = float(circadian.predict_proba(probe_input)[0, 0])
         input_hidden_weights = circadian.weight_input_hidden.astype(np.float32, copy=True)
@@ -364,6 +378,7 @@ def collect_hardest_mode_snapshots(
             hidden_bias=hidden_bias,
             chemical_state=chemical_state,
             plasticity_state=plasticity_state,
+            probe_adaptive_input=probe_adaptive_input.reshape(-1).astype(np.float32),
             probe_hidden_activation=probe_hidden_activation,
             probe_output_probability=probe_output_probability,
         )
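The probe change above stops multiplying the raw 2-D input by `weight_input_hidden` and feeds it the output of `_forward_pre_hidden` instead. One plausible reading is that, with a pre-hidden stack in front of it, the adaptive top layer's input weights have one row per unit of the last pre-hidden layer rather than one per input feature. The sketch below illustrates that shape flow only. It assumes `tanh` activations in the pre-hidden stack and zero biases, and `forward_pre_hidden` is a hypothetical stand-in; the real `_forward_pre_hidden` returns three values whose first two are not shown in this diff.

```python
import numpy as np


def forward_pre_hidden(inputs, weights, biases):
    """Hypothetical stand-in: pass inputs through each pre-hidden layer in turn.

    ASSUMPTION: tanh activations; the repository's actual activation is not shown.
    """
    activation = inputs
    for weight, bias in zip(weights, biases):
        activation = np.tanh(activation @ weight + bias)
    return activation


rng = np.random.default_rng(0)

# hidden_dims=(24, 24, 24) read as: two pre-hidden layers, then the adaptive top layer.
layer_widths = [2, 24, 24]  # input dim followed by the pre-hidden widths
pre_weights = [
    rng.normal(scale=0.5, size=(fan_in, fan_out))
    for fan_in, fan_out in zip(layer_widths[:-1], layer_widths[1:])
]
pre_biases = [np.zeros(width) for width in layer_widths[1:]]

# Adaptive top layer: rows match the last pre-hidden width, not the raw input dim.
weight_input_hidden = rng.normal(scale=0.5, size=(24, 24))
bias_hidden = np.zeros(24)

probe_input = np.array([[0.3, -0.7]])
probe_adaptive_input = forward_pre_hidden(probe_input, pre_weights, pre_biases)
hidden_linear = probe_adaptive_input @ weight_input_hidden + bias_hidden
probe_hidden_activation = np.tanh(hidden_linear).reshape(-1).astype(np.float32)

print(probe_adaptive_input.shape, probe_hidden_activation.shape)
```

Under these assumptions, the old expression `probe_input @ weight_input_hidden` would fail with a shape mismatch (2 columns against 24 rows), which is consistent with the hunk routing the probe through the pre-hidden stack first.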
@@ -646,6 +661,7 @@ def build_interactive_payload(
             "hidden_bias": np.round(snapshot.hidden_bias, 4).tolist(),
             "chemical_state": np.round(snapshot.chemical_state, 4).tolist(),
             "plasticity_state": np.round(snapshot.plasticity_state, 4).tolist(),
+            "probe_adaptive_input": np.round(snapshot.probe_adaptive_input, 4).tolist(),
             "probe_hidden_activation": np.round(snapshot.probe_hidden_activation, 4).tolist(),
             "probe_output_probability": round(snapshot.probe_output_probability, 4),
         }
@@ -937,18 +953,34 @@ def write_interactive_hardest_mode_html(

     function drawNetwork(frame) {{
       const hiddenCount = frame.circadian_hidden_dim;
+      const sourceCount = frame.probe_adaptive_input.length;
+      const sourceY = [];
       const hiddenY = [];
+      if (sourceCount <= 1) {{
+        sourceY.push(0.0);
+      }} else {{
+        for (let i = 0; i < sourceCount; i++) {{
+          sourceY.push(0.9 - (1.8 * i / (sourceCount - 1)));
+        }}
+      }}
       if (hiddenCount <= 1) {{
         hiddenY.push(0.0);
       }} else {{
         for (let i = 0; i < hiddenCount; i++) {{
-          hiddenY.push(0.85 - (1.7 * i / (hiddenCount - 1)));
+          hiddenY.push(0.9 - (1.8 * i / (hiddenCount - 1)));
         }}
       }}
-      const nodeX = [-1.0, -1.0];
-      const nodeY = [0.35, -0.35];
-      const nodeText = ["x1", "x2"];
-      const nodeValues = [payload.phase_b_test_points_x[0], payload.phase_b_test_points_y[0]];
+      const nodeX = [];
+      const nodeY = [];
+      const nodeText = [];
+      const nodeValues = [];
+
+      for (let i = 0; i < sourceCount; i++) {{
+        nodeX.push(-1.0);
+        nodeY.push(sourceY[i]);
+        nodeText.push("p" + (i + 1));
+        nodeValues.push(frame.probe_adaptive_input[i] ?? 0.0);
+      }}

       for (let i = 0; i < hiddenCount; i++) {{
         nodeX.push(0.0);
@@ -969,15 +1001,15 @@ def write_interactive_hardest_mode_html(
         ...inputHidden.flat().map((v) => Math.abs(v)),
         ...hiddenOutput.map((v) => Math.abs(v))
       );
-      for (let inputIndex = 0; inputIndex < 2; inputIndex++) {{
+      for (let inputIndex = 0; inputIndex < sourceCount; inputIndex++) {{
         for (let h = 0; h < hiddenCount; h++) {{
           const w = inputHidden[inputIndex][h];
           edgeTraces.push({{
             x: [-1.0, 0.0],
-            y: [inputIndex === 0 ? 0.35 : -0.35, hiddenY[h]],
+            y: [sourceY[inputIndex], hiddenY[h]],
             mode: "lines",
             hoverinfo: "text",
-            text: ["w_in[" + (inputIndex + 1) + "->h" + (h + 1) + "]=" + w.toFixed(4)],
+            text: ["w_pre[" + (inputIndex + 1) + "->h" + (h + 1) + "]=" + w.toFixed(4)],
             line: {{
               color: w >= 0 ? "#2874c8" : "#c45b58",
               width: 1 + 5 * Math.abs(w) / maxAbsWeight
@@ -1025,7 +1057,7 @@ def write_interactive_hardest_mode_html(
         title: {{text: "Circadian Internals Graph (nodes + edge weights)"}},
         margin: {{l: 20, r: 20, t: 42, b: 20}},
         xaxis: {{visible: false, range: [-1.25, 1.25]}},
-        yaxis: {{visible: false, range: [-1.0, 1.0]}},
+        yaxis: {{visible: false, range: [-1.1, 1.1]}},
         paper_bgcolor: "#ffffff",
         plot_bgcolor: "#ffffff"
       }};
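The `drawNetwork` changes replace the two hard-coded input positions (`0.35`, `-0.35`) with positions computed from however many values `probe_adaptive_input` carries, using the same even-spacing rule as the hidden column. A Python rendering of that rule is sketched below for reference; `spread_node_positions` is a hypothetical name, and the function mirrors the JavaScript expression `0.9 - (1.8 * i / (count - 1))` rather than any Python code in the repository.

```python
def spread_node_positions(count: int) -> list[float]:
    """Evenly space `count` nodes from y=0.9 (top) down to y=-0.9 (bottom).

    Mirrors the JavaScript in the diff: a count of 0 or 1 yields a single
    centred position at 0.0.
    """
    if count <= 1:
        return [0.0]
    return [0.9 - (1.8 * index / (count - 1)) for index in range(count)]


# With hidden_dims=(24, 24, 24) the source column would hold 24 nodes
# (assuming the adaptive layer's input width equals the last pre-hidden width).
source_y = spread_node_positions(24)
print(source_y[0], source_y[-1], len(source_y))
```

Because the outermost nodes now sit at `±0.9` instead of `±0.85`, the y-axis range in the layout hunk widens from `[-1.0, 1.0]` to `[-1.1, 1.1]`, which leaves room for the node markers.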

scripts/run_continual_shift_benchmark.py

Lines changed: 25 additions & 0 deletions
@@ -29,6 +29,7 @@ class ProfileDefaults:
     phase_a_epochs: int
     phase_b_epochs: int
     hidden_dim: int
+    hidden_dims: tuple[int, ...] | None
     phase_a_noise_scale: float
     phase_b_noise_scale: float
     phase_b_rotation_degrees: float
@@ -60,6 +61,12 @@ def build_parser() -> argparse.ArgumentParser:
     parser.add_argument("--phase-a-epochs", type=int, default=None)
     parser.add_argument("--phase-b-epochs", type=int, default=None)
     parser.add_argument("--hidden-dim", type=int, default=None)
+    parser.add_argument(
+        "--hidden-dims",
+        type=str,
+        default="",
+        help="Optional comma-separated hidden-layer widths (e.g. 24,24,24).",
+    )
     parser.add_argument("--phase-a-noise-scale", type=float, default=None)
     parser.add_argument("--phase-b-noise-scale", type=float, default=None)
     parser.add_argument("--phase-b-rotation-degrees", type=float, default=None)
@@ -87,6 +94,12 @@ def main() -> None:
             else _build_hardest_case_circadian_config()
         )
     )
+    cli_hidden_dims = _parse_optional_hidden_dims(args.hidden_dims)
+    selected_hidden_dims = (
+        cli_hidden_dims
+        if cli_hidden_dims is not None
+        else profile_defaults.hidden_dims
+    )
     config = ContinualShiftConfig(
         sample_count_phase_a=_resolve_optional_int(
             args.sample_count_phase_a, profile_defaults.sample_count_phase_a
@@ -100,6 +113,7 @@ def main() -> None:
         phase_a_epochs=_resolve_optional_int(args.phase_a_epochs, profile_defaults.phase_a_epochs),
         phase_b_epochs=_resolve_optional_int(args.phase_b_epochs, profile_defaults.phase_b_epochs),
         hidden_dim=_resolve_optional_int(args.hidden_dim, profile_defaults.hidden_dim),
+        hidden_dims=selected_hidden_dims,
         phase_a_noise_scale=_resolve_optional_float(
             args.phase_a_noise_scale, profile_defaults.phase_a_noise_scale
         ),
@@ -179,6 +193,7 @@ def _build_profile_defaults(profile: str) -> ProfileDefaults:
         phase_a_epochs=120,
         phase_b_epochs=180,
         hidden_dim=24,
+        hidden_dims=(24, 24, 24),
         phase_a_noise_scale=0.8,
         phase_b_noise_scale=1.45,
         phase_b_rotation_degrees=68.0,
@@ -194,6 +209,7 @@ def _build_profile_defaults(profile: str) -> ProfileDefaults:
         phase_a_epochs=110,
         phase_b_epochs=80,
         hidden_dim=12,
+        hidden_dims=None,
         phase_a_noise_scale=0.8,
         phase_b_noise_scale=1.0,
         phase_b_rotation_degrees=40.0,
@@ -223,5 +239,14 @@ def _parse_int_list(raw_values: str) -> list[int]:
     return [int(item) for item in items]


+def _parse_optional_hidden_dims(raw_values: str) -> tuple[int, ...] | None:
+    if not raw_values.strip():
+        return None
+    values = _parse_int_list(raw_values)
+    if any(value <= 0 for value in values):
+        raise ValueError("hidden_dims values must be positive")
+    return tuple(values)
+
+
 if __name__ == "__main__":
     main()
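The `--hidden-dims` handling in this script combines two steps: turn the comma-separated string into a tuple (or `None` when the flag is empty), then let an explicit CLI value win over the profile default. The sketch below shows that flow in isolation. The function names are hypothetical, and the comma-splitting is an assumption, since only the last line of the repository's `_parse_int_list` is visible in this diff.

```python
from __future__ import annotations


def parse_optional_hidden_dims(raw_values: str) -> tuple[int, ...] | None:
    """Parse "24,24,24" into (24, 24, 24); an empty or blank string means "not set"."""
    if not raw_values.strip():
        return None
    # ASSUMPTION: split on commas and skip blank items; the real helper may differ.
    values = [int(item.strip()) for item in raw_values.split(",") if item.strip()]
    if any(value <= 0 for value in values):
        raise ValueError("hidden_dims values must be positive")
    return tuple(values)


def select_hidden_dims(
    cli_value: str, profile_default: tuple[int, ...] | None
) -> tuple[int, ...] | None:
    """Prefer the CLI value when given; otherwise fall back to the profile default."""
    parsed = parse_optional_hidden_dims(cli_value)
    return parsed if parsed is not None else profile_default


print(select_hidden_dims("", (24, 24, 24)))       # profile default is used
print(select_hidden_dims("16,8", (24, 24, 24)))   # CLI override wins
print(select_hidden_dims("", None))               # single-hidden-layer profile
```

One consequence of the validation added in `src/app/continual_shift_benchmark.py` in this same commit: `hidden_dim` must equal the last entry of `hidden_dims`, so overriding `--hidden-dims` with a different final width would also require a matching `--hidden-dim`.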

src/adapters/cli.py

Lines changed: 10 additions & 0 deletions
@@ -30,6 +30,12 @@ def build_argument_parser() -> argparse.ArgumentParser:
     parser.add_argument("--samples", type=int, default=settings.dataset_size, help="Number of samples.")
     parser.add_argument("--epochs", type=int, default=settings.epoch_count, help="Training epochs.")
     parser.add_argument("--hidden-dim", type=int, default=12, help="Hidden layer width.")
+    parser.add_argument(
+        "--hidden-dims",
+        type=str,
+        default="",
+        help="Optional comma-separated hidden-layer widths (for multi-hidden-layer models).",
+    )
     parser.add_argument("--noise", type=float, default=0.8, help="Dataset noise scale.")
     parser.add_argument(
         "--noise-levels",
@@ -236,11 +242,15 @@ def main() -> None:
         replay_inference_steps=arguments.replay_inference_steps,
         replay_inference_learning_rate=arguments.replay_inference_learning_rate,
     )
+    hidden_dims = None
+    if arguments.hidden_dims.strip():
+        hidden_dims = tuple(_parse_int_list(arguments.hidden_dims))

     config = ExperimentConfig(
         sample_count=arguments.samples,
         noise_scale=arguments.noise,
         hidden_dim=arguments.hidden_dim,
+        hidden_dims=hidden_dims,
         epoch_count=arguments.epochs,
         circadian_sleep_interval=arguments.sleep_interval,
         circadian_force_sleep=(not arguments.respect_adaptive_sleep_trigger),

src/app/continual_shift_benchmark.py

Lines changed: 18 additions & 1 deletion
@@ -32,6 +32,7 @@ class ContinualShiftConfig:
     phase_b_translation_y: float = -0.7

     hidden_dim: int = 12
+    hidden_dims: tuple[int, ...] | None = None
     phase_a_epochs: int = 110
     phase_b_epochs: int = 80

@@ -180,6 +181,7 @@ def format_continual_shift_benchmark(result: ContinualShiftBenchmarkResult) -> s
         (
             "Setup: "
             f"hidden_dim={config.hidden_dim}, "
+            f"hidden_dims={list(config.hidden_dims) if config.hidden_dims is not None else [config.hidden_dim]}, "
             f"phaseA_epochs={config.phase_a_epochs}, phaseB_epochs={config.phase_b_epochs}, "
             f"phaseA_noise={config.phase_a_noise_scale:.2f}, phaseB_noise={config.phase_b_noise_scale:.2f}"
         ),
@@ -256,17 +258,25 @@ def _run_single_seed(config: ContinualShiftConfig, seed: int) -> ContinualShiftS
     )
     phase_b = _build_phase_b_dataset(config=config, seed=seed + 101)

-    backprop_model = BackpropMLP(input_dim=2, hidden_dim=config.hidden_dim, seed=seed)
+    resolved_hidden_dims = list(config.hidden_dims) if config.hidden_dims is not None else None
+    backprop_model = BackpropMLP(
+        input_dim=2,
+        hidden_dim=config.hidden_dim,
+        seed=seed,
+        hidden_dims=resolved_hidden_dims,
+    )
     predictive_coding_model = PredictiveCodingNetwork(
         input_dim=2,
         hidden_dim=config.hidden_dim,
         seed=seed + 1,
+        hidden_dims=resolved_hidden_dims,
     )
     circadian_model = CircadianPredictiveCodingNetwork(
         input_dim=2,
         hidden_dim=config.hidden_dim,
         seed=seed + 2,
         circadian_config=config.circadian_config,
+        hidden_dims=resolved_hidden_dims,
     )

     sleep_event_count = 0
@@ -553,6 +563,13 @@ def _validate_config(config: ContinualShiftConfig) -> None:
         raise ValueError("phase noise scales must be positive")
     if config.hidden_dim <= 0:
         raise ValueError("hidden_dim must be positive")
+    if config.hidden_dims is not None:
+        if len(config.hidden_dims) == 0:
+            raise ValueError("hidden_dims cannot be empty")
+        if any(hidden <= 0 for hidden in config.hidden_dims):
+            raise ValueError("all hidden_dims values must be positive")
+        if config.hidden_dim != config.hidden_dims[-1]:
+            raise ValueError("hidden_dim must match the last value in hidden_dims")
     if config.phase_a_epochs <= 0 or config.phase_b_epochs <= 0:
         raise ValueError("phase epochs must be positive")
     if config.backprop_learning_rate <= 0.0:
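The new validation ties the legacy `hidden_dim` field to the last entry of `hidden_dims`, and the `Setup:` line falls back to `[hidden_dim]` when no multi-layer layout is configured. The standalone sketch below restates those two rules so they can be exercised without the rest of `ContinualShiftConfig`. `validate_hidden_layout` and `effective_hidden_dims` are hypothetical helper names; the checks and messages follow the hunks above.

```python
from __future__ import annotations


def validate_hidden_layout(hidden_dim: int, hidden_dims: tuple[int, ...] | None) -> None:
    """Raise ValueError when the layout breaks the rules added to _validate_config."""
    if hidden_dim <= 0:
        raise ValueError("hidden_dim must be positive")
    if hidden_dims is None:
        return  # single-hidden-layer configuration; nothing more to check
    if len(hidden_dims) == 0:
        raise ValueError("hidden_dims cannot be empty")
    if any(width <= 0 for width in hidden_dims):
        raise ValueError("all hidden_dims values must be positive")
    if hidden_dim != hidden_dims[-1]:
        raise ValueError("hidden_dim must match the last value in hidden_dims")


def effective_hidden_dims(hidden_dim: int, hidden_dims: tuple[int, ...] | None) -> list[int]:
    """Return the layout reported in the benchmark's Setup line."""
    return list(hidden_dims) if hidden_dims is not None else [hidden_dim]


validate_hidden_layout(24, (24, 24, 24))        # hardest-case profile: accepted
validate_hidden_layout(12, None)                # default profile: accepted
print(effective_hidden_dims(24, (24, 24, 24)))
print(effective_hidden_dims(12, None))
```

Keeping `hidden_dim` equal to the final width presumably lets code that still reads `hidden_dim` (for example, the starting size of the adaptive top layer) stay consistent with the multi-layer layout, though the diff does not state the motivation.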

src/app/experiment_runner.py

Lines changed: 13 additions & 2 deletions
@@ -22,6 +22,7 @@ class ExperimentConfig:
     sample_count: int = 400
     noise_scale: float = 0.8
     hidden_dim: int = 12
+    hidden_dims: tuple[int, ...] | None = None
     epoch_count: int = 160
     backprop_learning_rate: float = 0.12
     pc_learning_rate: float = 0.05
@@ -79,15 +80,25 @@ def run_experiment(
         seed=config.random_seed,
     )

-    backprop_model = BackpropMLP(input_dim=2, hidden_dim=config.hidden_dim, seed=config.random_seed)
+    resolved_hidden_dims = list(config.hidden_dims) if config.hidden_dims is not None else None
+    backprop_model = BackpropMLP(
+        input_dim=2,
+        hidden_dim=config.hidden_dim,
+        seed=config.random_seed,
+        hidden_dims=resolved_hidden_dims,
+    )
     predictive_coding_model = PredictiveCodingNetwork(
-        input_dim=2, hidden_dim=config.hidden_dim, seed=config.random_seed + 1
+        input_dim=2,
+        hidden_dim=config.hidden_dim,
+        seed=config.random_seed + 1,
+        hidden_dims=resolved_hidden_dims,
     )
     circadian_model = CircadianPredictiveCodingNetwork(
         input_dim=2,
         hidden_dim=config.hidden_dim,
         seed=config.random_seed + 2,
         circadian_config=config.circadian_config,
+        hidden_dims=resolved_hidden_dims,
     )

     backprop_losses: list[float] = []
