Problem
nous's DESIGN intent is "iter-1 < iter-2 cost" (Occam's razor). The schema-blessed mechanism for narrowing iter-1 is experiment_spec.rehearsal_subset, which exposes seeds and arms only — i.e., it lets the agent reduce breadth (fewer cells). It does NOT expose any knob for depth (smaller cells, e.g., shorter duration_seconds, lower concurrency).
When the agent wants iter-1 cheaper but only has breadth-narrowing knobs, it bakes depth-shrinkage directly into verified_parameters. This is dangerous because scale-dependent apparatus checks become invalid silently:
- An empirical-PMF histogram on a 10s window may not stabilize.
- A "backlog-nonempty ≥ 99.9% post-warmup" check may pass trivially under low concurrency.
- A 30-second sliding-window arrival-curve check loses statistical power.
The right principle: retain physics validation with simplicity, instead of sacrificing physics for the sake of simplicity. Occam should narrow what's tested, not weaken what each test means.
Concrete instance
In paper-memorytime-mirage iter-1, the agent shrank duration_seconds from the campaign's locked 600 down to 60 (and concurrency_per_tenant from 32 to 8) to get a faster iter-1 — silently invalidating the workload-distribution histogram and backlog-nonempty checks at the iter-1 scale.
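A minimal sketch of how this kind of silent shrinkage could be detected by comparing the campaign's locked parameters against the ones iter-1 actually used. The function name `find_depth_shrinkage` and the `DEPTH_CLASS_PARAMS` set are hypothetical; the issue only names `duration_seconds` and `concurrency_per_tenant` as depth-class parameters, and nous may classify them differently.

```python
# Hypothetical set of depth-class parameter names (only these two appear in the issue).
DEPTH_CLASS_PARAMS = {"duration_seconds", "concurrency_per_tenant"}


def find_depth_shrinkage(locked: dict, iter1: dict) -> dict:
    """Return {param: (locked_value, iter1_value)} for each depth-class
    parameter whose iter-1 value is smaller than the campaign's locked value."""
    shrunk = {}
    for name in sorted(DEPTH_CLASS_PARAMS):
        if name in locked and name in iter1 and iter1[name] < locked[name]:
            shrunk[name] = (locked[name], iter1[name])
    return shrunk


# Values taken from the paper-memorytime-mirage instance described above.
locked = {"duration_seconds": 600, "concurrency_per_tenant": 32}
iter1 = {"duration_seconds": 60, "concurrency_per_tenant": 8}
print(find_depth_shrinkage(locked, iter1))
# {'concurrency_per_tenant': (32, 8), 'duration_seconds': (600, 60)}
```

A non-empty result means depth was reduced outside any declared override, which is the situation this issue wants to make explicit.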
Desired behavior
Extend rehearsal_subset to a richer schema that distinguishes the two modes:
    rehearsal_subset:
      seeds: [42]                   # narrows breadth; preserves cell physics
      arms: [h-main]                # narrows breadth; preserves cell physics
      depth_overrides:              # narrows depth; invalidates scale-dependent checks
        duration_seconds: 120
        concurrency_per_tenant: 8
        invalidates_checks:         # author MUST declare which checks become invalid
          - workload-distribution-histogram
          - backlog-nonempty-99.9
The depth_overrides block requires invalidates_checks to be present and non-empty if any depth-class parameter is overridden. This forces the agent (or campaign author) to be explicit about which apparatus guarantees they're surrendering.
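The rule could be enforced along these lines. This is a sketch only: it assumes the parsed `rehearsal_subset` arrives as a plain dict, that `invalidates_checks` sits inside `depth_overrides`, and that every other key in that block is a depth-class parameter. The function name `validate_rehearsal_subset` is hypothetical and not part of any existing nous API.

```python
INVALIDATES_KEY = "invalidates_checks"


def validate_rehearsal_subset(subset: dict) -> list[str]:
    """Return a list of error messages; an empty list means the subset is acceptable."""
    errors = []
    overrides = subset.get("depth_overrides")
    if not overrides:
        # Breadth-only narrowing (seeds/arms): nothing to declare.
        return errors

    # Assumption: every key other than invalidates_checks is a depth-class override.
    overridden = sorted(k for k in overrides if k != INVALIDATES_KEY)
    declared = overrides.get(INVALIDATES_KEY)

    if overridden and not (isinstance(declared, list) and declared):
        errors.append(
            "depth_overrides sets " + ", ".join(overridden)
            + " but invalidates_checks is missing or empty"
        )
    return errors


good = {
    "seeds": [42],
    "arms": ["h-main"],
    "depth_overrides": {
        "duration_seconds": 120,
        "concurrency_per_tenant": 8,
        "invalidates_checks": [
            "workload-distribution-histogram",
            "backlog-nonempty-99.9",
        ],
    },
}
bad = {"seeds": [42], "depth_overrides": {"duration_seconds": 120}}

print(validate_rehearsal_subset(good))  # []
print(validate_rehearsal_subset(bad))
```

Returning messages rather than raising keeps the sketch easy to fold into whatever error-reporting the real bundle validator uses.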
Suggested implementation sketch
- Extend the bundle schema to add rehearsal_subset.depth_overrides with a required invalidates_checks sub-field.
- In the methodology prompt for DESIGN, add a paragraph distinguishing breadth vs depth shrinkage, with the worked example above.
- The bundle validator rejects a bundle that has depth_overrides without an explicit invalidates_checks list.
- The findings synthesizer marks any check listed in invalidates_checks as "not run at design scale" rather than "passed/failed".
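The synthesizer step in the last bullet might look roughly like the following. It assumes check results arrive as a mapping from check name to a boolean pass flag, which is a guess at the data shape. `synthesize_check_statuses` and the check name `arrival-curve-window` are hypothetical and used only for illustration.

```python
NOT_RUN = "not run at design scale"


def synthesize_check_statuses(raw_results: dict, invalidated: list) -> dict:
    """Map each check to 'passed', 'failed', or NOT_RUN.

    Any check listed in `invalidated` is reported as NOT_RUN regardless of its
    raw outcome, because its result is not meaningful at the reduced scale.
    """
    invalid = set(invalidated)
    statuses = {}
    for name, ok in raw_results.items():
        if name in invalid:
            statuses[name] = NOT_RUN
        else:
            statuses[name] = "passed" if ok else "failed"
    # Declared checks that produced no raw result are still reported explicitly.
    for name in invalid:
        statuses.setdefault(name, NOT_RUN)
    return statuses


raw = {
    "workload-distribution-histogram": True,  # passed, but at an invalid scale
    "arrival-curve-window": False,            # hypothetical check name
}
declared = ["workload-distribution-histogram", "backlog-nonempty-99.9"]
for check, status in sorted(synthesize_check_statuses(raw, declared).items()):
    print(f"{check}: {status}")
```

Overriding the raw outcome, rather than dropping the check, keeps the surrendered guarantee visible in the findings.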
Acceptance criteria
Severity
MEDIUM — invalidated scale-dependent checks at iter-1 in this campaign.
Source
friction-report.md F3, paper-memorytime-mirage campaign (2026-05).
Part of friction-report tracking issue #245.