Problem
When the campaign.yaml conflicts with patterns in the target repo's docs (e.g., CLAUDE.md, README examples, sample YAMLs), the DESIGN agent has no explicit hierarchy to follow. It tends to pattern-match on whichever corpus is larger — usually the docs — and produces a bundle that reflects the docs' conventions rather than the campaign's locked spec.
Concrete instance from paper-memorytime-mirage
BLIS's CLAUDE.md uses qwen/qwen3-14b in 10+ example invocations. The campaign locked meta-llama/llama-3.1-8b-instruct. The first DESIGN agent picked qwen because the docs' weight of evidence was larger. Both models are calibrated in BLIS's defaults.yaml, so the choice was internally defensible — but it silently rewrote the experimental physics (different π/δ in the latency model, different K derivation).
This compounds with F1 (no spec-fidelity enforcement): nous didn't catch the deviation because the bundle was self-consistent.
Desired behavior
The DESIGN agent's methodology prompt (the system prompt template under orchestrator/prompts/design.md or equivalent) must include an explicit hierarchy clause:
When the campaign.yaml conflicts with the target repo's documentation, sample configs, or example YAMLs, the campaign.yaml wins. Do not adopt patterns from the target repo's docs (e.g., model names, default concurrency, default block sizes) if they contradict any value declared in the campaign.yaml's workload, target_system, or locked_parameters sections. When in doubt, treat the campaign.yaml as the only source of truth for experiment design; treat the target repo as a source of truth only for how to invoke the system, not what to invoke it with.
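The precedence rule in the clause can be illustrated with a small sketch. This is not nous code: `resolve_design_values`, the dict layout, and the `block_size` value are hypothetical and used only for illustration. The three section names come from the clause itself. The sketch assumes both inputs have already been parsed into plain dicts.

```python
from typing import Any

# Section names taken from the hierarchy clause; the flat key layout is an assumption.
AUTHORITATIVE_SECTIONS = ("workload", "target_system", "locked_parameters")


def resolve_design_values(
    campaign: dict[str, Any], docs_defaults: dict[str, Any]
) -> tuple[dict[str, Any], list[str]]:
    """Merge values so anything declared in campaign.yaml overrides the docs.

    Returns the resolved values plus the keys where the docs disagreed.
    """
    resolved = dict(docs_defaults)  # docs may only fill gaps the campaign leaves open
    overridden: list[str] = []
    for section in AUTHORITATIVE_SECTIONS:
        for key, value in campaign.get(section, {}).items():
            if key in docs_defaults and docs_defaults[key] != value:
                overridden.append(key)  # record the conflict instead of hiding it
            resolved[key] = value  # the campaign.yaml wins
    return resolved, overridden


# Mirrors the paper-memorytime-mirage conflict; block_size is a made-up docs default.
campaign = {"locked_parameters": {"model": "meta-llama/llama-3.1-8b-instruct"}}
docs_defaults = {"model": "qwen/qwen3-14b", "block_size": 16}
resolved, overridden = resolve_design_values(campaign, docs_defaults)
```

Returning the overridden keys alongside the resolved values keeps each conflict visible, which is the kind of signal a spec-fidelity check (F1) could consume.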
Suggested implementation sketch
- Locate the design-phase methodology prompt (likely in orchestrator/prompts/design.md or wherever the SDK turn's system prompt is assembled).
- Add the hierarchy clause as a top-level section, before any section that instructs the agent to read the target repo.
- Add a worked example: "Example: campaign.yaml says model: llama-3.1-8b-instruct. Target repo's CLAUDE.md shows qwen/qwen3-14b in 10+ examples. Choose llama. The campaign.yaml wins."
- Update the design-phase regression tests (if any) to verify the prompt contains the hierarchy clause.
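A regression check for the last step might look like the sketch below. It assumes the clause keeps the phrase "the campaign.yaml wins", and that the section telling the agent to read the target repo contains the wording "read the target repo". Both marker strings and the function name `check_hierarchy_clause` are hypothetical. In a real test the prompt text would be loaded from the prompt file (for example orchestrator/prompts/design.md, if that is where it lives) rather than from inline strings.

```python
HIERARCHY_MARKER = "the campaign.yaml wins"
REPO_READ_MARKER = "read the target repo"  # hypothetical wording; adjust to the real prompt


def check_hierarchy_clause(prompt_text: str) -> list[str]:
    """Return a list of problems; an empty list means the prompt passes."""
    lowered = prompt_text.lower()
    clause_at = lowered.find(HIERARCHY_MARKER)
    if clause_at == -1:
        return ["hierarchy clause missing"]
    problems: list[str] = []
    repo_at = lowered.find(REPO_READ_MARKER)
    # The clause must come before any instruction to read the target repo.
    if repo_at != -1 and repo_at < clause_at:
        problems.append("hierarchy clause appears after the repo-reading section")
    return problems


good_prompt = (
    "Source-of-truth hierarchy\n"
    "When in conflict, the campaign.yaml wins.\n"
    "Target repo\n"
    "Read the target repo docs to learn invocation syntax.\n"
)
misordered_prompt = (
    "Target repo\n"
    "Read the target repo docs.\n"
    "Hierarchy\n"
    "The campaign.yaml wins.\n"
)
```

Checking position as well as presence covers the requirement that the clause sit before any section that instructs the agent to read the target repo.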
Acceptance criteria
Severity
MEDIUM-HIGH — drove F1 in part. Cheap structural fix at the prompt layer.
Source
friction-report.md F2, paper-memorytime-mirage campaign (2026-05).
Part of friction-report tracking issue #245.