Commit 8ee0e0e

Add harness contract alignment checks
1 parent 5639077 commit 8ee0e0e

8 files changed

Lines changed: 641 additions & 17 deletions

opencode/agents/developer.md

Lines changed: 2 additions & 0 deletions
@@ -44,6 +44,8 @@ You are the senior developer.
 
 When `lead` delegates direct mode for a small, clear, low-risk change without using a slash command, treat it as an approved implementation task.
 
+Later adjustments for that same implementation go back to `developer`; keep continuity and do not expect `lead` to implement them.
+
 Before editing, identify:
 
 - objective;
Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
+# Overview - Iteration 016
+
+## Failure Pattern
+
+The harness checker had coverage gaps for command and agent prompt drift. It
+verified that agents and commands were documented, but not enough of the
+contract-to-prompt alignment was enforced mechanically.
+
+The follow-up review found three concrete risks:
+
+- prose evidence containing markdown filenames could be misread as a missing path;
+- ordered flows could pass when the required agent names appeared in the wrong order;
+- cross-agent checks could pass when a prompt matched only one invariant.
+
+## Root Cause
+
+The checker grew incrementally. Existing checks protected the lead router,
+`/feature`, and `/plan`, but `/evolve`, `/scope`, `/design`, and broader
+contract-to-prompt alignment were not covered consistently.
+
+## Applied Fix
+
+The checker now adds bounded checks for `/evolve`, `/scope`, `/design`, and
+agent-prompt invariants. The path parser now accepts only strings that look like
+standalone repository-relative file paths. Regression tests cover the previously
+observed false positive and false negatives.
+
+## Risk
+
+These checks are intentionally bounded regex checks, not a natural-language
+parser. Future prompt rewrites may need checker pattern updates when wording
+changes but the contract remains valid.
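The "standalone repository-relative file path" rule from the Applied Fix section can be illustrated with a minimal sketch. This is not the code in `scripts/check-harness.mjs`; the name `looksLikeRepoPath` and the exact pattern are assumptions chosen only to show the idea: the whole string must be one path, so prose that merely mentions a filename is not treated as a path.

```javascript
// Hypothetical helper, not taken from scripts/check-harness.mjs.
// A string counts as a path only when the ENTIRE string is a single
// repository-relative file path: no whitespace, no leading "/",
// no ".." segment, and a file extension at the end.
const STANDALONE_PATH = /^(?!.*\.\.)[\w.-]+(?:\/[\w.-]+)*\.[A-Za-z0-9]+$/;

function looksLikeRepoPath(value) {
  return typeof value === "string" && STANDALONE_PATH.test(value.trim());
}

// A real path from the change manifest is accepted.
console.log(looksLikeRepoPath("docs/ai/harness/checks.md")); // true

// Evidence prose that mentions filenames is rejected because it contains spaces.
console.log(looksLikeRepoPath("node scripts/check-harness.mjs passed.")); // false
console.log(looksLikeRepoPath("Prose evidence mentions PRODUCT.md/DESIGN.md")); // false
```

A whole-string match keeps the check bounded, in line with the Risk section: it does not try to parse prose, it only refuses to treat prose as a path.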
Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
+{
+  "iteration": 16,
+  "evaluates_iteration": 16,
+  "results": [
+    {
+      "change_id": "chg-16-1",
+      "predicted_fixes_confirmed": [
+        "evolve-order-drift",
+        "scope-order-drift",
+        "agent-invariant-drift",
+        "prose-evidence-path-false-positive"
+      ],
+      "predicted_fixes_not_confirmed": [],
+      "risk_tasks_regressed": [],
+      "risk_tasks_not_regressed": [
+        "public-checker-patterns-match-english-prompts",
+        "no-private-config-or-paths-in-public-artifacts",
+        "existing-feature-plan-checks-unchanged"
+      ],
+      "unpredicted_regressions": [],
+      "decision": "keep",
+      "evidence": [
+        "node --test scripts/check-harness.test.mjs passed 4/4 tests.",
+        "node scripts/check-harness.mjs passed.",
+        "Repository-level scripts/check.sh passed.",
+        "Public artifact leak check found no private paths, local MCP config, providers, credentials, or local search wiring."
+      ],
+      "notes": "The public port keeps only harness checker, prompt-contract, docs, and summary AHE evidence. Private opencode.json, MCP servers, provider config, local memory data, and raw transcripts are intentionally excluded."
+    }
+  ]
+}
Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
+{
+  "iteration": 16,
+  "changes": [
+    {
+      "id": "chg-16-1",
+      "type": "new",
+      "description": "Add mechanical checker coverage for /evolve ordering, cross-agent prompt invariants, and /scope and /design command contracts.",
+      "files": [
+        "scripts/check-harness.mjs",
+        "scripts/check-harness.test.mjs",
+        "agents/developer.md",
+        "docs/ai/harness/checks.md"
+      ],
+      "failure_pattern": "Harness prompts and command contracts could drift without detection, and prior path validation treated prose evidence containing markdown filenames as a missing local path.",
+      "evidence": [
+        "docs/ai/evolution/runs/iteration-016-contract-prompt-alignment/evaluation.md",
+        "docs/ai/evolution/runs/iteration-016-contract-prompt-alignment/analysis/overview.md"
+      ],
+      "root_cause": "The checker validated coverage and a few critical flows, but did not enforce several command contracts or each configured agent invariant.",
+      "predicted_fixes": [
+        "evolve-order-drift",
+        "scope-order-drift",
+        "agent-invariant-drift",
+        "prose-evidence-path-false-positive"
+      ],
+      "risk_tasks": [
+        "public-checker-patterns-match-english-prompts",
+        "no-private-config-or-paths-in-public-artifacts",
+        "existing-feature-plan-checks-unchanged"
+      ],
+      "constraint_level": "tool",
+      "why_this_component": "The failure is mechanical checker coverage, so scripts/check-harness.mjs is the narrowest component. The developer prompt receives one explicit continuity rule to keep the documented contract reflected in the prompt."
+    }
+  ]
+}
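The change manifest and the evaluation result share identifiers (`predicted_fixes`, `risk_tasks`), so their consistency can be checked mechanically. The sketch below is an assumption about how such a cross-check might look; the commit does not show one, and the function name `unaccountedIds` and its return shape are hypothetical. Only the JSON field names come from the two files in this commit.

```javascript
// Hypothetical cross-check, not part of the commit.
// Returns the ids from the manifest that the evaluation neither
// confirmed nor explicitly marked as not confirmed / regressed.
function unaccountedIds(change, result) {
  const fixesSeen = new Set([
    ...result.predicted_fixes_confirmed,
    ...result.predicted_fixes_not_confirmed,
  ]);
  const risksSeen = new Set([
    ...result.risk_tasks_regressed,
    ...result.risk_tasks_not_regressed,
  ]);
  return {
    fixes: change.predicted_fixes.filter((id) => !fixesSeen.has(id)),
    risks: change.risk_tasks.filter((id) => !risksSeen.has(id)),
  };
}

// Values copied from the manifest and evaluation for chg-16-1.
const change = {
  predicted_fixes: [
    "evolve-order-drift",
    "scope-order-drift",
    "agent-invariant-drift",
    "prose-evidence-path-false-positive",
  ],
  risk_tasks: [
    "public-checker-patterns-match-english-prompts",
    "no-private-config-or-paths-in-public-artifacts",
    "existing-feature-plan-checks-unchanged",
  ],
};
const result = {
  predicted_fixes_confirmed: [...change.predicted_fixes],
  predicted_fixes_not_confirmed: [],
  risk_tasks_regressed: [],
  risk_tasks_not_regressed: [...change.risk_tasks],
};

// Both lists are empty: every predicted fix and risk task is accounted for.
console.log(unaccountedIds(change, result));
```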
Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
+# Iteration 016 - Contract/Prompt Alignment
+
+## Result
+
+Decision: `improve`.
+
+The checker now covers additional harness contracts:
+
+- `/evolve` must preserve the `evaluator -> debugger -> evolver` ordering.
+- `/scope` must preserve `researcher -> specifier` ordering.
+- `/design` must keep its Open Design contract.
+- Agent prompts must keep each configured invariant from `docs/ai/harness/agents.md`.
+- AHE JSON evidence prose may mention markdown filenames without being treated as a path.
+
+## Scenarios
+
+| Scenario | Expected | Actual |
+| --- | --- | --- |
+| `node --check scripts/check-harness.mjs` | pass | pass |
+| `node scripts/check-harness.mjs` | pass | pass |
+| Prose evidence mentions `PRODUCT.md/DESIGN.md` | pass | pass |
+| `/evolve` order is inverted | fail | fail |
+| `/scope` order is inverted | fail | fail |
+| `lead` prompt keeps only one invariant | fail | fail |
+| `git diff --check` | pass | pass |
+
+## Notes
+
+- This public artifact contains summary evidence only.
+- No raw transcripts, private providers, MCP configuration, credentials, or local
+  machine paths are included.
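The ordering contracts listed under Result (`evaluator -> debugger -> evolver` for `/evolve`, `researcher -> specifier` for `/scope`) amount to a subsequence check over the command text. A minimal sketch follows, assuming plain substring search instead of the bounded regex checks the commit describes; `appearsInOrder` is a hypothetical name and the sample command texts are invented for illustration.

```javascript
// Hypothetical ordering check, not the code in scripts/check-harness.mjs.
// Returns true when every name in `order` occurs in `text` and each
// occurrence is found after the previous name's occurrence.
function appearsInOrder(text, order) {
  let from = 0;
  for (const name of order) {
    const at = text.indexOf(name, from);
    if (at === -1) return false; // missing, or only present before `from`
    from = at + name.length;
  }
  return true;
}

const evolveOrder = ["evaluator", "debugger", "evolver"];

// Invented sample texts standing in for a command prompt.
console.log(appearsInOrder("Run evaluator, then debugger, then evolver.", evolveOrder)); // true
console.log(appearsInOrder("Run evolver, then debugger, then evaluator.", evolveOrder)); // false
```

Checking only that all three names are present would accept the inverted text; advancing the search offset after each match is what makes the inverted scenario fail. One known limit of this sketch: a text that mentions the names in both orders still passes.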

opencode/docs/ai/harness/checks.md

Lines changed: 3 additions & 0 deletions
@@ -27,6 +27,9 @@ The harness check validates:
 `docs/ai/harness/commands.md`;
 - local `/feature` contract;
 - local `/plan` contract;
+- local `/evolve` contract;
+- minimum consistency between agent contracts and prompts in `agents/*.md`;
+- local `/scope` and `/design` contracts;
 - main docs in `docs/ai/harness/`;
 - benchmark references to replay and evidence taxonomy;
 - AHE run lifecycle under `docs/ai/evolution/runs/`;
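The "minimum consistency between agent contracts and prompts" item corresponds to the risk noted in the overview: a prompt could pass while matching only one invariant. A sketch of the stricter rule is below. The function name and the two patterns are assumptions, loosely modeled on the continuity sentence added to `agents/developer.md`; the actual invariants are configured in `docs/ai/harness/agents.md` and are not shown in this commit.

```javascript
// Hypothetical invariant check, not the code in scripts/check-harness.mjs.
// Returns the source of every pattern the prompt fails to match, so a
// prompt passes only when ALL invariants hold (an empty array).
function missingInvariants(prompt, patterns) {
  return patterns.filter((p) => !p.test(prompt)).map((p) => p.source);
}

// Illustrative patterns only. No `g` flag, so `test` keeps no state between calls.
const developerInvariants = [
  /adjustments .* go back to `developer`/i,
  /do not expect `lead` to implement/i,
];

const fullPrompt =
  "Later adjustments for that same implementation go back to `developer`; " +
  "keep continuity and do not expect `lead` to implement them.";
const driftedPrompt =
  "Later adjustments for that same implementation go back to `developer`.";

console.log(missingInvariants(fullPrompt, developerInvariants)); // []
console.log(missingInvariants(driftedPrompt, developerInvariants).length); // 1
```

Returning the list of missing patterns, instead of a boolean from `some` or `every`, also tells the author which invariant drifted.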
