Commit 423eef5

docs: update Harness ERP follow-up evidence
1 parent 387dbfa commit 423eef5

4 files changed

Lines changed: 132 additions & 52 deletions

docs/evaluation.md

Lines changed: 1 addition & 1 deletion
@@ -110,7 +110,7 @@ cheaper to correct after the harness becomes part of the repository.
 
 - [Small harness outcome evidence report](examples/effectiveness-report-small-evidence.md) records three harnessed task outcomes and summarizes a narrow operational evidence pass without treating Harness Doctor scores or passing checks as proof of agent effectiveness.
 - [TodayBus harnessed-only dogfood benchmark](examples/effectiveness-report-todaybus-dogfood.md) records three product-task outcomes, excludes a non-comparable setup run, and treats the result as an initial benchmark rather than proof of effectiveness improvement.
-- [Harness ERP Spring/Maven dogfood benchmark](examples/effectiveness-report-harness-erp-dogfood.md) records five backend product-task outcomes, one honest boundary miss, prompt hashes, failure-memory linkage, and source tracking as initial benchmark evidence rather than proof of effectiveness improvement.
+- [Harness ERP Spring/Maven dogfood benchmark](examples/effectiveness-report-harness-erp-dogfood.md) records five initial and four follow-up backend product-task outcomes, one honest boundary miss, prompt hashes, failure-memory linkage, source tracking, and CI verification evidence while keeping harnessed-only observations separate from effectiveness-improvement claims.
 
 Before adding a new dogfood report to this kit, use
 [`docs/checklists/dogfood-evidence-adoption.md`](checklists/dogfood-evidence-adoption.md)
