Commit 214b0f0
feat(reporting): add researchReport executive summary layer (#34)
researchReport composes summaryTable, paretoChart, gainHistogram, held-out
gate decisions, and optional failureClusterView output into one structured
artifact for coding-vertical benchmark runs:
- promote/hold/reject decision with rationale, risks, next actions
- per-candidate stats (Wilcoxon q-value, Cohen's d, paired N, gain CI)
- Pareto frontier flagging
- markdown, HTML, and JSON chart specs
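
As a rough illustration of two of the per-candidate items listed above, here is a minimal sketch of a paired Cohen's d and a Pareto frontier flag. It is not the code added in this commit: the `CandidatePoint` shape, the `cohensDPaired` and `flagParetoFrontier` names, and the choice of gain (higher is better) versus cost (lower is better) as the two Pareto axes are all assumptions made for the example.

```typescript
// Hypothetical shape for illustration only; not the type used by researchReport.
interface CandidatePoint {
  id: string;
  gain: number; // assumed axis: higher is better
  cost: number; // assumed axis: lower is better
}

// Cohen's d for paired samples (often written d_z): the mean of the paired
// differences divided by their sample standard deviation.
// Returns NaN or +/-Infinity when all differences are identical (zero variance).
function cohensDPaired(baseline: number[], candidate: number[]): number {
  if (baseline.length !== candidate.length || baseline.length < 2) {
    throw new Error("need at least two paired observations of equal length");
  }
  const diffs = candidate.map((c, i) => c - baseline[i]);
  const n = diffs.length;
  const mean = diffs.reduce((sum, x) => sum + x, 0) / n;
  const variance =
    diffs.reduce((sum, x) => sum + (x - mean) * (x - mean), 0) / (n - 1);
  return mean / Math.sqrt(variance);
}

// A candidate is on the Pareto frontier when no other candidate is at least
// as good on both axes and strictly better on at least one.
function flagParetoFrontier(points: CandidatePoint[]): Record<string, boolean> {
  const flags: Record<string, boolean> = {};
  for (const p of points) {
    const dominated = points.some(
      (q) =>
        q !== p &&
        q.gain >= p.gain &&
        q.cost <= p.cost &&
        (q.gain > p.gain || q.cost < p.cost),
    );
    flags[p.id] = !dominated;
  }
  return flags;
}

// Usage with made-up numbers.
const effect = cohensDPaired([1, 2, 3], [2, 4, 3]);
const frontier = flagParetoFrontier([
  { id: "a", gain: 0.1, cost: 1 },
  { id: "b", gain: 0.2, cost: 2 },
  { id: "c", gain: 0.05, cost: 1.5 },
]);
console.log(effect, frontier);
```

The pairwise dominance check is quadratic in the number of candidates, which should be acceptable for the handful of candidates a benchmark run typically compares.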
Wired into src/index.ts and the @tangle-network/agent-eval/reporting
subpath. Tests in tests/summary-report.test.ts cover decision logic,
candidate scoring, and output formats.

1 parent b66856d commit 214b0f0
5 files changed
Lines changed: 618 additions & 0 deletions