Commit b08164e
Complete ground truth re-curation (160/160 SDLC + 207/207 Org) and promote to canonical
- Complete remaining 12 SDLC ground truth agent files (8 via Daytona curator,
4 Linux kernel tasks manually from canonical)
- Promote all 367 _agent variants to canonical via promote_agent_oracles.py --force
- Hydrate task_spec.json files from promoted oracles
- Fix extract_v2_report_data.py to scan runs/official/_raw/ (was missing all run data)
- Run IR analysis pipeline: normalize_retrieval_events + compute_retrieval_metrics
- IR results: file_recall=0.394, MRR=0.352 (1921 computable tasks)
- MCP advantage confirmed: mcp-remote-artifact recall=0.596 vs baseline recall=0.330
- V2 report: 375 paired tasks, BL=0.459, MCP=0.480, delta=+0.021
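The commit reports file_recall and MRR but does not show how compute_retrieval_metrics defines them. The sketch below assumes the standard definitions: per-task file recall is the fraction of ground-truth files that appear among the retrieved files, and reciprocal rank is 1/rank of the first ground-truth file in the ranked retrieval list (0 if none is retrieved), each averaged over computable tasks. The function names and the sample paths are hypothetical and are not taken from the repository.

```python
from typing import Iterable, Sequence


def file_recall(retrieved: Iterable[str], relevant: Iterable[str]) -> float:
    """Fraction of ground-truth files that were retrieved (assumed definition)."""
    relevant_set = set(relevant)
    if not relevant_set:
        # A task with no ground-truth files is not computable under this definition.
        raise ValueError("task has no ground-truth files")
    return len(relevant_set & set(retrieved)) / len(relevant_set)


def reciprocal_rank(ranked: Sequence[str], relevant: Iterable[str]) -> float:
    """1/rank of the first ground-truth file in the ranked list, 0.0 if none."""
    relevant_set = set(relevant)
    for rank, path in enumerate(ranked, start=1):
        if path in relevant_set:
            return 1.0 / rank
    return 0.0


def mean(values: Sequence[float]) -> float:
    """Arithmetic mean over per-task scores."""
    if not values:
        raise ValueError("no computable tasks")
    return sum(values) / len(values)


# Hypothetical two-task example: (ranked retrieved files, ground-truth files).
tasks = [
    (["kernel/sched.c", "mm/slab.c"], ["mm/slab.c", "mm/page_alloc.c"]),
    (["README.md", "fs/inode.c"], ["net/socket.c"]),
]
mean_recall = mean([file_recall(r, g) for r, g in tasks])
mrr = mean([reciprocal_rank(r, g) for r, g in tasks])
```

Under these assumptions, tasks with an empty ground-truth set are excluded rather than scored as zero, which is one plausible reading of "computable tasks" in the IR results line.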
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent b067d06 commit b08164e
0 files changed, +0 -0 lines changed