Commit b08164e

sjarmak and claude committed
Complete ground truth re-curation (160/160 SDLC + 207/207 Org) and promote to canonical
- Complete remaining 12 SDLC ground truth agent files (8 via Daytona curator, 4 linux kernel tasks manual from canonical)
- Promote all 367 _agent variants to canonical via promote_agent_oracles.py --force
- Hydrate task_spec.json files from promoted oracles
- Fix extract_v2_report_data.py to scan runs/official/_raw/ (was missing all run data)
- Run IR analysis pipeline: normalize_retrieval_events + compute_retrieval_metrics
- IR results: file_recall=0.394, MRR=0.352 (1921 computable tasks)
- MCP advantage confirmed: mcp-remote-artifact recall=0.596 vs baseline recall=0.330
- V2 report: 375 paired tasks, BL=0.459, MCP=0.480, delta=+0.021

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent b067d06 commit b08164e
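
For readers unfamiliar with the retrieval metrics quoted in the message, the sketch below shows the textbook definitions of file recall and mean reciprocal rank (MRR). It is only an illustration: the function names and input shapes are hypothetical, and whether compute_retrieval_metrics uses exactly these definitions is an assumption, not something the commit confirms.

```python
from typing import Iterable, List, Sequence, Tuple


def file_recall(retrieved: Iterable[str], relevant: Iterable[str]) -> float:
    """Fraction of ground-truth files that appear among the retrieved files."""
    relevant_set = set(relevant)
    if not relevant_set:
        # A task with no ground-truth files has no defined recall.
        raise ValueError("recall is undefined without ground-truth files")
    return len(relevant_set & set(retrieved)) / len(relevant_set)


def reciprocal_rank(ranked: Sequence[str], relevant: Iterable[str]) -> float:
    """1 / rank of the first relevant file in retrieval order, or 0.0 if none."""
    relevant_set = set(relevant)
    for rank, path in enumerate(ranked, start=1):
        if path in relevant_set:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(tasks: List[Tuple[Sequence[str], Iterable[str]]]) -> float:
    """Average reciprocal rank over (ranked_files, relevant_files) pairs."""
    if not tasks:
        raise ValueError("MRR is undefined for an empty task list")
    return sum(reciprocal_rank(ranked, rel) for ranked, rel in tasks) / len(tasks)
```

Under these definitions, a task whose ground truth is two files and whose agent touched one of them scores a recall of 0.5, and a task whose first relevant file appears second in retrieval order contributes a reciprocal rank of 0.5.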

File tree

0 file changed

+0
-0
lines changed
