feat: US-008 - Agent-based oracle curation tool

sjarmak · claude · sjarmak · commit 7c931cbc98c8 · 2026-02-20T20:29:33.000Z
Implements scripts/curate_oracle.py — a Sourcegraph-powered oracle
discovery tool that automatically generates exhaustive oracle_answer.json
files for MCP-unique benchmark tasks.

Key features:
- Calls SG GraphQL API (stdlib urllib) for file and symbol search
- Curates file_set_match, symbol_resolution, dependency_chain oracles
- Incremental: merges new findings with existing oracle_answer.json
- Rate limiting with exponential backoff on 429 responses
- --verify mode runs validate_mcp_task_instance.py post-curation
- --dry-run shows planned queries without API calls
- Writes oracle_curation_log.json for full query auditability
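
A minimal sketch of the rate-limiting item above, not the code in
scripts/curate_oracle.py. The 0.25s spacing and 3 retries come from the
progress notes in this commit; the function name, the 1s base delay, and the
injectable opener/sleep parameters are assumptions made for illustration.

```python
import time
import urllib.error
import urllib.request

MIN_INTERVAL = 0.25  # seconds between requests (per the progress notes)
MAX_RETRIES = 3      # retries after the first attempt (per the progress notes)


def fetch_with_backoff(request, opener=urllib.request.urlopen,
                       sleep=time.sleep, base_delay=1.0):
    """Send a request, retrying on HTTP 429 or URLError with doubling delays.

    `base_delay` is an assumed value; the delay before retry k is
    base_delay * 2**k. Other HTTP errors are re-raised immediately.
    """
    for attempt in range(MAX_RETRIES + 1):
        try:
            with opener(request) as resp:
                body = resp.read()
            sleep(MIN_INTERVAL)  # fixed spacing before the next request
            return body
        except urllib.error.HTTPError as exc:
            # HTTPError subclasses URLError, so it must be caught first.
            if exc.code != 429 or attempt == MAX_RETRIES:
                raise
        except urllib.error.URLError:
            if attempt == MAX_RETRIES:
                raise
        sleep(base_delay * (2 ** attempt))
```

Passing the opener and sleep function as parameters keeps the retry logic
testable without network access or real waiting.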

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
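
The incremental mode listed under key features could be sketched roughly as
follows. This is an illustration only: the commit names a
merge_oracle_answers() helper, but its actual behavior is not shown here, so
the function below uses a different name, and the entry shapes (dicts with
"repo"/"path" or "name" keys) are hypothetical.

```python
import json


def merge_answers_sketch(existing, new):
    """Merge two oracle-answer dicts, deduplicating list-valued fields.

    Assumption: list entries are JSON-serializable, and two entries are
    duplicates when their key-sorted JSON encodings match. Non-list fields
    from `new` overwrite those in `existing`.
    """
    merged = dict(existing)
    for key, value in new.items():
        old = merged.get(key, [])
        if isinstance(value, list) and isinstance(old, list):
            seen = set()
            combined = []
            for item in old + value:  # existing entries keep their order
                marker = json.dumps(item, sort_keys=True)
                if marker not in seen:
                    seen.add(marker)
                    combined.append(item)
            merged[key] = combined
        else:
            merged[key] = value
    return merged
```

Existing entries come first in each merged list, so re-running curation only
appends findings that were not already recorded.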
diff --git a/ralph-mcp-unique/prd.json b/ralph-mcp-unique/prd.json
@@ -199,7 +199,7 @@
         "python3 -m py_compile scripts/curate_oracle.py succeeds"
       ],
       "priority": 8,
-      "passes": false,
+      "passes": true,
       "notes": "This is the key automation that makes closed-world oracles feasible without human involvement. The tool should be thorough: for a file_set_match oracle, it should search EVERY repo in the fixture, not just the ones it expects to find results in. The curation log provides auditability. May not find 100% of items but should get close for well-scoped tasks. Run it, review the log, re-run if gaps are spotted."
     },
     {
diff --git a/ralph-mcp-unique/progress.txt b/ralph-mcp-unique/progress.txt
@@ -160,6 +160,29 @@
 [2026-02-20 20:21:26 UTC] Iteration 1 complete
 [2026-02-20 20:21:28 UTC] Iteration 2 started
 
+## 2026-02-20 - US-008: Agent-based oracle curation tool
+- Created `scripts/curate_oracle.py` (stdlib-only: urllib for SG API)
+- CLI: `--task-dir DIR`, `--task-spec PATH`, `--verify`, `--verbose`, `--dry-run`, `--max-results`
+- Implements SourcegraphClient class with graphql(), search_files(), search_symbols() methods
+- Oracle curation strategies: curate_file_set_match, curate_symbol_resolution, curate_dependency_chain, curate_provenance, curate_keyword_presence
+- Writes oracle_answer.json: {files, symbols, chains, chain, text, _metadata} compatible with oracle_checks.py
+- Writes oracle_curation_log.json: {task_id, sg_url, curation_entries, sg_request_log}
+- Incremental mode: merge_oracle_answers() deduplicates and merges new findings into existing oracle
+- Rate limiting: 0.25s between requests, 3-retry exponential backoff on 429/URLError
+- --verify: runs validate_mcp_task_instance.py on the curated oracle
+- --dry-run: shows planned queries without calling SG API
+- Project root discovery: walks up from task_dir AND from CWD to find fixtures/ directory
+- py_compile: OK, --help works, dry-run works, live SG API tested (requires SOURCEGRAPH_ACCESS_TOKEN)
+- Files changed: `scripts/curate_oracle.py` (new)
+- **Learnings for future iterations:**
+  - SG GraphQL API: `/.api/graphql` with `Authorization: token {token}` header
+  - File search: `query SearchFiles($query: String!)` with `... on FileMatch` fragment
+  - Symbol search: prefix query with `type:symbol` for dedicated symbol search endpoint
+  - Project root detection must try from CWD, not just from task_dir (temp files break otherwise)
+  - oracle_answer.json "chain" (flat list) vs "chains" (array of chain objects) — both needed for oracle_checks.py compat
+  - SG returns 403 without token, not 401 — handle gracefully as empty results
+---
+
 ## 2026-02-20 - US-003: SG indexing verification (completion)
 - Verified all 7 sg-benchmarks mirrors are now indexed in Sourcegraph via list_repos + keyword_search
   - sg-benchmarks/kubernetes-client-go ✓ (indexed, searchable)
diff --git a/scripts/curate_oracle.py b/scripts/curate_oracle.py
