Commit 2624921
feat: five-way oracle curation bias mitigations
curate_oracle.py:
1. LLM query expansion — Claude Haiku generates N semantically diverse queries
from seed_prompt so no single keyword_search call recovers the full oracle;
falls back to pattern variation when ANTHROPIC_API_KEY absent or --no-llm
2. Content validation via lineMatches — _classify_line_context / _infer_file_tier
classify each hit as comment/string/code and annotate oracle files as
'required' (defines concept) or 'sufficient' (references/tests it)
3. Oracle quality gate — validate_oracle_quality warns on >15 files, >5 files/repo
from a single-term pattern, zero required-tier files, and tier imbalance
4. Two-tier oracle — oracle files carry {"tier": "required"|"sufficient"} so
weighted F1 in the evaluator concentrates score on definition-level files
5. Decouple search_pattern from curation — get_curation_queries checks
params["curation_queries"] first so task authors can specify exact search
queries without exposing them in the agent-visible search_pattern field
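The quality gate in item 3 can be sketched roughly as follows. This is a minimal illustration, not the code in curate_oracle.py: the function name `validate_oracle_quality_sketch`, the `repo`/`tier` keys on each file record, the whitespace test for "single-term", and the `max_sufficient_ratio` threshold for tier imbalance are all assumptions. Only the >15 files and >5 files/repo limits come from the commit message.

```python
from collections import Counter


def validate_oracle_quality_sketch(files, pattern, max_files=15,
                                   max_per_repo=5, max_sufficient_ratio=4.0):
    """Return a list of warning strings; an empty list means nothing fired.

    `files` is a list of dicts with hypothetical keys "repo" and "tier".
    `max_sufficient_ratio` is an assumed threshold for tier imbalance.
    """
    warnings = []

    # Warn when the oracle is larger than the stated limit.
    if len(files) > max_files:
        warnings.append(f"oracle has {len(files)} files (> {max_files})")

    # Assumption: a single-term pattern contains no whitespace.
    if len(pattern.split()) == 1:
        per_repo = Counter(f["repo"] for f in files)
        for repo, count in sorted(per_repo.items()):
            if count > max_per_repo:
                warnings.append(
                    f"{count} files in {repo} from single-term pattern "
                    f"(> {max_per_repo})"
                )

    required = sum(1 for f in files if f.get("tier") == "required")
    sufficient = sum(1 for f in files if f.get("tier") == "sufficient")
    if required == 0:
        warnings.append("no required-tier files")
    elif sufficient / required > max_sufficient_ratio:
        warnings.append(
            f"tier imbalance: {sufficient} sufficient vs {required} required"
        )
    return warnings
```

The sketch only collects warnings and never raises, matching the commit message's wording that the gate "warns".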
New CLI flags: --no-llm, --anthropic-api-key
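The query-resolution order implied by items 1 and 5 and the two flags might look like the sketch below. It is an illustration under stated assumptions: `get_curation_queries_sketch`, `pattern_variations`, and the injected `llm_expand` callable are hypothetical stand-ins, and the specific fallback variations (lower-case and snake_case) are invented for the example. The commit message only says the fallback is "pattern variation".

```python
import re


def pattern_variations(search_pattern):
    """Hypothetical fallback: derive a few spelling variants of the pattern."""
    snake = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", search_pattern).lower()
    candidates = [search_pattern, search_pattern.lower(), snake]
    # dict.fromkeys drops duplicates while keeping first-seen order.
    return [c for c in dict.fromkeys(candidates) if c]


def get_curation_queries_sketch(params, llm_expand=None, api_key=None,
                                no_llm=False):
    """Resolve curation queries in the order the commit message describes.

    1. Explicit params["curation_queries"] wins.
    2. Otherwise use LLM expansion of seed_prompt, if a key is present
       and --no-llm was not passed.
    3. Otherwise fall back to pattern variation.
    """
    explicit = params.get("curation_queries")
    if explicit:
        return list(explicit)
    if llm_expand is not None and api_key and not no_llm:
        return llm_expand(params["seed_prompt"])
    return pattern_variations(params.get("search_pattern", ""))
```

Passing the expansion step in as a callable keeps the sketch free of any real API client; the actual script presumably calls the model directly.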
oracle_checks.py + 215 task copies:
- check_file_set_match: adds weighted_recall, weighted_f1, required_recall,
required_total, required_matched when oracle has tier annotations;
backward-compatible (untiered oracles unchanged)
- _get_primary_score: prefers weighted_f1 over f1 for file_set_match when
tier annotations are present so required files count 2x in composite score
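To make the tiered scoring concrete, here is a minimal sketch of how the extra fields could be computed. It assumes oracle entries are dicts with a `path` key and an optional `tier` key, that required files carry weight 2 and all others weight 1, and that precision stays unweighted; the function name `check_file_set_match_sketch` is hypothetical and the real check may differ in all of these details.

```python
def check_file_set_match_sketch(predicted, oracle, required_weight=2.0):
    """Score a predicted file set against an oracle, optionally tier-weighted."""
    pred = set(predicted)
    oracle_paths = {f["path"] for f in oracle}
    matched = pred & oracle_paths

    precision = len(matched) / len(pred) if pred else 0.0
    recall = len(matched) / len(oracle_paths) if oracle_paths else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    result = {"precision": precision, "recall": recall, "f1": f1}

    # Untiered oracles keep the original result shape (backward compatible).
    if not any("tier" in f for f in oracle):
        return result

    def weight(entry):
        return required_weight if entry.get("tier") == "required" else 1.0

    total_w = sum(weight(f) for f in oracle)
    matched_w = sum(weight(f) for f in oracle if f["path"] in pred)
    weighted_recall = matched_w / total_w if total_w else 0.0
    denom = precision + weighted_recall
    weighted_f1 = 2 * precision * weighted_recall / denom if denom else 0.0

    required = [f for f in oracle if f.get("tier") == "required"]
    required_matched = sum(1 for f in required if f["path"] in pred)
    result.update({
        "weighted_recall": weighted_recall,
        "weighted_f1": weighted_f1,
        "required_total": len(required),
        "required_matched": required_matched,
        "required_recall": (required_matched / len(required)
                            if required else 0.0),
    })
    return result
```

For an oracle with one required and two sufficient files, predicting only the required file plus one wrong file gives a plain recall of 1/3 but a weighted recall of 2/4, which is how the weighting concentrates score on definition-level files.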
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 161e510 commit 2624921
File tree
217 files changed, +13997 -1073 lines changed
- benchmarks
- ccb_mcp_compliance
- ccx-compliance-051/tests
- ccx-compliance-052/tests
- ccx-compliance-053/tests
- ccx-compliance-057-ds/tests
- ccx-compliance-057/tests
- ccx-compliance-115/tests
- ccx-compliance-118/tests
- ccx-compliance-124/tests
- ccx-compliance-182/tests
- ccx-compliance-183/tests
- ccx-compliance-184/tests
- ccx-compliance-185/tests
- ccx-compliance-186/tests
- ccx-compliance-187/tests
- ccx-compliance-188/tests
- ccx-compliance-189/tests
- ccx-compliance-190/tests
- ccx-compliance-191/tests
- ccx-compliance-192/tests
- ccx-compliance-193/tests
- ccx-compliance-194/tests
- ccb_mcp_crossorg
- ccx-crossorg-061/tests
- ccx-crossorg-062/tests
- ccx-crossorg-066/tests
- ccx-crossorg-121/tests
- ccx-crossorg-132/tests
- ccx-crossorg-208/tests
- ccx-crossorg-209/tests
- ccx-crossorg-210/tests
- ccx-crossorg-211/tests
- ccx-crossorg-212/tests
- ccx-crossorg-213/tests
- ccx-crossorg-214/tests
- ccx-crossorg-215/tests
- ccx-crossorg-216/tests
- ccx-crossorg-217/tests
- ccx-crossorg-218/tests
- ccx-crossorg-219/tests
- ccx-crossorg-220/tests
- ccx-crossorg-221/tests
- ccx-crossorg-222/tests
- ccb_mcp_crossrepo_tracing
- ccx-config-trace-003/tests
- ccx-config-trace-010/tests
- ccx-dep-trace-001/tests
- ccx-dep-trace-002/tests
- ccx-dep-trace-004/tests
- ccx-dep-trace-102/tests
- ccx-dep-trace-116/tests
- ccx-dep-trace-123/tests
- ccx-dep-trace-133/tests
- ccx-dep-trace-171/tests
- ccx-dep-trace-172/tests
- ccx-dep-trace-173/tests
- ccx-dep-trace-174/tests
- ccx-dep-trace-175/tests
- ccx-dep-trace-176/tests
- ccx-dep-trace-177/tests
- ccx-dep-trace-178/tests
- ccx-dep-trace-179/tests
- ccx-dep-trace-180/tests
- ccx-dep-trace-181/tests
- ccb_mcp_crossrepo
- ccx-dep-trace-106/tests
- ccx-dep-trace-253/tests
- ccx-dep-trace-254/tests
- ccx-dep-trace-255/tests
- ccx-dep-trace-256/tests
- ccx-dep-trace-257/tests
- ccx-dep-trace-258/tests
- ccx-dep-trace-259/tests
- ccx-dep-trace-260/tests
- ccx-dep-trace-261/tests
- ccx-dep-trace-262/tests
- ccx-dep-trace-263/tests
- ccx-dep-trace-264/tests
- ccx-dep-trace-265/tests
- ccx-dep-trace-266/tests
- ccx-dep-trace-267/tests
- ccx-dep-trace-268/tests
- ccx-dep-trace-269/tests
- ccx-dep-trace-270/tests
- ccx-dep-trace-271/tests
- ccb_mcp_domain
- ccx-domain-071/tests
- ccx-domain-072/tests
- ccx-domain-073/tests
- ccx-domain-074/tests
- ccx-domain-101/tests
- ccx-domain-112/tests
- ccx-domain-120/tests
- ccx-domain-129/tests
- ccx-domain-137/tests
- ccx-domain-140/tests
- ccx-domain-151/tests
- ccx-domain-152/tests
- ccx-domain-153/tests
- ccx-domain-154/tests
- ccx-domain-155/tests
- ccx-domain-156/tests
- ccx-domain-157/tests
- ccx-domain-158/tests
- ccx-domain-159/tests
- ccx-domain-160/tests
- ccb_mcp_incident
- ccx-incident-031/tests
- ccx-incident-032/tests
- ccx-incident-033/tests
- ccx-incident-034/tests
- ccx-incident-037/tests
- ccx-incident-108/tests
- ccx-incident-110/tests
- ccx-incident-113/tests
- ccx-incident-125/tests
- ccx-incident-131/tests
- ccx-incident-139/tests
- ccx-incident-142/tests
- ccx-incident-143/tests
- ccx-incident-144/tests
- ccx-incident-145/tests
- ccx-incident-146/tests
- ccx-incident-147/tests
- ccx-incident-148/tests
- ccx-incident-149/tests
- ccx-incident-150/tests
- ccb_mcp_migration
- ccx-migration-022/tests
- ccx-migration-025/tests
- ccx-migration-026/tests
- ccx-migration-027/tests
- ccx-migration-107/tests
- ccx-migration-114/tests
- ccx-migration-117/tests
- ccx-migration-195/tests
- ccx-migration-196/tests
- ccx-migration-197/tests
- ccx-migration-198/tests
- ccx-migration-199/tests
- ccx-migration-200/tests
- ccx-migration-201/tests
- ccx-migration-202/tests
- ccx-migration-203/tests
- ccx-migration-204/tests
- ccx-migration-205/tests
- ccx-migration-206/tests
- ccx-migration-207/tests
- ccb_mcp_onboarding
- ccx-explore-042-ds/tests
- ccx-onboard-041/tests
- ccx-onboard-042/tests
- ccx-onboard-043/tests
- ccx-onboard-044/tests
- ccx-onboard-050-ds/tests
- ccx-onboard-050/tests
- ccx-onboard-103/tests
- ccx-onboard-109/tests
- ccx-onboard-128/tests
- ccx-onboard-134/tests
- ccx-onboard-136/tests
- ccx-onboard-138/tests
- ccb_mcp_org
- ccx-agentic-081/tests
- ccx-agentic-082/tests
- ccx-agentic-083/tests
- ccx-agentic-122/tests
- ccx-agentic-127/tests
- ccx-agentic-223/tests
- ccx-agentic-224/tests
- ccx-agentic-225/tests
- ccx-agentic-226/tests
- ccx-agentic-227/tests
- ccx-agentic-228/tests
- ccx-agentic-229/tests
- ccx-agentic-230/tests
- ccx-agentic-231/tests
- ccx-agentic-232/tests
- ccx-agentic-233/tests
- ccx-agentic-234/tests
- ccx-agentic-235/tests
- ccx-agentic-236/tests
- ccx-agentic-237/tests
- ccb_mcp_platform
- ccx-explore-091-ds/tests
- ccx-platform-091/tests
- ccx-platform-094/tests
- ccx-platform-100/tests
- ccx-platform-104/tests
- ccx-platform-119/tests
- ccx-platform-238/tests
- ccx-platform-239/tests
- ccx-platform-240/tests
- ccx-platform-241/tests
- ccx-platform-242/tests
- ccx-platform-243/tests
- ccx-platform-244/tests
- ccx-platform-245/tests
- ccx-platform-246/tests
- ccx-platform-247/tests
- ccx-platform-248/tests
- ccx-platform-249/tests
- ccx-platform-250/tests
- ccx-platform-251/tests
- ccx-platform-252/tests
- ccb_mcp_security
- ccx-vuln-remed-011/tests
- ccx-vuln-remed-012/tests
- ccx-vuln-remed-013/tests
- ccx-vuln-remed-014/tests
- ccx-vuln-remed-105/tests
- ccx-vuln-remed-111/tests
- ccx-vuln-remed-126/tests
- ccx-vuln-remed-130/tests
- ccx-vuln-remed-135/tests
- ccx-vuln-remed-141/tests
- ccx-vuln-remed-161/tests
- ccx-vuln-remed-162/tests
- ccx-vuln-remed-163/tests
- ccx-vuln-remed-164/tests
- ccx-vuln-remed-165/tests
- ccx-vuln-remed-166/tests
- ccx-vuln-remed-167/tests
- ccx-vuln-remed-168/tests
- ccx-vuln-remed-169/tests
- ccx-vuln-remed-170/tests
- scripts
- ccb_metrics
Lines changed: 63 additions & 4 deletions
Diff hunks (line content not captured):
| Original lines | New lines | Added | Removed |
|---|---|---|---|
| 132-137 | 132-140 | 3 | 0 |
| 142-148 | 145-158 | 8 | 1 |
| 159-164 | 169-184 | 10 | 0 |
| 169-175 | 189-195 | 1 | 1 |
| 178-183 | 198-234 | 31 | 0 |
| 462-470 | 513-529 | 10 | 2 |