Commit 111b5fe

LoCoBench Bot and claude committed
chore: mark US-004 as passing, update progress log
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent ff2c88e commit 111b5fe

2 files changed

Lines changed: 16 additions & 1 deletion

ralph-gapfill-infra/prd.json

Lines changed: 1 addition & 1 deletion
@@ -64,7 +64,7 @@
 "python3 scripts/generate_manifest.py still runs without errors after archival"
 ],
 "priority": 4,
-"passes": false,
+"passes": true,
 "notes": "Check if there's a configs/repoqa_2config.sh or repoqa_3config.sh. Move it too. Don't delete — archive for reference."
 },
 {

ralph-gapfill-infra/progress.txt

Lines changed: 15 additions & 0 deletions
@@ -53,3 +53,18 @@
 - compare_configs.py imports DIR_PREFIX_TO_SUITE from aggregate_status.py rather than defining its own copy
 - generate_manifest.py exits 1 when runs/official doesn't exist — expected in worktrees without run data
 ---
+
+## 2026-02-16 - US-004
+- Archived saturated ccb_repoqa benchmark (1.000/1.000 on both configs = zero signal)
+- Moved benchmarks/ccb_repoqa/ → benchmarks/archive/ccb_repoqa/
+- Moved configs/repoqa_2config.sh → configs/archive/repoqa_2config.sh
+- Removed 10 repoqa entries from configs/selected_benchmark_tasks.json
+- Updated benchmarks/README.md: removed from active list, added Archived Benchmarks section, renumbered remaining, updated totals
+- Recalculated metadata in selected_benchmark_tasks.json (total_selected, tasks_per_benchmark, language stats, avg MCP score)
+- Files changed: benchmarks/ccb_repoqa/ (moved), configs/repoqa_2config.sh (moved), configs/selected_benchmark_tasks.json, benchmarks/README.md
+- **Learnings for future iterations:**
+- `git mv` preserves history for archived directories — preferred over manual copy+delete
+- selected_benchmark_tasks.json metadata.total_selected was stale (171 vs actual 200) — use Python to recount from tasks array when modifying
+- DIR_PREFIX_TO_SUITE mappings for repoqa_ were left in place in scripts — archived suites' run data may still exist in runs/official and the scripts should still recognize them
+- When archiving suites: move benchmark dir, move config script, remove from selected_benchmark_tasks.json, update benchmarks/README.md
+---
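
The learning about stale `metadata.total_selected` suggests deriving the counts from the tasks array instead of editing them by hand. The following is a minimal sketch of that idea, not the script used in this commit. It assumes the file has a top-level `tasks` array and a `metadata` object, and that each task carries a `benchmark` key naming its suite; that key, the helper names `recount_metadata` and `drop_suite`, and the `ccb_other` suite in the sample are hypothetical. Language stats and the average MCP score are left out because their source fields are not shown here.

```python
import json
from collections import Counter


def recount_metadata(doc: dict) -> dict:
    """Return a copy of doc whose metadata counts are rebuilt from doc["tasks"]."""
    tasks = doc.get("tasks", [])
    # "benchmark" is an assumed per-task key; adjust it to the real schema.
    per_benchmark = Counter(task["benchmark"] for task in tasks)
    metadata = dict(doc.get("metadata", {}))
    metadata["total_selected"] = len(tasks)
    metadata["tasks_per_benchmark"] = dict(sorted(per_benchmark.items()))
    return {**doc, "metadata": metadata}


def drop_suite(doc: dict, suite: str) -> dict:
    """Remove every task that belongs to suite, then recount the metadata."""
    kept = [task for task in doc.get("tasks", []) if task["benchmark"] != suite]
    return recount_metadata({**doc, "tasks": kept})


# Tiny in-memory sample with a deliberately stale total_selected.
sample = {
    "metadata": {"total_selected": 99},
    "tasks": [
        {"id": "t1", "benchmark": "ccb_repoqa"},
        {"id": "t2", "benchmark": "ccb_other"},
        {"id": "t3", "benchmark": "ccb_other"},
    ],
}

updated = drop_suite(sample, "ccb_repoqa")
print(json.dumps(updated["metadata"], indent=2))
```

Both helpers return new dictionaries and leave the input untouched, so the result can be compared against the original before anything is written back to configs/selected_benchmark_tasks.json.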
