Commit 8f46d49

sjarmak and claude committed
fix: composite score renormalization + CCB weighted sampling
Bug 1: compute_composite_score() was unreachable at 0.65 threshold because chain_recall/symbol_recall defaulted to 0 when unavailable, capping max score at 0.70. Fix: None signals "not measured" and weight is redistributed proportionally to available components.

Bug 2: ccb_weighted_sample() duplicated task objects for 2x boost then deduped by identity, so the boost never worked. Rewrote to boost effective stratum allocation sizes and prefer CCB tasks within each stratum.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
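The renormalization described under Bug 1 can be sketched as follows. This is a minimal illustration, not the repository's code: the weight table and the names `file_recall` and `precision` are assumptions, chosen only so that the two recall components together carry 0.30 of the weight, which matches the 0.70 cap the message mentions.

```python
from typing import Mapping, Optional

# HYPOTHETICAL weights. Only chain_recall and symbol_recall are named in the
# commit message; the other names and all numeric values are assumptions.
WEIGHTS = {
    "file_recall": 0.40,
    "precision": 0.30,
    "chain_recall": 0.15,
    "symbol_recall": 0.15,
}

Components = Mapping[str, Optional[float]]


def legacy_composite_score(components: Components) -> float:
    """Pre-fix behaviour as described: an unavailable component counts as 0."""
    return sum(w * (components.get(name) or 0.0) for name, w in WEIGHTS.items())


def composite_score(components: Components) -> float:
    """Post-fix behaviour as described: None means "not measured", and the
    weight of unmeasured components is spread over the measured ones."""
    measured = {
        name: components[name]
        for name in WEIGHTS
        if components.get(name) is not None
    }
    total_weight = sum(WEIGHTS[name] for name in measured)
    if total_weight == 0:
        return 0.0  # nothing was measured at all
    # Dividing by the measured weight rescales each weight proportionally.
    return sum(WEIGHTS[name] * value for name, value in measured.items()) / total_weight


perfect_but_partial = {
    "file_recall": 1.0,
    "precision": 1.0,
    "chain_recall": None,
    "symbol_recall": None,
}
```

Under these assumed weights, a run that is perfect on every measured component scores about 0.70 with the legacy function and 1.0 after renormalization, because a measured zero and a missing measurement are no longer conflated.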
1 parent d5a8d29 commit 8f46d49
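The Bug 2 failure mode and the shape of the fix can be sketched in a few lines. Everything here is illustrative: `naive_boost`, `boosted_sample`, the `stratum_of` and `is_ccb` callbacks, and the proportional rounding rule are assumptions, not the actual `ccb_weighted_sample()` implementation.

```python
import random
from collections import defaultdict


def naive_boost(tasks, is_ccb):
    """Duplicate-then-dedupe, as the commit message describes the old code."""
    pool = tasks + [t for t in tasks if is_ccb(t)]  # CCB tasks listed twice
    seen, unique = set(), []
    for task in pool:
        if id(task) not in seen:  # dedupe by object identity
            seen.add(id(task))
            unique.append(task)
    return unique  # the duplicates are gone, so the 2x boost has no effect


def boosted_sample(tasks, n, stratum_of, is_ccb, boost=2.0, seed=0):
    """Sketch of the fix: boost each stratum's effective size, then prefer
    CCB tasks inside the stratum. Rounded quotas may not sum exactly to n."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for task in tasks:
        strata[stratum_of(task)].append(task)

    # A CCB task counts `boost` times toward its stratum's effective size.
    effective = {
        key: sum(boost if is_ccb(t) else 1.0 for t in members)
        for key, members in strata.items()
    }
    total = sum(effective.values())

    picked = []
    for key, members in strata.items():
        quota = min(len(members), round(n * effective[key] / total))
        rng.shuffle(members)
        members.sort(key=lambda t: not is_ccb(t))  # stable sort, CCB first
        picked.extend(members[:quota])
    return picked
```

Because the boost is applied to allocation sizes rather than to list membership, no later deduplication step can cancel it.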

5 files changed

Lines changed: 181 additions & 529 deletions

Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
+{
+  "metadata": {
+    "title": "Pass 2 local: 2 tasks needing 2nd pass",
+    "generated_date": "2026-03-02",
+    "total_tasks": 2
+  },
+  "methodology": {
+    "sdlc_suites": [
+      "ccb_debug"
+    ]
+  },
+  "statistics": {
+    "total_tasks": 2,
+    "per_suite": {
+      "ccb_debug": 2
+    }
+  },
+  "tasks": [
+    {
+      "task_id": "teleport-ssh-regression-prove-001",
+      "benchmark": "ccb_debug",
+      "task_dir": "ccb_debug/teleport-ssh-regression-prove-001",
+      "language": "go",
+      "difficulty": "hard",
+      "current_bl_runs": 1,
+      "current_mcp_runs": 3,
+      "current_paired": 1,
+      "runs_needed": 2,
+      "sdlc_phase": "debug",
+      "repo": "gravitational/teleport",
+      "mcp_benefit_score": 0.75
+    },
+    {
+      "task_id": "tutanota-search-regression-prove-001",
+      "benchmark": "ccb_debug",
+      "task_dir": "ccb_debug/tutanota-search-regression-prove-001",
+      "language": "typescript",
+      "difficulty": "hard",
+      "current_bl_runs": 3,
+      "current_mcp_runs": 1,
+      "current_paired": 1,
+      "runs_needed": 2,
+      "sdlc_phase": "debug",
+      "repo": "tutanota/tutanota",
+      "mcp_benefit_score": 0.75
+    }
+  ]
+}
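
The manifest above repeats its task count in three places (`metadata.total_tasks`, `statistics.total_tasks`, and the length of `tasks`), so a quick consistency check is easy to sketch. The helper name `check_manifest` is hypothetical, and the assumption that `per_suite` is keyed by each task's `benchmark` field is inferred from this one file rather than documented anywhere in the commit.

```python
import json
from collections import Counter

# Trimmed copy of the manifest from the diff; only the fields the check needs.
MANIFEST_TEXT = """
{
  "metadata": {"total_tasks": 2},
  "statistics": {"total_tasks": 2, "per_suite": {"ccb_debug": 2}},
  "tasks": [
    {"task_id": "teleport-ssh-regression-prove-001", "benchmark": "ccb_debug"},
    {"task_id": "tutanota-search-regression-prove-001", "benchmark": "ccb_debug"}
  ]
}
"""


def check_manifest(manifest: dict) -> list[str]:
    """Return a list of human-readable inconsistencies (empty if none)."""
    problems = []
    task_count = len(manifest["tasks"])
    if manifest["metadata"]["total_tasks"] != task_count:
        problems.append("metadata.total_tasks does not match len(tasks)")
    if manifest["statistics"]["total_tasks"] != task_count:
        problems.append("statistics.total_tasks does not match len(tasks)")
    # ASSUMPTION: per_suite counts tasks by their "benchmark" value.
    per_suite = Counter(task["benchmark"] for task in manifest["tasks"])
    if dict(per_suite) != manifest["statistics"]["per_suite"]:
        problems.append("statistics.per_suite does not match the task list")
    return problems


manifest = json.loads(MANIFEST_TEXT)
problems = check_manifest(manifest)
```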
