
Commit d9309fb

sjarmak and claude committed
feat: add calibration report with stratified sampling and bias analysis
Enhances validate_on_contextbench.py with:

- Stratified sampling by language and gold context complexity
- Comprehensive calibration report (error profile, TPR/FPR)
- Go/no-go threshold (file recall >= 0.60)
- Systematic gap analysis (missed file categories)
- Per-language and per-complexity bias breakdowns
- Domain gap warning for polyrepo tasks
- Paper-ready statement generation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
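
The stratified sampling and go/no-go items in the message above can be illustrated roughly as follows. This is a minimal sketch, not the code in validate_on_contextbench.py: the task dictionary keys (`language`, `complexity`, `predicted`, `gold`), the function names, and the use of a per-task mean for file recall are all assumptions made for illustration. Only the 0.60 threshold comes from the commit message.

```python
import random
from collections import defaultdict

# Taken from the commit message: go/no-go requires file recall >= 0.60.
GO_THRESHOLD = 0.60


def stratified_sample(tasks, per_stratum, seed=0):
    """Draw up to `per_stratum` tasks from each (language, complexity) stratum.

    Hypothetical helper: the key names are assumed, not read from the script.
    """
    strata = defaultdict(list)
    for task in tasks:
        strata[(task["language"], task["complexity"])].append(task)

    rng = random.Random(seed)  # seeded so the sample is reproducible
    sample = []
    for key in sorted(strata):
        group = strata[key]
        sample.extend(rng.sample(group, min(per_stratum, len(group))))
    return sample


def file_recall(predicted, gold):
    """Fraction of gold files that appear among the predicted files."""
    gold_set = set(gold)
    if not gold_set:
        return 0.0  # assumption: a task with no gold files scores zero
    return len(gold_set & set(predicted)) / len(gold_set)


def go_no_go(tasks):
    """Return (mean file recall, passes threshold).

    Assumes recall is averaged per task; the script may aggregate differently.
    """
    recalls = [file_recall(t["predicted"], t["gold"]) for t in tasks]
    mean_recall = sum(recalls) / len(recalls)
    return mean_recall, mean_recall >= GO_THRESHOLD


if __name__ == "__main__":
    demo = [
        {"language": "python", "complexity": "low",
         "predicted": ["a.py", "b.py"], "gold": ["a.py", "b.py"]},
        {"language": "go", "complexity": "high",
         "predicted": ["m.go"], "gold": ["m.go", "n.go"]},
    ]
    print(go_no_go(stratified_sample(demo, per_stratum=1)))
```

Sampling per stratum rather than uniformly keeps small language or complexity groups represented, which is what makes the per-language and per-complexity bias breakdowns meaningful.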
1 parent c85ae1b commit d9309fb


1 file changed: +392 -30 lines changed

