Commit 4126431
sjarmak and claude committed
feat: DOE-driven MCP-unique rebalance — 220 tasks (Neyman-optimal)
Neyman-optimal allocation redistributes the fixed 220-task budget toward
highest-variance suites for maximum statistical precision:
- onboarding 20→28, migration 20→26, security 20→24, crossrepo_tracing 20→22
- domain/incident unchanged at 20
- compliance/platform 20→18, crossorg/org 20→15, crossrepo 20→14

Actions: 20 low-IV tasks → backup, 7 promoted from backup, 13 scaffolded.
Oracle answers verified via Sourcegraph for all 13 new tasks.

Scripts: doe_select_tasks.py and doe_variance_analysis.py gain
--mcp-unique-only and --include-staging flags for MCP-unique analysis.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
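
For readers unfamiliar with the term, the sketch below illustrates textbook Neyman allocation: a fixed budget n is split across strata in proportion to N_h * S_h (stratum size times standard deviation). It is not the code in doe_select_tasks.py or doe_variance_analysis.py, which is not shown on this page. The function name, the equal-size default, the largest-remainder rounding, and the sample standard deviations are all assumptions made for illustration.

```python
from math import floor


def neyman_allocation(total, std_devs, sizes=None):
    """Split `total` units across strata in proportion to N_h * S_h.

    std_devs: dict of stratum name -> standard deviation (S_h).
    sizes:    dict of stratum name -> stratum size (N_h); if omitted,
              all strata are assumed to be the same size (an assumption).
    """
    if sizes is None:
        sizes = {name: 1 for name in std_devs}
    weights = {name: sizes[name] * std_devs[name] for name in std_devs}
    weight_sum = sum(weights.values())
    if weight_sum <= 0:
        raise ValueError("at least one stratum needs a positive weight")

    # Exact (fractional) share for each stratum.
    exact = {name: total * w / weight_sum for name, w in weights.items()}

    # Round down, then hand leftover units to the largest remainders
    # so the allocation still sums to `total`.
    alloc = {name: floor(share) for name, share in exact.items()}
    leftover = total - sum(alloc.values())
    by_remainder = sorted(exact, key=lambda n: exact[n] - alloc[n], reverse=True)
    for name in by_remainder[:leftover]:
        alloc[name] += 1
    return alloc


if __name__ == "__main__":
    # Made-up standard deviations, NOT the values measured for this commit.
    example = {"suite_a": 0.30, "suite_b": 0.20, "suite_c": 0.10}
    print(neyman_allocation(60, example))
```

A real allocation would likely also need per-suite minimums and a cap set by how many tasks exist in each suite. Neither constraint is modeled here.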
1 parent 4fd84b5 commit 4126431

436 files changed, +23675 -26 lines changed

benchmarks/ccb_mcp_compliance/ccx-compliance-057/environment/Dockerfile renamed to benchmarks/backups/ccb_mcp_compliance_doe_trim/ccx-compliance-057/environment/Dockerfile

File renamed without changes.

benchmarks/ccb_mcp_compliance/ccx-compliance-057/environment/Dockerfile.artifact_baseline renamed to benchmarks/backups/ccb_mcp_compliance_doe_trim/ccx-compliance-057/environment/Dockerfile.artifact_baseline

File renamed without changes.

benchmarks/ccb_mcp_compliance/ccx-compliance-057/environment/Dockerfile.artifact_only renamed to benchmarks/backups/ccb_mcp_compliance_doe_trim/ccx-compliance-057/environment/Dockerfile.artifact_only

File renamed without changes.

benchmarks/ccb_mcp_compliance/ccx-compliance-057/environment/Dockerfile.sg_only renamed to benchmarks/backups/ccb_mcp_compliance_doe_trim/ccx-compliance-057/environment/Dockerfile.sg_only

File renamed without changes.

benchmarks/ccb_mcp_compliance/ccx-compliance-057/instruction.md renamed to benchmarks/backups/ccb_mcp_compliance_doe_trim/ccx-compliance-057/instruction.md

benchmarks/ccb_mcp_compliance/ccx-compliance-057/task.toml renamed to benchmarks/backups/ccb_mcp_compliance_doe_trim/ccx-compliance-057/task.toml

File renamed without changes.

benchmarks/ccb_mcp_compliance/ccx-compliance-057/tests/eval.sh renamed to benchmarks/backups/ccb_mcp_compliance_doe_trim/ccx-compliance-057/tests/eval.sh

File renamed without changes.

benchmarks/ccb_mcp_compliance/ccx-compliance-057/tests/oracle_answer.json renamed to benchmarks/backups/ccb_mcp_compliance_doe_trim/ccx-compliance-057/tests/oracle_answer.json

File renamed without changes.

benchmarks/ccb_mcp_compliance/ccx-compliance-057/tests/oracle_checks.py renamed to benchmarks/backups/ccb_mcp_compliance_doe_trim/ccx-compliance-057/tests/oracle_checks.py

File renamed without changes.

benchmarks/ccb_mcp_compliance/ccx-compliance-057/tests/sgonly_verifier_wrapper.sh renamed to benchmarks/backups/ccb_mcp_compliance_doe_trim/ccx-compliance-057/tests/sgonly_verifier_wrapper.sh

File renamed without changes.
