Commit a0b0438

docs: update config references and remove claude command
1 parent 4475eb0 commit a0b0438

2 files changed: +12 -45 lines changed

.claude/commands/run-benchmark.md

Lines changed: 0 additions & 40 deletions
This file was deleted.

README.md

Lines changed: 12 additions & 5 deletions
@@ -57,7 +57,7 @@ bash configs/run_selected_tasks.sh --dry-run
 ### First places to read

 - `docs/START_HERE_BY_TASK.md` for task-oriented navigation
-- `docs/CONFIGS.md` for the 2-config evaluation matrix
+- `docs/reference/CONFIGS.md` for the 2-config evaluation matrix
 - `docs/EVALUATION_PIPELINE.md` for scoring and reporting outputs
 - `docs/REPO_HEALTH.md` for the pre-push health gate

@@ -103,14 +103,21 @@ See [docs/MCP_UNIQUE_TASKS.md](docs/MCP_UNIQUE_TASKS.md) for the full task syste

 ## 2-Config Evaluation Matrix

-All benchmarks are evaluated across two agent configurations that vary the external context tools available via MCP:
+All benchmarks are evaluated across two paper-level configurations (Baseline vs MCP-Full). The concrete run config names differ by task type:

-| Paper Config Name | `BASELINE_MCP_TYPE` | MCP Tools Available |
+- **SDLC suites** (`ccb_build`, `ccb_fix`, etc.): `baseline-local-direct` + `mcp-remote-direct`
+- **MCP-unique suites** (`ccb_mcp_*`): `baseline-local-artifact` + `mcp-remote-artifact`
+
+Legacy run directory names (`baseline`, `sourcegraph_full`, `artifact_full`) may still appear in historical outputs and are handled by analysis scripts.
+
+At the paper level, the distinction is still:
+
+| Paper Config Name | Internal MCP mode | MCP Tools Available |
 |-------------------|---------------------|---------------------|
 | Baseline | `none` | None (agent uses only built-in tools) |
-| MCP-Full | `sourcegraph_full` | All 13 Sourcegraph MCP tools including `sg_deepsearch`, `sg_deepsearch_read` |
+| MCP-Full | `sourcegraph_full` / `artifact_full` (task-dependent) | All 13 Sourcegraph MCP tools including `sg_deepsearch`, `sg_deepsearch_read` |

-See [docs/CONFIGS.md](docs/CONFIGS.md) for the full tool-by-tool breakdown.
+See [docs/reference/CONFIGS.md](docs/reference/CONFIGS.md) for the canonical configuration matrix and tool-by-tool breakdown. (`docs/CONFIGS.md` is a compatibility stub.)

 ---
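The naming scheme introduced by this README change can be sketched as a small lookup. This is an illustration only, not the repository's analysis code: the function names `run_configs_for_suite` and `paper_config` are hypothetical. The sketch also makes two assumptions. First, any suite that does not match `ccb_mcp_*` is treated as an SDLC suite. Second, the paper-level labels for the `baseline-*`, `mcp-remote-*`, and legacy `baseline` names are inferred from the names themselves; the README only ties `sourcegraph_full` and `artifact_full` to MCP-Full explicitly.

```python
from fnmatch import fnmatchcase

# Run config pairs per task type, as listed in the README diff.
SDLC_CONFIGS = ("baseline-local-direct", "mcp-remote-direct")
MCP_UNIQUE_CONFIGS = ("baseline-local-artifact", "mcp-remote-artifact")

# Assumption: paper-level labels are inferred from the config names;
# the README states only the sourcegraph_full / artifact_full rows.
PAPER_LEVEL = {
    "baseline-local-direct": "Baseline",
    "baseline-local-artifact": "Baseline",
    "mcp-remote-direct": "MCP-Full",
    "mcp-remote-artifact": "MCP-Full",
    # Legacy run directory names that may appear in historical outputs.
    "baseline": "Baseline",
    "sourcegraph_full": "MCP-Full",
    "artifact_full": "MCP-Full",
}


def run_configs_for_suite(suite: str) -> tuple[str, str]:
    """Return the (baseline, MCP) run config names for a suite (hypothetical helper)."""
    if fnmatchcase(suite, "ccb_mcp_*"):
        return MCP_UNIQUE_CONFIGS
    # Assumption: everything else is an SDLC suite such as ccb_build or ccb_fix.
    return SDLC_CONFIGS


def paper_config(run_dir_name: str) -> str:
    """Map a current or legacy run directory name to its paper-level config."""
    try:
        return PAPER_LEVEL[run_dir_name]
    except KeyError:
        raise ValueError(f"unknown run config name: {run_dir_name!r}") from None
```

For example, `run_configs_for_suite("ccb_build")` returns the `baseline-local-direct` / `mcp-remote-direct` pair, and `paper_config("artifact_full")` returns `"MCP-Full"`. Raising on unknown names, rather than guessing a default, keeps a misspelled run directory from being silently counted under the wrong configuration.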
