20 changes: 11 additions & 9 deletions docs/architecture.md

Large diffs are not rendered by default.

12 changes: 6 additions & 6 deletions docs/plans/agent-surface-delivery.md
@@ -10,11 +10,11 @@

## Quick resume

| Next action | Detail |
| -------------------- | --------------------------------------------------------------------------- |
| **Review / merge** | [#133](https://github.com/stainless-code/codemap/pull/133) — MCP `affected` |
| **Start next** | **PR 6** — MCP trace tools (`trace` / `explore` / `node`) |
| **Do not start yet** | PR 9 (eval harness) until PR 8 |
| Next action | Detail |
| -------------------- | ----------------------------------------------------------------------------------- |
| **Review / merge** | [#134](https://github.com/stainless-code/codemap/pull/134) — MCP trace tools (PR 6) |
| **Recently merged** | [#133](https://github.com/stainless-code/codemap/pull/133) — MCP `affected` |
| **Do not start yet** | PR 9 (eval harness) until PR 8 |

Update the table below when a PR merges or a new branch opens.

@@ -40,7 +40,7 @@ Max **3 parallel tracks** at once.
| **3** | [`index-lock-and-error-log`](./index-lock-and-error-log.md) → [`parse-worker-hardening`](./parse-worker-hardening.md) (stack) | merged | [#129](https://github.com/stainless-code/codemap/pull/129), [#130](https://github.com/stainless-code/codemap/pull/130) | 4, 5 |
| **4** | Recipe half of [`mcp-trace-explore-tools`](./mcp-trace-explore-tools.md) (`call-path`, `symbol-neighborhood` SQL + tests) | merged | [#131](https://github.com/stainless-code/codemap/pull/131) | 3, 5 |
| **5** | [`affected-tests-recipe`](./affected-tests-recipe.md) (+ Phase 2 MCP `affected` in [#133](https://github.com/stainless-code/codemap/pull/133)) | merged | [#132](https://github.com/stainless-code/codemap/pull/132), [#133](https://github.com/stainless-code/codemap/pull/133) | 3, 4 |
| **6** | MCP half of trace (`trace` / `explore` / `node` tools) + update instructions | planned | PR 1, PR 4 | — |
| **6** | MCP half of trace (`trace` / `explore` / `node` tools) + update instructions | open | [#134](https://github.com/stainless-code/codemap/pull/134) | PR 1, PR 4 |
| **7** | [`field-qualified-search`](./field-qualified-search.md) | planned | PR 1 | 4, 5 if `mcp-server.ts` untouched |
| **8** | [`agents-init-mcp-wiring`](./agents-init-mcp-wiring.md) | planned | PR 1 | 3–5 |
| **9** | [`agent-eval-harness`](./agent-eval-harness.md) | planned | PR 1, PR 8, allowlist | **last P1** |
43 changes: 17 additions & 26 deletions docs/plans/mcp-trace-explore-tools.md
@@ -1,10 +1,12 @@
# MCP trace & explore tools — plan

> **Status:** open · **Priority:** P1 · **Effort:** M (~2 weeks)
> **Status:** shipped · **Priority:** P1 · **Effort:** M (~2 weeks)
>
> **Motivator:** Agents often need call-path and multi-symbol survey answers in one round-trip. Codemap has `impact` (radius walk) and `snippet` but no shortest-path or budget-capped multi-file survey. MCP wrappers must not erode Moat A — every wrapper ships with a recipe twin.
>
> **Roadmap:** [§ Backlog — Agent surface & ops](./agent-surface-and-ops.md#p1) · related [call-path-type-hierarchy-recipes](./call-path-type-hierarchy-recipes.md)
>
> **Shipped:** recipes [#131](https://github.com/stainless-code/codemap/pull/131); MCP tools [#134](https://github.com/stainless-code/codemap/pull/134)

---

@@ -14,12 +16,12 @@
| --- | ---------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------ |
| L.1 | **Recipe twins required** before MCP tools: `call-path`, `symbol-neighborhood` (bundled SQL). | [Moat A](../roadmap.md#moats-load-bearing) |
| L.2 | MCP tools **`trace`**, **`explore`**, **`node`** are thin composers over recipes + `snippet` + existing engines — not opaque graph APIs. | Moat A |
| L.3 | **Output budgets** — cap total response chars (e.g. 15k); truncate with explicit `truncated: true` in JSON. | Agent context economics |
| L.4 | No NL task parsing — `trace` takes `from` / `to` symbol names; `explore` takes symbol list or recipe result rows. | [Floor — No LLM in the box](../roadmap.md#floors-v1-product-shape) |
| L.3 | **Output budgets** — cap snippet `source` chars (default 15k) + explore row cap (500); `truncated: true` with `truncation` detail. | Agent context economics |
| L.4 | No NL task parsing — `trace` takes `from` / `to` symbol names; `explore` takes symbol name list. | [Floor — No LLM in the box](../roadmap.md#floors-v1-product-shape) |
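The output-budget rule in L.3 can be pictured with a small helper. This is a minimal sketch, not the shipped code: the `Snippet` shape, the `applySnippetBudget` name, and the choice to keep a partial final snippet are assumptions for illustration. Only the 15k default and the `truncated` / `truncation` fields come from the plan and the tests in this PR.

```typescript
// Hypothetical snippet shape; the real handler types may differ.
interface Snippet {
  file: string;
  source: string;
}

interface BudgetResult {
  snippets: Snippet[];
  truncated: boolean;
  truncation?: { snippets: boolean };
}

// Default taken from L.3 of the plan (15k chars of snippet `source`).
const DEFAULT_BUDGET_CHARS = 15_000;

// Keep snippets in order until the cumulative `source` length would exceed
// the budget; the snippet that crosses the limit is cut to the remaining room.
function applySnippetBudget(
  snippets: Snippet[],
  budgetChars: number = DEFAULT_BUDGET_CHARS,
): BudgetResult {
  const kept: Snippet[] = [];
  let remaining = Math.max(0, budgetChars);
  let cut = false;

  for (const s of snippets) {
    if (s.source.length <= remaining) {
      kept.push(s);
      remaining -= s.source.length;
    } else {
      // Assumption: a partial snippet is kept when at least one char fits.
      if (remaining > 0) {
        kept.push({ file: s.file, source: s.source.slice(0, remaining) });
      }
      remaining = 0;
      cut = true;
    }
  }

  return cut
    ? { snippets: kept, truncated: true, truncation: { snippets: true } }
    : { snippets: kept, truncated: false };
}
```

With `budgetChars` set to 1, as in the `budget_chars: 1` test case, any non-trivial snippet list comes back with `truncated: true` and `truncation.snippets: true`.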

---

## Recipe specs (ship first)
## Recipe specs (shipped #131)

### `call-path`

@@ -35,38 +37,27 @@

---

## MCP tool specs (ship second)

| Tool | Composes |
| --------- | ---------------------------------------------------------------------------- |
| `trace` | `query_recipe call-path` + `snippet` for each hop |
| `explore` | `query_recipe symbol-neighborhood` (multi-name) + `snippet` with char budget |
| `node` | `show` + one-hop `symbol-neighborhood` + optional inline snippets |

Register in `mcp-server.ts`; document chains in [mcp-server-instructions](./mcp-server-instructions.md).

---
## MCP tool specs (shipped #134)

## Implementation steps
| Tool | Composes |
| --------- | --------------------------------------------------------------------------------- |
| `trace` | `query_recipe call-path` + cross-file `snippet` for hop symbols |
| `explore` | `query_recipe symbol-neighborhood` (deduped multi-name) + snippet budget |
| `node` | `show` + scoped one-hop `symbol-neighborhood` + optional center+neighbor snippets |
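As a rough illustration of the `explore` row above (deduped multi-name input plus a row cap), the following sketch merges per-name neighborhoods. The `NeighborRow` fields, the `neighborhood` callback, and the dedup key are hypothetical stand-ins for the `symbol-neighborhood` recipe output; only the `names` / `rows` response keys and the 500-row cap are taken from this PR.

```typescript
// Hypothetical row shape standing in for `symbol-neighborhood` recipe rows.
interface NeighborRow {
  name: string;
  file_path: string;
  relation: string;
}

// Hypothetical callback: runs the neighborhood recipe for one symbol name.
type NeighborhoodFn = (name: string) => NeighborRow[];

// Row cap named in L.3 of the plan.
const EXPLORE_ROW_CAP = 500;

function exploreSketch(
  names: string[],
  neighborhood: NeighborhoodFn,
  rowCap: number = EXPLORE_ROW_CAP,
): { names: string[]; rows: NeighborRow[]; truncated: boolean } {
  // Dedupe input names while preserving first-seen order.
  const unique = Array.from(new Set(names));
  const seen = new Set<string>();
  const rows: NeighborRow[] = [];
  let truncated = false;

  for (const name of unique) {
    if (truncated) break;
    for (const row of neighborhood(name)) {
      // Assumption: rows are considered duplicates when all three fields match.
      const key = `${row.relation}|${row.file_path}|${row.name}`;
      if (seen.has(key)) continue;
      if (rows.length >= rowCap) {
        truncated = true;
        break;
      }
      seen.add(key);
      rows.push(row);
    }
  }

  return { names: unique, rows, truncated };
}
```

Passing the recipe runner as a callback keeps the composer thin, in the spirit of L.2: the graph logic stays in the recipe and the tool only merges and caps.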

1. Add `templates/recipes/call-path.sql` + `.md` frontmatter
2. Add `templates/recipes/symbol-neighborhood.sql` + `.md`
3. Golden-query tests for both recipes
4. Implement MCP handlers in `tool-handlers.ts` (or dedicated module)
5. Output budget helper shared by explore/trace
6. Update agent-content skill with SQL equivalents
Register in `mcp-server.ts` + `http-server.ts`; document chains in [mcp-instructions](../templates/agent-content/mcp-instructions.md).
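For readers calling the tool over HTTP, the tests in this PR post JSON arguments to the server's tool route and expect a 400 on invalid input. The sketch below mirrors that contract under stated assumptions: the `/tool/trace` path is inferred from the test suite name (`POST /tool/{other tools}`), the `checkTraceArgs` helper is a hand-rolled stand-in for the Zod schema (not the real `traceArgsSchema`), and the base URL is a placeholder.

```typescript
// Argument names taken from the tests: from, to, max_depth, budget_chars.
interface TraceArgs {
  from: string;
  to: string;
  max_depth?: number;
  budget_chars?: number;
}

// Hand-rolled approximation of what the tests show the schema rejecting:
// missing `from` / `to`, and a non-integer `max_depth`.
function checkTraceArgs(args: Partial<TraceArgs>): string[] {
  const problems: string[] = [];
  if (typeof args.from !== "string") problems.push("from: expected string");
  if (typeof args.to !== "string") problems.push("to: expected string");
  if (args.max_depth !== undefined && !Number.isInteger(args.max_depth)) {
    problems.push("max_depth: expected integer");
  }
  return problems;
}

// Build the request without sending it; the route is an assumption.
function buildTraceRequest(
  baseUrl: string,
  args: TraceArgs,
): { url: string; init: { method: string; headers: Record<string, string>; body: string } } {
  return {
    url: `${baseUrl}/tool/trace`,
    init: {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify(args),
    },
  };
}
```

Against a running server, `fetch(req.url, req.init)` with the result of `buildTraceRequest` should yield a JSON body with `path`, `snippets`, and `truncated`, matching the fields asserted in `http-server.test.ts`.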

---

## Acceptance

- [ ] `codemap query --recipe call-path --params from=foo,to=bar` works
- [ ] MCP `trace` returns same path + snippets, respects budget
- [ ] Instructions document recipe-first fallback
- [x] `codemap query --recipe call-path --params from=foo,to=bar` works ([#131](https://github.com/stainless-code/codemap/pull/131))
- [x] MCP/HTTP `trace` returns path + snippets, respects budget ([#134](https://github.com/stainless-code/codemap/pull/134))
- [x] Instructions document recipe-first fallback ([#134](https://github.com/stainless-code/codemap/pull/134))

---

## Dependencies

- [mcp-server-instructions](./mcp-server-instructions.md) should land first or in same PR
- [mcp-server-instructions](./mcp-server-instructions.md) — landed [#126](https://github.com/stainless-code/codemap/pull/126)
- [call-path-type-hierarchy-recipes](./call-path-type-hierarchy-recipes.md) may extend CTE patterns later
2 changes: 1 addition & 1 deletion docs/roadmap.md
@@ -65,7 +65,7 @@ Prioritized agent & indexing ops queue (2026-05). Index: [`plans/agent-surface-a

**P1 — medium**

- [ ] **MCP trace / explore / node** — recipe twins + thin MCP composers. Plan: [`plans/mcp-trace-explore-tools.md`](./plans/mcp-trace-explore-tools.md). Effort: M.
- [x] **MCP trace / explore / node** — recipe twins + thin MCP composers. Plan: [`plans/mcp-trace-explore-tools.md`](./plans/mcp-trace-explore-tools.md). [#134](https://github.com/stainless-code/codemap/pull/134). Effort: M.
- [ ] **Agents init MCP wiring** — `agents init --mcp` + permissions. Plan: [`plans/agents-init-mcp-wiring.md`](./plans/agents-init-mcp-wiring.md). Effort: M.
- [x] **Affected tests recipe** — dep-graph test selection + stdin + MCP `affected` tool. Plan: [`plans/affected-tests-recipe.md`](./plans/affected-tests-recipe.md). Shipped #132 + #133.
- [ ] **Index lock + error log** — cross-process lock, `unlock`, `errors.log`. Plan: [`plans/index-lock-and-error-log.md`](./plans/index-lock-and-error-log.md). Effort: M.
159 changes: 159 additions & 0 deletions src/application/http-server.test.ts
@@ -124,6 +124,7 @@ describe("http-server — health + tools catalog", () => {
    expect(body.tools.map((t) => t.name)).toContain("query");
    expect(body.tools.map((t) => t.name)).toContain("audit");
    expect(body.tools.map((t) => t.name)).toContain("affected");
    expect(body.tools.map((t) => t.name)).toContain("trace");
  });

it("404 for unknown route", async () => {
@@ -405,6 +406,141 @@ describe("http-server — POST /tool/{other tools}", () => {
    expect(r.json.error).not.toContain("--changed-since");
  });

  function seedTraceGraph() {
    writeFileSync(
      join(benchDir, "src", "trace.ts"),
      "export function alpha() {\n return beta();\n}\nexport function beta() {\n return 1;\n}\n",
    );
    const db = openDb();
    try {
      db.run(
        `INSERT INTO files (path, content_hash, size, line_count, language, last_modified, indexed_at)
         VALUES ('src/trace.ts', 'ht', 100, 6, 'typescript', 1, 1)`,
      );
      db.run(
        `INSERT INTO symbols (name, kind, file_path, line_start, line_end, signature, is_exported, parent_name, visibility)
         VALUES ('alpha', 'function', 'src/trace.ts', 1, 3, 'alpha()', 1, NULL, 'export'),
                ('beta', 'function', 'src/trace.ts', 4, 6, 'beta()', 1, NULL, 'export')`,
      );
      db.run(
        `INSERT INTO calls (file_path, caller_name, caller_scope, callee_name, line_start, column_start, column_end)
         VALUES ('src/trace.ts', 'alpha', 'alpha', 'beta', 2, 0, 0)`,
      );
    } finally {
      closeDb(db);
    }
  }

  it("trace returns path and snippets", async () => {
    seedTraceGraph();
    serverHandle = await startServer();
    const r = await postTool(serverHandle.port, "trace", {
      from: "alpha",
      to: "beta",
    });
    expect(r.status).toBe(200);
    expect(r.json.path).toHaveLength(1);
    expect(r.json.snippets.length).toBeGreaterThan(0);
    expect(r.json.truncated).toBe(false);
  });

  it("explore merges neighborhoods", async () => {
    seedTraceGraph();
    serverHandle = await startServer();
    const r = await postTool(serverHandle.port, "explore", {
      names: ["alpha", "beta"],
    });
    expect(r.status).toBe(200);
    expect(r.json.names).toEqual(["alpha", "beta"]);
    expect(r.json.rows.length).toBeGreaterThan(0);
  });

  it("node returns center + neighborhood", async () => {
    seedTraceGraph();
    serverHandle = await startServer();
    const r = await postTool(serverHandle.port, "node", {
      name: "alpha",
      include_snippets: true,
    });
    expect(r.status).toBe(200);
    expect(r.json.center.matches[0]?.name).toBe("alpha");
    expect(
      r.json.neighborhood.some((row: { name: string }) => row.name === "beta"),
    ).toBe(true);
  });

  it("trace with non-integer max_depth → 400 (Zod rejects)", async () => {
    serverHandle = await startServer();
    const r = await postTool(serverHandle.port, "trace", {
      from: "a",
      to: "b",
      max_depth: 1.5,
    });
    expect(r.status).toBe(400);
    expect(r.json.error).toContain('"trace"');
  });

  it("explore with empty names → 400 (Zod rejects)", async () => {
    serverHandle = await startServer();
    const r = await postTool(serverHandle.port, "explore", { names: [] });
    expect(r.status).toBe(400);
    expect(r.json.error).toContain('"explore"');
  });

  it("trace sets truncated when budget_chars is tiny", async () => {
    seedTraceGraph();
    serverHandle = await startServer();
    const r = await postTool(serverHandle.port, "trace", {
      from: "alpha",
      to: "beta",
      budget_chars: 1,
    });
    expect(r.status).toBe(200);
    expect(r.json.truncated).toBe(true);
    expect(r.json.truncation?.snippets).toBe(true);
  });

  it("records recipe recency after trace", async () => {
    seedTraceGraph();
    serverHandle = await startServer();
    const r = await postTool(serverHandle.port, "trace", {
      from: "alpha",
      to: "beta",
    });
    expect(r.status).toBe(200);
    const db = openDb();
    try {
      const row = db
        .query<{ run_count: number }>(
          "SELECT run_count FROM recipe_recency WHERE recipe_id = 'call-path'",
        )
        .get();
      expect(row?.run_count).toBeGreaterThanOrEqual(1);
    } finally {
      closeDb(db);
    }
  });

  it("records recipe recency after explore", async () => {
    seedTraceGraph();
    serverHandle = await startServer();
    const r = await postTool(serverHandle.port, "explore", {
      names: ["alpha"],
    });
    expect(r.status).toBe(200);
    const db = openDb();
    try {
      const row = db
        .query<{ run_count: number }>(
          "SELECT run_count FROM recipe_recency WHERE recipe_id = 'symbol-neighborhood'",
        )
        .get();
      expect(row?.run_count).toBeGreaterThanOrEqual(1);
    } finally {
      closeDb(db);
    }
  });

  it("list_baselines returns array (empty when none saved)", async () => {
    serverHandle = await startServer();
    const r = await postTool(serverHandle.port, "list_baselines", {});
@@ -582,6 +718,29 @@ describe("http-server — Zod input validation at HTTP boundary", () => {
    });
    expect(r.status).toBe(400);
  });

  it("trace without from → 400 with structured error", async () => {
    serverHandle = await startServer();
    const r = await postTool(serverHandle.port, "trace", { to: "bar" });
    expect(r.status).toBe(400);
    expect(r.json.error).toContain('"trace"');
    expect(r.json.error).toContain("from");
  });

  it("trace without to → 400 with structured error", async () => {
    serverHandle = await startServer();
    const r = await postTool(serverHandle.port, "trace", { from: "foo" });
    expect(r.status).toBe(400);
    expect(r.json.error).toContain('"trace"');
    expect(r.json.error).toContain("to");
  });

  it("node with name=number → 400 (not deep handler crash)", async () => {
    serverHandle = await startServer();
    const r = await postTool(serverHandle.port, "node", { name: 1 });
    expect(r.status).toBe(400);
    expect(r.json.error).toContain("name");
  });
});

describe("http-server — GET /resources", () => {
27 changes: 27 additions & 0 deletions src/application/http-server.ts
@@ -19,12 +19,16 @@ import {
  auditArgsSchema,
  contextArgsSchema,
  dropBaselineArgsSchema,
  exploreArgsSchema,
  handleApply,
  handleAudit,
  handleAffected,
  handleContext,
  handleDropBaseline,
  handleExplore,
  handleImpact,
  handleNode,
  handleTrace,
  handleListBaselines,
  handleQuery,
  handleQueryBatch,
@@ -34,6 +38,8 @@ import {
  handleSnippet,
  handleValidate,
  impactArgsSchema,
  nodeArgsSchema,
  traceArgsSchema,
  listBaselinesArgsSchema,
  queryArgsSchema,
  queryBatchArgsSchema,
@@ -94,6 +100,9 @@ const TOOL_NAMES = [
  "snippet",
  "impact",
  "affected",
  "trace",
  "explore",
  "node",
  "apply",
  "save_baseline",
  "list_baselines",
@@ -489,6 +498,24 @@ async function dispatchTool(
      result = handleAffected(r.value, opts.root);
      break;
    }
    case "trace": {
      const r = validate(traceArgsSchema, args, "trace");
      if (!r.ok) return writeJson(res, 400, { error: r.error }, opts.version);
      result = handleTrace(r.value, opts.root);
      break;
    }
    case "explore": {
      const r = validate(exploreArgsSchema, args, "explore");
      if (!r.ok) return writeJson(res, 400, { error: r.error }, opts.version);
      result = handleExplore(r.value, opts.root);
      break;
    }
    case "node": {
      const r = validate(nodeArgsSchema, args, "node");
      if (!r.ok) return writeJson(res, 400, { error: r.error }, opts.version);
      result = handleNode(r.value, opts.root);
      break;
    }
    case "apply": {
      const r = validate(applyArgsSchema, args, "apply");
      if (!r.ok) return writeJson(res, 400, { error: r.error }, opts.version);