Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
244 changes: 244 additions & 0 deletions docs/plans/agent-transports.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,244 @@
## Plan — `agent-transports`

> Expose codemap's structural-query surface to agents over a wire protocol. **v1 ships an MCP server**; HTTP API (`codemap serve`) is the v1.x slice. Both wrap the same logical operations (`query` / `audit` / `recipe` / `context` / `validate` / `baseline` ops); the plan settles the surface once so each transport inherits the same shape.
>
> Adopted from [`docs/roadmap.md` § Backlog](../roadmap.md#backlog) ("MCP server wrapping `query` …" + "HTTP API …"). Builds on every CLI primitive shipped to date — Tier A flags (PR #26), B.6 baselines (PR #30), B.7 visibility (PR #28), B.5 v1 audit (PR #33).

**Status:** Open — design pass; not yet implemented.
**Cross-refs:** [`docs/roadmap.md` § Non-goals](../roadmap.md#non-goals-v1) (no persistent daemon — HTTP API has to negotiate this), [`docs/architecture.md` § CLI usage](../architecture.md#cli-usage) (each MCP / HTTP tool is a thin wrapper), [`.agents/lessons.md`](../../.agents/lessons.md) (changesets policy: pre-v1 patch unless schema-breaks).

---

## 1. Goal

Agents (Claude Code, Cursor, Codex, generic MCP / HTTP clients) call codemap's structural-query surface **without a Bash round-trip**. Today every agent invocation looks like:

```bash
codemap query --json "SELECT name, file_path FROM symbols WHERE name = 'X'"
```
Comment thread
coderabbitai[bot] marked this conversation as resolved.

After v1:

```jsonc
// MCP tool call (stdio + JSON-RPC)
{
"name": "query",
"arguments": {
"sql": "SELECT name, file_path FROM symbols WHERE name = 'X'",
},
}
```

The wins:

- **Tokens.** No bash framing, no shell quoting, no stdout parsing.
- **Latency.** No process spawn per call (MCP server is already running for the session).
- **Discoverability.** Tools self-describe via JSON Schema; agents don't have to read `--help`.
- **Composition.** Agents call `audit` directly; Codemap's existing CLI surface stays the source of truth.

## 2. Why MCP first, HTTP API as v1.x

| Axis | MCP server | HTTP API (`codemap serve`) |
| ----------------------- | ----------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------- |
| **Real demand today** | High — we use it ourselves through Cursor / Claude Code; MCP has consumer momentum. | Speculative — no concrete user has asked. |
| **Protocol surface** | Settled (MCP spec; `@modelcontextprotocol/sdk`); JSON-RPC 2.0 over stdio. | Codemap-shaped (REST routes, status codes, JSON shapes — design from scratch). |
| **Process model** | Stdio-spawned per session by the agent host. Stays one-shot-per-session; no daemon. | Wants a long-lived process — tangles with the [persistent-daemon non-goal](../roadmap.md#non-goals-v1). |
| **Auth / security** | Trusted parent process (the agent host). Trivial. | Loopback / token / CORS — meaningful design. |
| **Implementation cost** | Lower — SDK does the JSON-RPC; just register tools. | Higher — pick framework, design routes, handle binding. |

**v1 ships MCP only.** HTTP API stays in `roadmap.md § Backlog` until a concrete consumer asks; this plan reserves the design points (tool taxonomy, output shape, audit composition) so HTTP can inherit them when its time comes.

## 3. Tool taxonomy

**Decision: one MCP tool per CLI top-level operation.** Names mirror the CLI verb; `inputSchema` mirrors the CLI flag set.

| MCP tool | Wraps | Notes |
| ----------------------------------------------------------------- | ----------------------------------------------- | -------------------------------------------------------------------------------------------- |
| `query` | `codemap query --json "<SQL>"` | Pure SQL execution; identical error-shape semantics. |
| `query_recipe` | `codemap query --json --recipe <id>` | Separate tool (vs `query` + recipe param) so agents see the recipe surface in tool listings. |
| `audit` | `codemap audit --json --baseline <prefix> ...` | Composes per-delta baseline mapping (see § 6). |
| `save_baseline` / `baseline` / `list_baselines` / `drop_baseline` | `codemap query --save-baseline=<name> ...` etc. | Each is a distinct verb the agent should pick deliberately. |
| `context` | `codemap context --json` | Read-only; same JSON envelope. |
| `validate` | `codemap validate --json [<paths>]` | Optional `paths` array. |

**Rejected alternatives:**

- **One mega-`cli` tool with `command` + `args`.** Loses self-description; agents have to know each subcommand's flag set.
- **Grouped tools (`structural` / `baselines` / `audit`).** Too coarse — agents pick a tool by verb, not by category.

**Not exposed in v1:**

- `index` (re-index from agent). Risk: agent triggers a multi-second reindex in a tight loop. Defer until a concrete use case ("agent edited files, wants to re-query") emerges; today the codemap rule's auto-incremental-index discipline (and audit's prelude) handle the freshness story.
- `agents init` (developer-side; not an agent-runtime operation).
- `version` / `--help` (MCP exposes its own discovery; CLI help is for humans).

## 4. Output shape uniformity

**Decision: every tool returns the same shape its CLI counterpart already returns under `--json`.** No re-mapping.

- Success → the CLI's JSON output verbatim (`[...rows]` for `query`, `{base, head, deltas}` for `audit`, `{count}` / `{group_by, groups}` for `--summary`-flavoured ops, etc.).
- Error → MCP error response with the same `{"error": "..."}` body the CLI emits today, surfaced as the JSON-RPC error's `data` field.

**Why no re-mapping:**

1. The CLI's `--json` output is already the canonical surface; bundled `templates/agents/skills/codemap/SKILL.md` documents it. Re-shaping in MCP would create two surfaces to maintain.
2. Consumers reading docs / running `codemap query` directly see the same shape they get over MCP.
3. Future schema additions to the CLI envelope (e.g. `audit` v1.x verdict field) propagate to MCP automatically.

The MCP `server.tool(...)` registration just calls the existing CLI entry-point function (`runQueryCmd` / `runAuditCmd` / etc.) with stdout captured into the response body.

## 5. Per-tool surface (`inputSchema`)

JSON Schema for each tool mirrors the CLI flag set. Sketch for the three v1 keystones:

```jsonc
// query
{
"name": "query",
"description": "Run read-only SQL against .codemap.db. Returns row array.",
"inputSchema": {
"type": "object",
"properties": {
"sql": {"type": "string", "description": "Read-only SELECT."},
"summary": {"type": "boolean", "default": false, "description": "Return {count: N} instead of rows."},
"changed_since": {"type": "string", "description": "Filter rows to files changed since <ref>."},
"group_by": {"type": "string", "enum": ["owner", "directory", "package"]}
},
"required": ["sql"]
}
}

// query_recipe
{
"name": "query_recipe",
"description": "Run a bundled SQL recipe by id. Recipes carry per-row `actions` hints.",
"inputSchema": {
"type": "object",
"properties": {
"recipe": {"type": "string", "description": "Recipe id (call list_recipes for catalog)."},
"summary": {"type": "boolean", "default": false},
"changed_since": {"type": "string"},
"group_by": {"type": "string", "enum": ["owner", "directory", "package"]}
},
"required": ["recipe"]
}
}

// audit
{
"name": "audit",
"description": "Structural-drift audit. Composes per-delta baselines into {head, deltas}.",
"inputSchema": {
"type": "object",
"properties": {
"baseline_prefix": {"type": "string", "description": "Auto-resolves <prefix>-{files,dependencies,deprecated}."},
"baselines": {
"type": "object",
"description": "Explicit per-delta override. Keys: files | dependencies | deprecated.",
"properties": {
"files": {"type": "string"},
"dependencies": {"type": "string"},
"deprecated": {"type": "string"}
}
},
"summary": {"type": "boolean", "default": false},
"no_index": {"type": "boolean", "default": false}
}
}
}
```

CLI flag → JSON property: kebab-case → snake_case (idiomatic JSON Schema; matches MCP spec examples).

## 6. Audit composition over MCP

The CLI's `--<delta>-baseline <name>` flags become a single structured `baselines: {[deltaKey]: name}` argument — same data shape as the engine's `AuditBaselineMap`. `--baseline <prefix>` stays a top-level `baseline_prefix` argument. The `resolveAuditBaselines` helper already exposed for tests in PR #33 is the layer the MCP wrapper calls — no logic duplication.

## 7. Resources

MCP resources are addressable read-only data the host can fetch ahead of tool calls. Codemap exposes:

| URI | Body | Purpose |
| ------------------------ | ------------------------------------------------------------------- | ----------------------------------------------------------------------- |
| `codemap://recipes` | JSON array (same as `--recipes-json`) | Catalog discovery. |
| `codemap://recipes/{id}` | `{id, description, sql, actions?}` | Single-recipe inspection (replaces `--print-sql <id>`). |
| `codemap://schema` | DDL / column descriptions (lifted from `architecture.md § Schema`) | Tells the agent what tables exist. |
| `codemap://skill` | Full text of the bundled `templates/agents/skills/codemap/SKILL.md` | Agents that don't preload the bundled skill can `read_resource` for it. |

Resources don't take input — they're constant-per-server-instance data. The server caches them at startup.

## 8. Composition with existing CLI

| CLI flag | MCP equivalent |
| ---------------------------------------- | ------------------------------------------------------------------------------------------------- |
| `--json` | Always-on (MCP responses are structured). The CLI's terminal-mode renderer is dead code over MCP. |
| `--summary` | `summary: true` in tool args. |
| `--changed-since <ref>` | `changed_since: "<ref>"` in tool args. |
| `--group-by <mode>` | `group_by: "<mode>"` in tool args. |
| `--baseline <prefix>` (audit) | `baseline_prefix: "<prefix>"`. |
| `--<delta>-baseline <name>` (audit) | `baselines: {<deltaKey>: "<name>"}`. |
| `--no-index` (audit) | `no_index: true`. |
| `--recipe <id>` | `query_recipe` is a separate tool; no overload. |
| `--print-sql <id>` | `codemap://recipes/{id}` resource. |
| `--recipes-json` | `codemap://recipes` resource. |
| `--baselines` / `--drop-baseline <name>` | `list_baselines` / `drop_baseline` tools. |
| `--save-baseline[=<name>]` | `save_baseline` tool with `name` + (`recipe` \| `sql`) inputs. |

Comment thread
coderabbitai[bot] marked this conversation as resolved.
## 9. CLI surface

```text
# v1 (ships first):
codemap mcp [--root <dir>] [--config <file>] # spawned by the agent host over stdio

# v1.x (deferred until a concrete consumer asks):
codemap serve [--port 0] [--host 127.0.0.1] [--token <t>] [--root <dir>] [--config <file>]
```

- `codemap mcp` — the only new CLI verb in v1. Reads JSON-RPC on stdin, writes on stdout. Logs to stderr (per MCP convention).
- All existing CLI commands continue to work unchanged.
- Exit codes: `0` on clean shutdown (stdin EOF), `1` on bootstrap / DB / config errors.
- No new flags on existing commands.

## 10. Implementation deps

- **`@modelcontextprotocol/sdk`** (TypeScript) — official SDK; handles JSON-RPC framing, schema validation, transport. Single new dependency.
- Reuses every CLI entry-point function from `src/cli/cmd-*.ts` (no new business logic).
- New file: `src/cli/cmd-mcp.ts` (CLI dispatch — argv parse + spawn server) + `src/application/mcp-server.ts` (engine — tool registry, resource handlers, response composition). Mirrors the `cmd-audit.ts` ↔ `audit-engine.ts` seam.

## 11. Tracer-bullet sequence

Per [`tracer-bullets`](../../.agents/rules/tracer-bullets.md) and the codemap-audit precedent (~6 commits ship-end-to-end):

1. **CLI scaffold** — `cmd-mcp.ts` + `mcp-server.ts` skeletons. `codemap mcp --help` works; `runMcpCmd` boots `@modelcontextprotocol/sdk` server with one stub tool (e.g. `version` returning `{version: "..."}`). Smoke-test via `npx @modelcontextprotocol/inspector`. Commit.
2. **First tool — `query`** — wires the server to the existing `runQueryCmd` / `printQueryResult` logic; captures stdout into the MCP response body. Tests via SDK in-process. Commit.
3. **`query_recipe`** — separate tool surfacing the recipe catalog. Composes `--summary` / `--changed-since` / `--group-by` via JSON args. Commit.
4. **`audit`** — wraps `runAuditCmd` / `runAudit`; the `baselines` arg becomes the `AuditBaselineMap` directly via `resolveAuditBaselines`. Auto-incremental-index prelude stays. Commit.
5. **Baseline tools** — `save_baseline` / `list_baselines` / `drop_baseline` round-trip via existing helpers. Commit.
6. **Resources** — `codemap://recipes`, `codemap://recipes/{id}`, `codemap://schema`, `codemap://skill`. Commit.
7. **Docs + agents update** — `architecture.md § MCP wiring` paragraph, glossary entry, README CLI block, rule + skill across `.agents/` and `templates/agents/` (Rule 10), patch changeset. Commit.

Estimated total: ~1 day across ~7 commits.

## 12. Open questions (worth a `grill-me` round before code)

- **`context` and `validate` as MCP tools?** Both are CLI commands today. `context` is agent-shaped (returns the existing JSON envelope). `validate` is more dev-shaped (CI gate for stale indices). Worth surfacing both? Just `context`?
- **Should `query` accept multi-statement SQL?** Today's CLI rejects it (one statement per call). MCP could batch — but that's a real semantic shift.
- **Resource caching strategy.** Recipes are constant per server boot. Schema is constant per `SCHEMA_VERSION`. Skill text is constant per package version. Cache once at startup vs per-`read_resource` call?
- **Tool naming convention.** snake_case (matches MCP spec examples) vs kebab-case (matches CLI flags). Picked snake_case in §5; reconsider.
- **`save_baseline` argument shape.** Two sub-shapes: `{name, sql}` and `{name, recipe}`. One tool with optional fields, or two tools (`save_baseline_sql` / `save_baseline_recipe`)?

## 13. Non-goals (v1)

- **HTTP API.** Stays on `roadmap.md § Backlog`. Plan settles the tool taxonomy + output shape so HTTP inherits them.
- **Daemon mode for HTTP.** Even when HTTP ships, codemap stays one-shot per request unless a benchmark proves the spawn cost matters.
- **Tool-level auth on MCP.** The agent host is the trust boundary. If you don't trust your agent host, don't enable codemap.
- **Re-indexing from MCP.** Auto-incremental-index in audit handles the staleness story for now; explicit `index` tool deferred until concrete demand.
- **Streaming responses.** Codemap query results are point-in-time row sets; no streaming use case yet.

## 14. References

- Motivation: [`docs/roadmap.md` § Backlog](../roadmap.md#backlog) (MCP + HTTP API entries).
- MCP spec: <https://modelcontextprotocol.io>.
- Wraps every CLI primitive shipped in PRs #26 / #28 / #30 / #33.
- Audit baseline composition: see [`docs/architecture.md` § Audit wiring](../architecture.md#cli-usage) — same `AuditBaselineMap` shape over MCP.
- Doc lifecycle: this file follows the **Plan** type per [`docs/README.md` § Document Lifecycle](../README.md#document-lifecycle) — **delete on ship**, lift the canonical bits into `architecture.md` per Rule 2.
Loading
Loading