Commit 119db38
feat(mcp): codemap mcp — agent-transports v1 (Tracer 1 of 7) (#35)
* feat(mcp): scaffold codemap mcp + ping stub (Tracer 1 of 7)
First slice of docs/plans/agent-transports.md. Lands the SDK wiring
end-to-end: argv parser, help text, dispatch from main.ts, the
McpServer factory in src/application/mcp-server.ts, and a `ping`
stub tool that confirms stdio + JSON-RPC framing without depending
on any codemap engine.
Subsequent tracers replace the stub with real tools:
- 2: query (wraps runQueryCmd, tested via SDK in-process transport)
- 3: query_recipe (separate tool surface for recipe catalog)
- 4: audit (composes baselines via resolveAuditBaselines)
- 5: save_baseline / list_baselines / drop_baseline
- 6: codemap://recipes / recipes/{id} / schema / skill resources
- 7: docs (architecture.md MCP wiring, glossary, README, agent
rule + skill across .agents/ and templates/agents/ per Rule 10)
Layering follows the cmd-audit.ts <-> audit-engine.ts seam from
PR #33 — cmd-mcp.ts owns argv + lifecycle, mcp-server.ts owns the
tool registry and SDK calls.
Open questions in plan § 12 still pending grill round before tracer 2;
nothing in this scaffold pre-commits any of them.
* docs(plans): settle Q1+Q2 of agent-transports grill round
Q1 — context AND validate as MCP tools (both ship in tracer 4
alongside audit). validate is a thin wrapper that costs ~10 LOC
and unlocks "codemap doctor" agents; defer-by-default would only
save the wrapping cost.
Q2 — query (one-statement, CLI parity) PLUS query_batch (MCP-only).
query_batch uses batch-wide-defaults + per-statement-overrides:
statements: (string | {sql, summary?, changed_since?, group_by?})[]
String items inherit batch-wide flags; object items override on a
per-key basis. Output is an array of N elements, each shaped
exactly like single-query's output for the effective flag set.
SQL-only (no recipe polymorphism — query_recipe_batch is an
additive future change if asked).
Rejected: making query accept ;-delimited batches (would need a
real SQL tokenizer and would diverge query's output shape from
its CLI counterpart — violates plan § 4 uniformity).
Plan §§ 3, 5, 8, 11, 12 updated. Q3-Q5 still open.
* docs(plans): settle Q3 — lazy resource caching
Memoize on first read_resource call. All 4 resources are constant per server-process lifetime, so eager and lazy produce identical observable behavior — lazy just keeps boot lean for sessions that never read resources.
* docs(plans): settle Q4 — snake_case for MCP tool names + input keys
Every MCP reference implementation (spec, GitHub MCP, Cursor built-ins) uses snake_case. CLI stays kebab-case; the kebab→snake translation lives at the MCP-arg layer.
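As a rough illustration of the kebab→snake translation at the MCP-arg layer, the sketch below converts flag names in both directions. The helper names `cliFlagToMcpKey` and `mcpKeyToCliFlag` are hypothetical and not taken from the codebase; `--group-by` is assumed to be the CLI spelling of `group_by`.

```typescript
// Hypothetical helper: kebab-case CLI flag -> snake_case MCP input key.
function cliFlagToMcpKey(flag: string): string {
  return flag.replace(/^--/, "").replace(/-/g, "_");
}

// Hypothetical reverse mapping: snake_case MCP input key -> kebab-case CLI flag.
function mcpKeyToCliFlag(key: string): string {
  return "--" + key.replace(/_/g, "-");
}

const mcpKey = cliFlagToMcpKey("--changed-since");
const cliFlag = mcpKeyToCliFlag("group_by");
```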
* docs(plans): settle Q5 — single polymorphic save_baseline tool
save_baseline({name, sql?, recipe?}) with runtime check that exactly one of sql/recipe is set. Mirrors the CLI's single --save-baseline verb. All 5 grill questions now settled — ready for tracer 2.
* feat(mcp): query + query_batch tools (Tracer 2 of 7)
Wires the first two real MCP tools per plan §§ 3, 5, 11:
- `query` — wraps a single SELECT against .codemap.db. Returns the
exact JSON envelope `codemap query --json` would print: row array
by default, {count} under summary, {group_by, groups} under
group_by. Mirrors plan § 4's uniformity commitment (MCP responses
structurally identical to CLI output).
- `query_batch` — N statements in one round-trip. Items are either
bare SQL strings (inherit batch-wide flags) or objects {sql,
summary?, changed_since?, group_by?} that override on a per-key
basis. Returns N envelopes; per-element shape mirrors single-
query output for that statement's effective flag set. Per-statement
errors are isolated — a failed statement returns {error} in its
slot while siblings still execute (partial-success semantics that
match what an agent expects when batching independent reads).
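The batch-wide-defaults plus per-statement-overrides rule for `query_batch` can be sketched as follows. This is a minimal illustration, not the shipped code: the flag types (`summary` as boolean, `changed_since` and `group_by` as strings) are assumptions, and while `mergeBatchItem` is a name the commit mentions, the body here is a guess at its behavior.

```typescript
// Flag names come from the commit message; their types are assumed.
interface QueryFlags {
  summary?: boolean;
  changed_since?: string;
  group_by?: string;
}

type BatchItem = string | ({ sql: string } & QueryFlags);
type EffectiveItem = { sql: string } & QueryFlags;

// String items inherit every batch-wide flag; object items override per key.
function mergeBatchItem(item: BatchItem, defaults: QueryFlags): EffectiveItem {
  if (typeof item === "string") return { ...defaults, sql: item };
  const merged: EffectiveItem = { ...defaults, sql: item.sql };
  if (item.summary !== undefined) merged.summary = item.summary;
  if (item.changed_since !== undefined) merged.changed_since = item.changed_since;
  if (item.group_by !== undefined) merged.group_by = item.group_by;
  return merged;
}

const batchDefaults: QueryFlags = { summary: true, changed_since: "main" };
const inherited = mergeBatchItem("SELECT 1", batchDefaults);
const overridden = mergeBatchItem({ sql: "SELECT 2", summary: false }, batchDefaults);
```

Copying only the keys that are explicitly set keeps an omitted key from masking a batch-wide default.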
Layering follows the cmd-audit.ts <-> audit-engine.ts seam from
PR #33:
- src/application/query-engine.ts — pure transport-agnostic
executeQuery + executeQueryBatch returning JSON envelopes (no
console.log). Mirrors printQueryResult's logic but returns
the data instead of printing it.
- src/application/mcp-server.ts — registers both tools, handles
flag merging (string vs object form), resolves --changed-since
refs to file sets via getFilesChangedSince (memoised per-ref
across batch items so a batch with N items sharing the same ref
does one git invocation, not N).
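The per-ref memoisation described above might look roughly like this. The resolver passed in stands in for `getFilesChangedSince`, whose real signature is not shown in this commit message; `makeChangedSinceResolver` is a hypothetical name.

```typescript
// Wraps a ref -> file-list resolver so each distinct ref is resolved once.
// `resolve` is a stand-in for the real git-backed lookup (assumed signature).
function makeChangedSinceResolver(
  resolve: (ref: string) => string[],
): (ref: string) => string[] {
  const cache = new Map<string, string[]>();
  return (ref: string): string[] => {
    const hit = cache.get(ref);
    if (hit !== undefined) return hit;
    const files = resolve(ref);
    cache.set(ref, files);
    return files;
  };
}

// Fake resolver that counts invocations instead of shelling out to git.
let gitCalls = 0;
const resolveOnce = makeChangedSinceResolver((ref) => {
  gitCalls += 1;
  return [`src/changed-since-${ref}.ts`];
});

// Three batch items, two distinct refs: the fake resolver runs twice.
const refs = ["main", "main", "release"];
const resolved = refs.map((ref) => resolveOnce(ref));
```

Creating one resolver per batch call keeps the cache scoped to that call, which matches the per-tool-call lifetime listed under the known follow-ups.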
Tests: 8 engine tests cover default rows / summary / group_by /
error / changedFiles paths plus batch isolation; 8 in-process MCP
tests use @modelcontextprotocol/sdk's InMemoryTransport to verify
tools/list, single-query envelope, summary, error payload, batch
defaults, per-statement override, string-form inheritance, and
partial-error isolation.
Replaces the ping stub from Tracer 1.
* feat(mcp): query_recipe tool (Tracer 3 of 7)
Wires the query_recipe MCP tool per plan §§ 3, 5, 11. Looks up the
SQL + per-row actions from src/cli/query-recipes.ts (data registry,
no execution flow crosses cli → application — layer note in
mcp-server.ts), then delegates to executeQuery with recipeActions
threaded through. Per-row actions land verbatim on each output
row exactly as `codemap query --json --recipe <id>` would print.
Composes with the same flag set as `query` (summary, changed_since,
group_by) — same JSON envelope contract, same per-flag shape.
Unknown recipe id returns a structured {error} payload pointing
the agent at the codemap://recipes resource (lands in tracer 6).
Tests: 4 in-process MCP tests for tools/list, unknown-recipe error,
actions-attached-to-rows (via seeded @deprecated symbol), and
summary composition.
* feat(mcp): audit + context + validate tools (Tracer 4 of 7)
Wires three more MCP tools per plan §§ 3, 5, 11.
- audit — composes per-delta baselines (files/dependencies/deprecated)
into the same {head, deltas} envelope `codemap audit --json` prints.
Args: baseline_prefix (auto-resolves <prefix>-<deltaKey>), baselines
(explicit per-delta override), summary (collapses each delta to
{added: N, removed: N}), no_index (skip auto-incremental-index
prelude — default re-indexes so head reflects current source).
Reuses resolveAuditBaselines + runAudit from PR #33's engine
unchanged; no new business logic.
- context — wraps buildContextEnvelope. Returns the same
{codemap: {schema_version, ...}, project: {root, file_count, ...},
hubs?, sample_markers?} envelope `codemap context --json` prints.
The agent-shaped session-bootstrap call: one round-trip replaces
4-5 query calls.
- validate — wraps computeValidateRows. Compares on-disk SHA-256 to
indexed files.content_hash, returns rows with status
('ok'/'changed'/'missing'). Empty `paths: []` validates every
indexed file. Unlocks "codemap doctor" agents that diagnose
stale .codemap.db before issuing structural queries (the use case
surfaced in plan § 12 Q1).
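The status classification that `validate` returns can be illustrated with a small pure function. This is a sketch under assumptions: `validateRows` and the `hashOnDisk` callback are hypothetical (the real `computeValidateRows` signature is not shown here), and the handling of requested paths that are not in the index is simply left out.

```typescript
type ValidateStatus = "ok" | "changed" | "missing";

interface IndexedFile {
  path: string;
  content_hash: string; // hash recorded at index time
}

interface ValidateRow {
  path: string;
  status: ValidateStatus;
}

// `hashOnDisk` returns the current hash of a file, or null if it no longer exists.
function validateRows(
  indexed: IndexedFile[],
  hashOnDisk: (path: string) => string | null,
  paths: string[] = [],
): ValidateRow[] {
  // An empty `paths` array means "validate every indexed file".
  const wanted = paths.length === 0 ? indexed : indexed.filter((f) => paths.includes(f.path));
  return wanted.map((f) => {
    const current = hashOnDisk(f.path);
    const status: ValidateStatus =
      current === null ? "missing" : current === f.content_hash ? "ok" : "changed";
    return { path: f.path, status };
  });
}

// Fake index and fake disk state for illustration.
const indexedFiles: IndexedFile[] = [
  { path: "a.ts", content_hash: "h1" },
  { path: "b.ts", content_hash: "h2" },
  { path: "c.ts", content_hash: "h3" },
];
const diskHashes: Record<string, string> = { "a.ts": "h1", "b.ts": "edited" };
const validateResult = validateRows(indexedFiles, (p) => diskHashes[p] ?? null);
```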
Tests: 5 new in-process MCP tests for tools/list (now expects
audit/context/validate in addition to query/query_batch/query_recipe),
audit's no-baseline-resolves error, audit envelope shape (using a
seeded snap-files baseline that matches the seeded files exactly →
no drift), context envelope shape smoke test, validate happy path.
Layer note: cmd-audit's resolveAuditBaselines, cmd-context's
buildContextEnvelope, and cmd-validate's computeValidateRows are
imported from src/cli/* (their CLI verb owns the function today).
Same layer-reversal allowance as query-recipes — pure functions /
data registry, no execution flow crosses cli → application.
* feat(mcp): save_baseline + list_baselines + drop_baseline (Tracer 5 of 7)
Wires the three baseline MCP tools per plan §§ 3, 5, 11. Settled in
the grill round (plan § 12 Q5): save_baseline ships as ONE polymorphic
tool with optional sql / recipe inputs (mirrors the CLI's single
--save-baseline=<name> verb), with a runtime check that exactly one
of sql / recipe is set. Two-tools alternative was rejected — fragments
the surface for one conceptual operation.
- save_baseline({name, sql? | recipe?}) — runs the SQL (resolved from
recipe id if needed), captures rows, upserts into query_baselines
with name + recipe_id + sql + rows_json + row_count + git_ref +
created_at. Reuses upsertQueryBaseline directly.
- list_baselines() — returns the array `codemap query --baselines
--json` would print (no rows_json payload).
- drop_baseline({name}) — deletes the named baseline. Returns
{dropped: <name>} on success or isError if name doesn't exist.
git_ref capture uses tryGetGitRefSafe (mirrors the helper in
cmd-query.ts; kept local to avoid a cli → application import).
git rev-parse may legitimately fail (no git, detached worktree) —
baselines record git_ref = NULL in that case.
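The runtime exclusivity check on `save_baseline` could be written along these lines. `resolveBaselineSource`, the result shape, the error wording, and the recipe id `some-recipe` are all hypothetical; only the "exactly one of sql / recipe" rule comes from the commit.

```typescript
interface SaveBaselineArgs {
  name: string;
  sql?: string;
  recipe?: string;
}

type BaselineSource =
  | { kind: "sql"; sql: string }
  | { kind: "recipe"; recipe: string }
  | { error: string };

// Rejects both "neither set" and "both set"; otherwise reports which input was given.
function resolveBaselineSource(args: SaveBaselineArgs): BaselineSource {
  const { sql, recipe } = args;
  if ((sql === undefined) === (recipe === undefined)) {
    return { error: "save_baseline: pass exactly one of sql / recipe" };
  }
  if (sql !== undefined) return { kind: "sql", sql };
  return { kind: "recipe", recipe: recipe as string };
}

const bothSet = resolveBaselineSource({ name: "b", sql: "SELECT 1", recipe: "some-recipe" });
const neitherSet = resolveBaselineSource({ name: "b" });
const sqlOnly = resolveBaselineSource({ name: "b", sql: "SELECT 1" });
const recipeOnly = resolveBaselineSource({ name: "b", recipe: "some-recipe" });
```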
Tests: 7 new in-process MCP tests cover tools/list, the runtime
exclusivity check (both ways), SQL save + list round-trip, recipe
save (recipe_id surfaces in payload), unknown-recipe error, and
drop-then-redrop semantics.
* feat(mcp): codemap:// resources, lazy-cached (Tracer 6 of 7)
Wires the four MCP resources per plan §§ 7, 11. Settled in the grill
round (plan § 12 Q3): lazy memoisation — every resource is constant
for the server-process lifetime, so eager-vs-lazy produce identical
observable behavior; lazy keeps boot lean for sessions that never
call read_resource.
- codemap://recipes — full catalog JSON (same as --recipes-json).
Reuses listQueryRecipeCatalog().
- codemap://recipes/{id} — one recipe ({id, description, sql,
actions?}). Template form: list-callback enumerates one URI per
recipe id so resources/list surfaces the catalog. Replaces
--print-sql for agents.
- codemap://schema — DDL of every indexed table, queried live from
sqlite_schema (lets the agent discover what columns exist without
reading docs).
- codemap://skill — full text of templates/agents/skills/codemap/
SKILL.md via resolveAgentsTemplateDir(). Agents that don't preload
the bundled skill at session start fetch it here.
Caches are per-server-instance Map / single-string memos populated
on first read_resource call. Never invalidated — server process is
short-lived (agent host respawns it on package update or session
restart).
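A single-value lazy memo of the kind described above can be as small as the sketch below. The `lazy` helper and the fake loader are illustrative only; the shipped code may structure its caches differently.

```typescript
// Defers `load` until the first call, then returns the cached value forever.
function lazy<T>(load: () => T): () => T {
  let cached: { value: T } | undefined;
  return () => {
    if (cached === undefined) {
      cached = { value: load() };
    }
    return cached.value;
  };
}

// Fake loader standing in for reading SKILL.md from disk.
let skillLoads = 0;
const readSkill = lazy(() => {
  skillLoads += 1;
  return "skill text";
});

// Nothing is loaded at "boot"; the first read populates the memo.
const loadsBeforeFirstRead = skillLoads;
const firstRead = readSkill();
const secondRead = readSkill();
```

Wrapping the value in an object lets the memo distinguish "not loaded yet" from a loaded value that happens to be `undefined`.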
Tests: 5 new in-process tests cover resources/list (3 static + N
templated by recipe id), each resource's payload shape, and the
SKILL.md frontmatter sanity check.
* docs(mcp): architecture/glossary/README/agent rule + skill (Tracer 7 of 7)
Lifts the canonical bits out of docs/plans/agent-transports.md per
docs/README.md Rule 2 (delete plans on ship), with a small self-audit
cleanup pass on src/application/mcp-server.ts (Rule 9 + concise-comments).
Docs:
- architecture.md § CLI usage gains an "MCP wiring" paragraph
documenting the cmd-mcp.ts ↔ mcp-server.ts seam, the engine
reuse pattern (executeQuery, runAudit, etc.), tool naming
(snake_case at MCP layer; CLI stays kebab), and lazy resource
caching.
- glossary.md adds two entries under § M: `codemap mcp` / MCP
server, and `query_batch` (MCP-only tool) — covers the new
domain nouns per Rule 9.
- roadmap.md replaces the "Agent transports v1+v1.x" entry with
just the v1.x slice (codemap serve / HTTP API), since v1
shipped.
- README.md CLI block adds the MCP server example.
- .agents/rules/codemap.md + templates/agents/rules/codemap.md
(mirrored per Rule 10): new MCP section in the table + an
"MCP server (codemap mcp)" reference block listing all tools,
query_batch's polymorphic shape, save_baseline's exclusivity,
the four resources, and the output-shape uniformity guarantee.
- .agents/skills/codemap/SKILL.md + templates/agents/skills/
codemap/SKILL.md (mirrored): full per-tool reference + per-
resource reference, with implementation notes (cmd-mcp.ts ↔
mcp-server.ts seam, changed_since memoisation per (root, ref)).
- docs/plans/agent-transports.md DELETED (Rule 2 — plan content
fully lifted into architecture.md and the agent files).
Self-audit cleanup on mcp-server.ts:
- Outdated header docstring trimmed (was "Tracer 2 wires X;
subsequent tracers add Y" — now all shipped).
- Extracted `isEnginePayloadError` type-guard helper to dedup the
4-line error-payload narrowing (previously inlined in query,
query_recipe, save_baseline handlers).
- mergeBatchItem comment slimmed (was 4 lines, now 1) per
concise-comments rule.
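A type guard like the extracted `isEnginePayloadError` might look as follows. The commit names the helper but does not show its body, so this body, and the assumption that error payloads carry a string `error` field, are illustrative.

```typescript
interface EnginePayloadError {
  error: string;
}

// Narrows an unknown engine result to the {error} payload shape.
function isEnginePayloadError(payload: unknown): payload is EnginePayloadError {
  return (
    typeof payload === "object" &&
    payload !== null &&
    !Array.isArray(payload) &&
    typeof (payload as { error?: unknown }).error === "string"
  );
}

// The three success/error shapes the commit describes for `query`.
const rowsPayload: unknown = [{ name: "foo" }];
const summaryPayload: unknown = { count: 3 };
const errorPayload: unknown = { error: "no such table: nope" };

const guardResults = [rowsPayload, summaryPayload, errorPayload].map(isEnginePayloadError);
```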
Changeset bumped to minor (was patch placeholder for the scaffold)
and rewritten to describe the complete shipped surface — squash-merge
pulls all 7 tracers into one ship-defining commit.
Known follow-ups (deferred — separate PRs):
- Lift cli/* imports (resolveAuditBaselines, buildContextEnvelope,
computeValidateRows, query-recipes) into application/ — currently
4 layer-reversal imports, each documented but accumulating.
- Server-lifetime changed_since cache (currently per-tool-call) —
needs a staleness-invalidation story before it's safe.
- Pool DB connection (currently openDb/closeDb per tool call) —
measure first; bun:sqlite open is microseconds.
- Split mcp-server.ts when it crosses ~1000 lines (currently 770).
* fix(mcp): address PR #35 CodeRabbit feedback (1 Critical + 1 Major + 6 Minor)
All 8 verified correct against the actual code; all applied.
Critical:
- query-engine.ts: enforce read-only at the engine boundary via
PRAGMA query_only = 1. Without it, agents could DROP / DELETE /
UPDATE through `query` / `query_recipe` / `query_batch` despite
these tools being contractually read-only. SQLite-level
enforcement — parser-proof, doesn't bleed across calls (closeDb
discards the connection). Two new tests confirm DML and DDL
are rejected and the data is preserved.
Major:
- mcp-server.ts: query_batch now isolates changed_since failures
per slot. A bad git ref in statement i lands in slot i with
{error} instead of aborting the whole batch — matches the
per-statement isolation contract documented in plan § 5 and
executeQueryBatch's docstring. Refactored the loop to map over
args.statements with a per-item try/catch. Dropped the now-unused
executeQueryBatch import (the helper still ships as a public
query-engine API for non-MCP callers; tests cover it). New test
uses a deliberately-bad ref to confirm the sibling statement
still runs.
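The per-item try/catch that gives each statement its own slot can be sketched generically. `runBatchIsolated` and the fake `run` callback are hypothetical; the real loop in mcp-server.ts also resolves `changed_since` and executes SQL inside the guarded region.

```typescript
type Slot<T> = T | { error: string };

// Maps over items; a throw in one item becomes {error} in that slot only.
function runBatchIsolated<I, T>(items: I[], run: (item: I) => T): Slot<T>[] {
  return items.map((item): Slot<T> => {
    try {
      return run(item);
    } catch (err) {
      return { error: err instanceof Error ? err.message : String(err) };
    }
  });
}

// Fake executor: the middle statement fails like a bad git ref would.
const batchSlots = runBatchIsolated(["ok-1", "bad-ref", "ok-2"], (item) => {
  if (item === "bad-ref") throw new Error("unknown ref: bad-ref");
  return { rows: [item] };
});
```

Because the try/catch wraps the whole per-item body, any failure that happens before the statement executes also lands in that item's slot rather than aborting its siblings.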
Minor:
- mcp-server.ts: createMcpServer now carries a JSDoc contract
(purpose, opts contract, lifecycle ownership note for tests vs
prod).
- query-engine.ts: QueryResultPayload + ExecuteQueryError exported
types now carry JSDoc covering shape discriminants and narrowing
guidance.
- cmd-mcp.ts: --help text refreshed — was still describing Tracer 1
with only the ping stub and pointing at docs/plans/agent-
transports.md (which was deleted in Tracer 7). Now lists every
shipped tool + resource and points at architecture.md.
- README.md: clarified `query_batch` is MCP-only (it isn't a CLI
verb, so "one tool per CLI verb" read too literally).
- glossary.md / architecture.md / .agents rule + skill / templates
rule + skill (mirrored across all 6 surfaces per Rule 10):
reworded "matches MCP spec" → "Codemap convention matching MCP
spec examples + reference servers (GitHub MCP, Cursor built-ins);
spec is convention-agnostic." The protocol doesn't mandate
snake_case — it's our convention informed by the ecosystem.
Architecture L129 claim ("per-statement errors are isolated") is
no longer narrowed — the underlying behavior in mcp-server.ts now
matches the broad claim, so the docs stay as-is.
1 parent 7718691 commit 119db38
20 files changed
Lines changed: 2142 additions & 245 deletions
File tree
- .agents
- rules
- skills/codemap
- .changeset
- docs
- plans
- src
- application
- cli
- templates/agents
- rules
- skills/codemap