Commit 4ec51d8

feat(serve): codemap serve — HTTP API exposing every MCP tool over POST /tool/{name} (#44)
* refactor(mcp): extract pure tool-handlers.ts (Tracer 1 of 7 — codemap serve prereq)

Pulls every MCP tool body (11 tools) into application/tool-handlers.ts — pure transport-agnostic functions returning a discriminated ToolResult ({ok: true, format: 'json'|'sarif'|'annotations', payload} | {ok: false, error}). MCP wrapper (mcp-server.ts) now slim: each registerXxxTool call is one wrapToolResult(handle*(args, opts.root)) line.

Why: prereq for the HTTP transport coming in Tracers 2-6. Keeping handlers pure means HTTP can dispatch the exact same logic without depending on the MCP SDK.

Also moves the formatToolIncompatibility guard + runFormattedQuery sarif/annotations wiring + makeChangedFilesResolver + tryGetGitRefSafe into the pure module.

mcp-server.ts shrinks ~640 → ~370 LOC; tool-handlers.ts adds ~600 LOC. All 42 existing MCP server tests still pass — refactor is behavior-preserving (verified via 'bun test src/application/mcp-server.test.ts').

* feat(serve): cmd-serve.ts parser + http-server.ts skeleton (Tracer 2 of 7)

Adds the 'codemap serve' CLI verb + boots a long-running node:http listener. Skeleton routes:

- GET /health (auth-exempt liveness probe)
- GET /tools (catalog of 11 names)
- POST /tool/{name} → 501 stub (Tracer 3+ wires individual handlers)
- 404 for unknown routes / tools
- 401 when --token <secret> is set and the Authorization: Bearer header doesn't match (auth check plumbed; Tracer 5 will add tests + docs)

Defaults: 127.0.0.1:7878. SIGINT/SIGTERM → graceful drain. Bare node:http (no Express/Fastify dep).

14 parser tests cover --port / --host / --token (both space and = forms), defaults, error paths, unknown flag rejection.

Smoke verified: 'bun src/index.ts serve --port 7879' boots; curl /health returns {ok, version} + X-Codemap-Version header; /tools returns the catalog; /tool/query 501s with the Tracer 3+ message; SIGTERM drains cleanly.

* feat(serve): POST /tool/query end-to-end (Tracer 3 of 7)

First wired tool. dispatchTool() reads JSON body (1 MiB cap to prevent trivial DoS), routes by name, calls the pure handler from tool-handlers.ts, then writeToolResult() translates ToolResult → HTTP response with the right Content-Type:

- format: 'json' → application/json
- format: 'sarif' → application/sarif+json (proper IANA media type)
- format: 'annotations' → text/plain
- !ok → 400 + {error}

Smoke verified all four shapes:

- POST /tool/query {sql:'SELECT ...'} → 200 [...rows]
- POST /tool/query {sql:'...', format:'sarif'} → 200 application/sarif+json + SARIF doc
- POST /tool/query {sql:'SELECT * FROM nonexistent'} → 400 {error: 'no such table: ...'}
- POST /tool/query (invalid JSON body) → 400 {error: 'invalid JSON body: ...'}

Other 10 tools still return 501 — Tracer 4 wires them in one batch.

* feat(serve): wire remaining 10 tools (audit, context, validate, show, snippet, query_recipe, query_batch, baseline trio) (Tracer 4 of 7)

Switch-dispatch from POST /tool/{name} → corresponding pure handler in tool-handlers.ts. Every tool now responds:

- audit (async — runs incremental index unless no_index: true)
- context, validate, show, snippet
- query_recipe, query_batch
- save_baseline, list_baselines, drop_baseline

handleRequest exported so tests can attach to their own createServer() (skipping runHttpServer's SIGINT-awaiting outer loop).

17 integration tests cover health + tools catalog + every wired tool + sarif/annotations Content-Type + 400 on bad SQL / invalid JSON / unknown recipe + 404 on unknown tool.

Smoke verified all 11 tools end-to-end: every CLI verb is reachable via POST /tool/{name} with the same envelope shape codemap query --json prints.
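The ToolResult → HTTP translation described for writeToolResult() can be sketched roughly as below. This is an illustration only, not the code from http-server.ts: the `toHttpResponse` name and the `HttpResponse` shape are hypothetical, the string pass-through for payloads is an assumption, and the optional `status` field on the error arm reflects the later CodeRabbit fix in this same commit.

```typescript
// Hypothetical sketch of the ToolResult -> HTTP translation; not the code from http-server.ts.
type ToolResult =
  | { ok: true; format: "json" | "sarif" | "annotations"; payload: unknown }
  | { ok: false; error: string; status?: 400 | 404 | 500 };

// Illustrative response shape (an assumption made for this sketch).
interface HttpResponse {
  status: number;
  contentType: string;
  body: string;
}

// Content-Type per format, as listed in the commit message.
const CONTENT_TYPES = {
  json: "application/json",
  sarif: "application/sarif+json",
  annotations: "text/plain",
} as const;

function toHttpResponse(result: ToolResult): HttpResponse {
  if (!result.ok) {
    // Error arm: 400 unless the handler classified it as 404 or 500.
    return {
      status: result.status ?? 400,
      contentType: "application/json",
      body: JSON.stringify({ error: result.error }),
    };
  }
  // Assumption: pre-formatted text payloads are passed through; anything else is serialized as JSON.
  const body =
    typeof result.payload === "string" ? result.payload : JSON.stringify(result.payload);
  return { status: 200, contentType: CONTENT_TYPES[result.format], body };
}
```

Keeping the mapping in one pure function means the transport layer only has to copy `status`, `contentType`, and `body` onto the response object, which is the property the shared-handler design relies on.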
* test(serve): lock --token Bearer auth (Tracer 5 of 7)

Auth check was already plumbed in Tracer 2; Tracer 5 adds the integration tests that lock the contract:

- POST without Authorization → 401 + {error: "...Bearer..."}
- POST with wrong Bearer token → 401
- POST with correct Bearer token → 200 + payload
- GET /health is auth-exempt (liveness probes work without leaking the token)
- GET /tools requires the token (catalog-leak protection — agents shouldn't enumerate tools without auth)

5 new tests; 22 pass total.

* feat(serve): GET /resources/{uri} mirroring MCP resources (Tracer 6 of 7)

New application/resource-handlers.ts: pure transport-agnostic resource fetchers shared between MCP and HTTP. Same lazy-cache-on-first-read pattern (resources are constant for server-process lifetime so no invalidation needed); _resetResourceCachesForTests() escape hatch for temp-DB tests.

HTTP routes:

- GET /resources → catalog ({resources: [{uri, description}]})
- GET /resources/{encoded uri} → payload with the right mimeType
- 400 on invalid percent-encoding; 404 on unknown URI

mcp-server.ts's registerResources slimmed: 4 static URIs go through registerStaticResource() helper that delegates to readResource(); the recipe-template URI shares the same readResource lookup. ~150 LOC removed in mcp-server.ts; same observable behavior (42 MCP tests still pass).

6 new HTTP integration tests cover catalog + each resource type + 404 paths.

* docs: sync architecture / glossary / README / agents (Rule 10) + delete plan + changeset (Tracer 7 of 7)

- docs/architecture.md: new 'HTTP wiring' paragraph after MCP wiring; new 'Tool / resource handlers (transport-agnostic)' paragraph documenting the shared tool-handlers.ts + resource-handlers.ts modules; application/ table extended with the four new files (output-formatters, tool-handlers, resource-handlers, http-server).
- docs/glossary.md: new 'codemap serve / HTTP server' entry under ## S; 'codemap mcp' entry updated to include show + snippet tools and to reference the shared transport-agnostic handlers.
- docs/roadmap.md: 'codemap serve (HTTP API, v1.x)' line removed (shipped per Rule 2).
- README.md: 'Daily commands' stripe extended with codemap serve example (port + token + curl).
- .agents/rules/codemap.md + templates/agents/rules/codemap.md (Rule 10): new 'HTTP server (for non-MCP)' row in the CLI table.
- .agents/skills/codemap/SKILL.md + templates/agents/skills/codemap/SKILL.md: new 'HTTP server' paragraph next to 'MCP server'; documents loopback-default, --token, output-shape difference (no MCP {content: [...]} wrapper), shared-handlers note.
- .changeset/codemap-serve.md: minor changeset (new top-level CLI verb).
- docs/plans/codemap-serve.md: deleted on ship per docs-governance Rule 3.
- src/{application/http-server.ts, cli/cmd-serve.ts}: replaced dangling cross-refs to the deleted plan with cross-refs to architecture.md § HTTP wiring.

* fix(serve): CSRF + DNS-rebinding guard on every request (security audit)

Self-audit found a real attack vector: HTTP API on 127.0.0.1 is reachable from any locally-running browser tab via fetch. Without an Origin / Sec-Fetch-Site check, a malicious page at evil.com can:

- POST /tool/save_baseline / drop_baseline → CSRF (request reaches us, state mutates; CORS only blocks the response from being read, not the request itself).
- DNS-rebind to bypass the loopback-only bind (evil.com → 127.0.0.1 after page load; browser sends Host: evil.com:7878).

New csrfCheck() runs BEFORE every route (including auth-exempt /health so a malicious page can't even probe for liveness):

1. Sec-Fetch-Site = cross-site / same-site → 403 (modern browsers always send).
2. Host header mismatch on loopback bind → 403 (DNS rebinding).
3. Origin header set + non-null → 403 (older-browser fallback).

Non-browser clients (curl, fetch from Node, MCP hosts, CI scripts) don't send any of these headers and pass through. When --host 0.0.0.0 is set the user explicitly opted in to broader exposure; the Host check is skipped (Host could legitimately be any hostname/IP that resolves to the bound interface).

Bug found during testing: csrfCheck used opts.port for the allowed-Host set, but tests bind to port 0 (OS picks). Fixed by using req.socket.localPort (always the real listening port).

10 new tests cover every attack vector + every legit pass-through. End-to-end smoke verified with curl: Origin, Sec-Fetch-Site, Host all gate correctly.

Also: docs/architecture.md HTTP wiring + docs/glossary.md codemap serve entry + changeset all updated to call out the guard. bun audit clean (no new deps — bare node:http).

* fix(serve): Zod validation + 404/500 status + IPv6 host + 7 doc nits (CodeRabbit on #44)

10 CodeRabbit threads. All verified ✅ correct.

**Major bugs:**

- (#5) IPv6 host bracketing — new URL('/foo', 'http://::1:7878') threw because IPv6 literals need brackets per RFC 3986. Fixed by wrapping when host contains ':' and isn't already bracketed.
- (#6) HTTP path bypassed Zod validation — schemas were exported but never applied; handlers received unvalidated 'any'-cast args. Added per-tool validate() helper that wraps the ZodRawShape with z.object() and safeParse()s; failure → structured 400 with '<path>: <message>' joined error string. Mirrors what MCP gets for free via the SDK's inputSchema. 7 new tests cover missing required fields, type mismatches, and per-tool error message format.
- (#7) Error classification — ToolResult error arm gained an optional status field (400 default | 404 | 500). query_recipe + save_baseline 'unknown recipe' and drop_baseline 'no baseline named X' now return 404 instead of 400 (semantics matter for HTTP consumers branching on status). All catch-all engine throws (try/catch) marked 500. MCP transport ignores the field; HTTP transport reads it via writeToolResult.

**Doc nits (batch):**

- (#1) .agents/skills/codemap/SKILL.md — dropped 'v1.x backlog: codemap serve' (it shipped); replaced with pointer to tool-handlers.ts + resource-handlers.ts shared modules.
- (#2) docs/architecture.md — buildContextEnvelope/computeValidateRows location updated from src/cli/cmd-*.ts (pre-PR #41) to src/application/{context,validate}-engine.ts.
- (#3) docs/glossary.md — softened 'modern browsers always send' to 'send on cross-origin fetches (header presence varies by request type, browser, and privacy settings)'. Accurate without absolutism.
- (#4) README.md — TOKEN=$(openssl rand -hex 32) hoisted out of the codemap serve invocation so the curl command actually has a defined $TOKEN.
- (#8) bootstrap.ts printCliUsage() — added 'codemap serve' entry so it's discoverable from --help.
- (#9) cmd-serve.ts help text — added GET /resources catalog route + extended the error-status enumeration to 400/401/403/404/500 (was 400/401/404/500).
- (#10) templates/agents/rules/codemap.md + .agents/rules/codemap.md — added --host flag to the serve table row.

50 HTTP tests + 42 MCP tests still green. No new dependency vulnerabilities (bun audit clean — bare node:http).
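The IPv6 bracketing rule from thread #5 (wrap the host when it contains ':' and is not already bracketed) might look like the following. The helper names `hostForUrl` and `baseUrl` are hypothetical; the commit message states the rule but does not show the function it lives in.

```typescript
// Hypothetical helpers illustrating the bracketing rule; names are not taken from the commit.
function hostForUrl(host: string): string {
  const alreadyBracketed = host.startsWith("[") && host.endsWith("]");
  // A ':' in the host means an IPv6 literal, which needs [] inside a URL authority.
  return host.includes(":") && !alreadyBracketed ? `[${host}]` : host;
}

function baseUrl(host: string, port: number): string {
  return `http://${hostForUrl(host)}:${port}`;
}

// Usage: resolving a request path against an IPv6 loopback bind.
const healthUrl = new URL("/health", baseUrl("::1", 7878));
console.log(healthUrl.href);
```

IPv4 addresses and hostnames contain no ':' and pass through unchanged, so the same helper can be applied to every `--host` value.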
1 parent 4061ac3 commit 4ec51d8

18 files changed

Lines changed: 2529 additions & 770 deletions

.agents/rules/codemap.md

Lines changed: 18 additions & 17 deletions
@@ -12,23 +12,24 @@ A local database (default **`.codemap.db`**) indexes structure: symbols, imports

 ## CLI (this repository)

-| Context | Incremental index | Query |
-| ------------------------------ | ------------------ | ---------------------------------------------------------------------------------------------------------------------- |
-| **Default** — from this clone | `bun src/index.ts` | `bun src/index.ts query --json "<SQL>"` |
-| Same entry | `bun run dev` | (same as first row) |
-| Query (ASCII table — optional) || `bun src/index.ts query "<SQL>"` |
-| Recipe || `bun src/index.ts query --json --recipe fan-out` (see **`bun src/index.ts query --help`**) |
-| Recipe catalog / SQL || `bun src/index.ts query --recipes-json` · `bun src/index.ts query --print-sql fan-out` |
-| Counts only || `bun src/index.ts query --json --summary -r deprecated-symbols` |
-| PR-scoped rows || `bun src/index.ts query --json --changed-since origin/main -r fan-out` |
-| Bucket by owner / dir / pkg || `bun src/index.ts query --json --group-by directory -r fan-in` |
-| Save / diff a baseline || `bun src/index.ts query --save-baseline -r visibility-tags` then `… --json --baseline -r visibility-tags` |
-| List / drop baselines || `bun src/index.ts query --baselines` · `bun src/index.ts query --drop-baseline <name>` |
-| Per-delta audit || `bun src/index.ts audit --json --baseline base` (auto-resolves `base-files` / `base-dependencies` / `base-deprecated`) |
-| MCP server (for agent hosts) || `bun src/index.ts mcp` — JSON-RPC on stdio; one tool per CLI verb. See **MCP** section below. |
-| Targeted read (metadata) || `bun src/index.ts show <name> [--kind <k>] [--in <path>] [--json]` — file:line + signature |
-| Targeted read (source text) || `bun src/index.ts snippet <name> [--kind <k>] [--in <path>] [--json]` — same lookup + source from disk + stale flag |
-| SARIF / GH annotations || `bun src/index.ts query --recipe deprecated-symbols --format sarif` · `… --format annotations` |
+| Context | Incremental index | Query |
+| ------------------------------ | ------------------ | ------------------------------------------------------------------------------------------------------------------------- |
+| **Default** — from this clone | `bun src/index.ts` | `bun src/index.ts query --json "<SQL>"` |
+| Same entry | `bun run dev` | (same as first row) |
+| Query (ASCII table — optional) || `bun src/index.ts query "<SQL>"` |
+| Recipe || `bun src/index.ts query --json --recipe fan-out` (see **`bun src/index.ts query --help`**) |
+| Recipe catalog / SQL || `bun src/index.ts query --recipes-json` · `bun src/index.ts query --print-sql fan-out` |
+| Counts only || `bun src/index.ts query --json --summary -r deprecated-symbols` |
+| PR-scoped rows || `bun src/index.ts query --json --changed-since origin/main -r fan-out` |
+| Bucket by owner / dir / pkg || `bun src/index.ts query --json --group-by directory -r fan-in` |
+| Save / diff a baseline || `bun src/index.ts query --save-baseline -r visibility-tags` then `… --json --baseline -r visibility-tags` |
+| List / drop baselines || `bun src/index.ts query --baselines` · `bun src/index.ts query --drop-baseline <name>` |
+| Per-delta audit || `bun src/index.ts audit --json --baseline base` (auto-resolves `base-files` / `base-dependencies` / `base-deprecated`) |
+| MCP server (for agent hosts) || `bun src/index.ts mcp` — JSON-RPC on stdio; one tool per CLI verb. See **MCP** section below. |
+| HTTP server (for non-MCP) || `bun src/index.ts serve [--host 127.0.0.1] [--port 7878] [--token <secret>]` — same tool taxonomy over POST /tool/{name}. |
+| Targeted read (metadata) || `bun src/index.ts show <name> [--kind <k>] [--in <path>] [--json]` — file:line + signature |
+| Targeted read (source text) || `bun src/index.ts snippet <name> [--kind <k>] [--in <path>] [--json]` — same lookup + source from disk + stale flag |
+| SARIF / GH annotations || `bun src/index.ts query --recipe deprecated-symbols --format sarif` · `… --format annotations` |

 **Recipe `actions`:** with **`--json`**, recipes that define an `actions` template append it to every row (kebab-case verb + description — e.g. `fan-out` → `review-coupling`). Under `--baseline`, actions attach to the **`added`** rows only. Inspect via **`--recipes-json`**. Ad-hoc SQL never carries actions.

.agents/skills/codemap/SKILL.md

Lines changed: 3 additions & 1 deletion
@@ -56,6 +56,8 @@ Each emitted delta carries its own `base` metadata so mixed-baseline audits are

 **MCP server (`bun src/index.ts mcp`)** — separate top-level command that exposes the entire CLI surface to agent hosts (Claude Code, Cursor, Codex, generic MCP clients) as JSON-RPC tools over stdio. Eliminates the bash round-trip on every agent call. Bootstrap once at server boot; tool handlers reuse the existing engine entry-points (`executeQuery`, `runAudit`, etc.) so output shape is verbatim from each tool's CLI counterpart's `--json` envelope.

+**HTTP server (`bun src/index.ts serve [--host 127.0.0.1] [--port 7878] [--token <secret>]`)** — same tool taxonomy as MCP, exposed over `POST /tool/{name}` for non-MCP consumers (CI scripts, simple `curl`, IDE plugins that don't speak MCP). Loopback-default; optional Bearer-token auth. Output shape is the `codemap query --json` envelope (NOT MCP's `{content: [...]}` wrapper); SARIF / annotations payloads ship with `application/sarif+json` / `text/plain` Content-Type. Resources mirrored at `GET /resources/{encoded-uri}`. `GET /health` is auth-exempt; `GET /tools` / `GET /resources` are catalogs. Same `application/tool-handlers.ts` + `resource-handlers.ts` MCP uses — no engine duplication.
+
 **Tools (snake_case keys — Codemap convention matching MCP spec examples + reference servers; spec is convention-agnostic. CLI stays kebab; translation lives at the MCP-arg layer.):**

 - **`query`** — one SQL statement. Args: `{sql, summary?, changed_since?, group_by?, format?}`. Same envelope as `codemap query --json`. Pass `format: "sarif"` or `"annotations"` to receive a formatted text payload (SARIF 2.1.0 doc / `::notice` lines); ad-hoc SQL gets `rule.id = codemap.adhoc`. Format is incompatible with `summary` / `group_by` (parser rejects with a structured `{error}`).
@@ -77,7 +79,7 @@ Each emitted delta carries its own `base` metadata so mixed-baseline audits are
 - **`codemap://schema`** — DDL of every table in `.codemap.db` (queried live from `sqlite_schema`).
 - **`codemap://skill`** — full text of bundled `templates/agents/skills/codemap/SKILL.md`. Agents that don't preload the skill at session start can fetch it here.

-**Implementation:** `src/cli/cmd-mcp.ts` (CLI shell — argv + lifecycle) + `src/application/mcp-server.ts` (engine — tool registry, resource handlers). Mirrors the `cmd-audit.ts ↔ audit-engine.ts` seam. `--changed-since` git lookups are memoised per `(root, ref)` pair across batch items so a `query_batch` of N items sharing the same ref does one git invocation, not N. v1.x backlog: `codemap serve` (HTTP API) reuses the same tool taxonomy + output shape.
+**Implementation:** `src/cli/cmd-mcp.ts` (CLI shell — argv + lifecycle) + `src/application/mcp-server.ts` (transport / SDK glue). Tool bodies live in `src/application/tool-handlers.ts` (pure transport-agnostic — same handlers `codemap serve` dispatches over HTTP); resource fetchers in `src/application/resource-handlers.ts`. Mirrors the `cmd-audit.ts ↔ audit-engine.ts` seam. `--changed-since` git lookups are memoised per `(root, ref)` pair across batch items so a `query_batch` of N items sharing the same ref does one git invocation, not N.

 **Determinism:** Bundled recipes use stable secondary **`ORDER BY`** tie-breakers (and ordered inner **`LIMIT`** samples where applicable). Prefer **`--recipe`** over pasting SQL when you need the maintained ordering. **Canonical SQL** is **`src/cli/query-recipes.ts`** (`QUERY_RECIPES`).
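The per-`(root, ref)` memoisation mentioned in the Implementation paragraph of this file can be illustrated with a small wrapper. This is a sketch under assumptions: the real `makeChangedFilesResolver` signature is not shown on this page, so the `ChangedFilesLookup` type and the injected stand-in for the git call are hypothetical.

```typescript
// Hypothetical sketch; the actual makeChangedFilesResolver signature is not shown in this commit page.
type ChangedFilesLookup = (root: string, ref: string) => string[];

function makeMemoizedResolver(lookup: ChangedFilesLookup): ChangedFilesLookup {
  const cache = new Map<string, string[]>();
  return (root, ref) => {
    // JSON-encode the pair so ("a", "b c") and ("a b", "c") cannot collide.
    const key = JSON.stringify([root, ref]);
    let files = cache.get(key);
    if (files === undefined) {
      files = lookup(root, ref);
      cache.set(key, files);
    }
    return files;
  };
}

// Usage: three batch items share one ref, so the stand-in git lookup runs only once.
let gitCalls = 0;
const resolveChanged = makeMemoizedResolver((_root, _ref) => {
  gitCalls += 1; // stands in for a real git invocation
  return ["src/index.ts"];
});
for (let i = 0; i < 3; i++) resolveChanged("/repo", "origin/main");
console.log(gitCalls);
```

Scoping the cache to one resolver instance, created per batch, would keep results from leaking across unrelated requests; whether the real code scopes it that way is not stated here.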

.changeset/codemap-serve.md

Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
+---
+"@stainless-code/codemap": minor
+---
+
+`codemap serve` — HTTP server exposing the same tool taxonomy as `codemap mcp` over `POST /tool/{name}`. For non-MCP consumers (CI scripts, simple `curl`, IDE plugins that don't speak MCP).
+
+Default bind `127.0.0.1:7878` (loopback only — refuse `0.0.0.0` unless explicitly opted in via `--host 0.0.0.0`). Optional `--token <secret>` requires `Authorization: Bearer <secret>` on every request; `GET /health` is auth-exempt so liveness probes work without leaking the token. Bare `node:http` (no Express / Fastify dep) — runs on Bun + Node.
+
+**Routes:**
+
+- `POST /tool/{name}` — every MCP tool (query, query_recipe, query_batch, audit, context, validate, show, snippet, save_baseline, list_baselines, drop_baseline). Body `{<args>}`; response = same `codemap query --json` envelope (NOT MCP's `{content: [...]}` wrapper). `format: "sarif"` payloads ship as `application/sarif+json`; `format: "annotations"` as `text/plain`.
+- `GET /resources/{encoded-uri}` — mirror of MCP resources (`codemap://recipes`, `codemap://recipes/{id}`, `codemap://schema`, `codemap://skill`).
+- `GET /health` — liveness (auth-exempt); `GET /tools` / `GET /resources` — catalogs.
+- Errors: `{"error": "..."}` with HTTP status 400 / 401 / 404 / 500.
+- Every response carries `X-Codemap-Version: <semver>` so consumers can pin / detect upgrades.
+
+**Internals:** Tool bodies (`application/tool-handlers.ts`) and resource fetchers (`application/resource-handlers.ts`) are pure transport-agnostic — same handlers `codemap mcp` dispatches. No engine duplication; `mcp-server.ts` and `http-server.ts` both wrap the same `ToolResult` discriminated union.
+
+**Security:** CSRF + DNS-rebinding guard rejects requests with `Sec-Fetch-Site: cross-site` / `same-site` (modern-browser CSRF), any `Origin` header that isn't `null` (older-browser CSRF), and `Host` header mismatch on loopback bind (DNS rebinding) — runs on every request including auth-exempt `/health`. Defends against a malicious local webpage `fetch`-ing the API while the developer is browsing. Non-browser clients (curl, MCP hosts, CI scripts) don't send those headers and pass through. SIGINT / SIGTERM → graceful drain. 1 MiB request-body cap (DoS protection). SQLite reader concurrency handles parallel requests; `PRAGMA query_only = 1` set per connection.
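The three header checks in the Security paragraph of the changeset can be sketched as a pure function. This is an illustration under assumptions, not the commit's csrfCheck(): the `GuardInput` shape, the `csrfRejectReason` name, and the exact allowed-Host set are hypothetical. The caller would map any non-null reason to a 403.

```typescript
// Hypothetical sketch of the three checks; names and the allowed-Host set are assumptions.
interface GuardInput {
  headers: Record<string, string | undefined>; // lower-cased header names
  boundHost: string; // value passed to --host
  localPort: number; // real listening port (req.socket.localPort per the commit message)
}

const LOOPBACK_HOSTS = ["127.0.0.1", "localhost", "[::1]"];

function csrfRejectReason(input: GuardInput): string | null {
  const { headers, boundHost, localPort } = input;

  // 1. Browser-initiated cross-site / same-site requests.
  const site = headers["sec-fetch-site"];
  if (site === "cross-site" || site === "same-site") return "Sec-Fetch-Site";

  // 2. DNS rebinding: on a loopback bind the Host header must name loopback.
  //    Skipped for non-loopback binds such as --host 0.0.0.0.
  const boundToLoopback = LOOPBACK_HOSTS.includes(boundHost) || boundHost === "::1";
  if (boundToLoopback) {
    const allowed = LOOPBACK_HOSTS.map((h) => `${h}:${localPort}`);
    const host = headers["host"];
    if (host !== undefined && !allowed.includes(host)) return "Host";
  }

  // 3. Older-browser fallback: any Origin other than the literal "null".
  const origin = headers["origin"];
  if (origin !== undefined && origin !== "null") return "Origin";

  return null; // curl, CI scripts, and MCP hosts normally end up here
}
```

Running the guard before routing and before the token check is what lets it cover the auth-exempt `/health` route as well.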

README.md

Lines changed: 8 additions & 0 deletions
@@ -111,6 +111,14 @@ codemap audit --baseline base --no-index # skip the auto-
 # path / to_path / from_path; rule.id is codemap.<recipe-id> (or codemap.adhoc for ad-hoc).
 codemap query --recipe deprecated-symbols --format sarif > findings.sarif
 codemap query --recipe deprecated-symbols --format annotations # one ::notice per row
+# HTTP API — same tool taxonomy as `codemap mcp`, exposed over POST /tool/{name} for
+# non-MCP consumers (CI scripts, curl, IDE plugins). Loopback default; optional --token.
+TOKEN=$(openssl rand -hex 32)
+codemap serve --port 7878 --token "$TOKEN" &
+curl -s -X POST http://127.0.0.1:7878/tool/query \
+  -H 'Content-Type: application/json' \
+  -H "Authorization: Bearer $TOKEN" \
+  -d '{"sql":"SELECT name, file_path FROM symbols LIMIT 5"}'
 # List bundled recipes as JSON, or print one recipe's SQL (no DB required)
 codemap query --recipes-json
 codemap query --print-sql fan-out
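For consumers that would rather call the API from a script than from curl, the README's request translates to roughly the following client. This is a sketch under assumptions: `buildToolRequest`, `resourceUrl`, and `callTool` are hypothetical helper names, and only the routes, headers, and body shape come from the commit.

```typescript
// Hypothetical client helpers; the routes and headers mirror the README's curl example.
interface ToolRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

function buildToolRequest(baseUrl: string, name: string, args: object, token?: string): ToolRequest {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (token) headers["Authorization"] = `Bearer ${token}`;
  return {
    url: `${baseUrl}/tool/${encodeURIComponent(name)}`,
    init: { method: "POST", headers, body: JSON.stringify(args) },
  };
}

// Resource URIs such as codemap://schema are percent-encoded into a single path segment.
function resourceUrl(baseUrl: string, uri: string): string {
  return `${baseUrl}/resources/${encodeURIComponent(uri)}`;
}

async function callTool(baseUrl: string, name: string, args: object, token?: string): Promise<unknown> {
  const { url, init } = buildToolRequest(baseUrl, name, args, token);
  const res = await fetch(url, init);
  if (!res.ok) throw new Error(`${name} failed with HTTP ${res.status}: ${await res.text()}`);
  // Assumption: the default JSON format was requested, not sarif / annotations.
  return res.json();
}
```

With a server started as in the README, `await callTool("http://127.0.0.1:7878", "query", { sql: "SELECT name, file_path FROM symbols LIMIT 5" }, token)` would issue the same request as the curl command.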
