You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: docs/agents.md
+2Lines changed: 2 additions & 0 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -133,6 +133,8 @@ See [architecture.md § Session lifecycle wiring](./architecture.md#session-life
133
133
134
134
**`context.index_freshness`** — session bootstrap includes index-level freshness metadata: `commit_drift` (HEAD ≠ `last_indexed_commit`), `pending_sync` (watcher debounce queue or in-flight reindex), optional disk-drift counts when watch is off, and a single `warning` string when agents should pause or re-index. **`context.start_here`** (non-compact) adds inline index summary, intent-ranked `query_recipe` cards, and top hub files with export signatures (adaptive caps by file count; optional MCP/HTTP `include_snippets` for one-line previews). Debug intent biases `sample_markers` toward FIXME/TODO. **MCP:** array-shaped JSON tools (`query`, …) keep row payloads verbatim and append a second `content` block prefixed `@codemap/index_freshness`; object-shaped tools merge `index_freshness` inline. **HTTP:** `POST /tool/*` adds `X-Codemap-Pending-Sync`, `X-Codemap-Commit-Drift`, and `X-Codemap-Warning` headers without changing JSON bodies; **`GET /health`** includes full cheap `index_freshness` when the DB is readable. Complements per-file `validate` / snippet `stale`. See [architecture.md § Context wiring](./architecture.md#context-wiring).
135
135
136
+
**MCP ToolAnnotations** — `tools/list` (and HTTP `GET /tools`) expose advisory `readOnlyHint` / `destructiveHint` / `idempotentHint` per tool so clients can gate auto-approval. Read paths (`query`, `show`, `audit`, …) → `readOnlyHint: true`; disk-write apply tools → `destructiveHint: true` (writes still require `yes: true`); index mutators (`save_baseline`, `drop_baseline`, `ingest_coverage`) → `readOnlyHint: false` without `destructiveHint`.
137
+
136
138
**`CODEMAP_MCP_TOOLS`** — comma-separated snake_case MCP tool names. When set, only listed tools register (stderr lists the active set). Unknown names are ignored with a warning. Unset = all tools (default). **`query_batch`** registers only when listed or when unset (eval ablation).
Copy file name to clipboardExpand all lines: docs/architecture.md
+2-2Lines changed: 2 additions & 2 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -194,9 +194,9 @@ Three **mutually exclusive** CLI entry shapes; all converge on `applyDiffPayload
194
194
195
195
**Tool / resource handlers (transport-agnostic):****`src/application/tool-handlers.ts`** + **`src/application/resource-handlers.ts`** — pure functions that take the args object an MCP tool / resource URI accepts and return a discriminated **`ToolResult`** (`{ok: true, format: 'json'|'sarif'|'annotations'|'mermaid'|'diff'|'diff-json', payload}` / `{ok: false, error}`) or a **`ResourcePayload`** (`{mimeType, text}`). MCP and HTTP both wrap the same handlers — MCP translates to `{content: [{type: "text", text}]}`, HTTP translates to `(status, body)` with the right `Content-Type`. Engine layer untouched; transport changes don't ripple into the SQL.
196
196
197
-
**MCP wiring:** **`src/cli/cmd-mcp.ts`** (argv — `--watch` / `--no-watch` / `--debounce` + `--help`; bootstrap absorbs `--root`/`--config`) + **`src/application/mcp-server.ts`** (transport — tool / resource registry, SDK glue). Mirrors the `cmd-audit.ts ↔ audit-engine.ts` seam — CLI parses + lifecycle; engine owns the SDK. **`runMcpServer`** bootstraps codemap once at server boot (config + resolver + DB access become module-level state), instantiates `McpServer` from **`@modelcontextprotocol/sdk`**, attaches a **`StdioServerTransport`**, and resolves on client disconnect via **`src/application/session-lifecycle.ts`** (`createStdioDisconnectMonitor` — stdin EOF, stdout EPIPE, parent-PID poll — plus SDK `transport.onclose` and SIGINT/SIGTERM). With `--watch`, **`createManagedWatchSession`** holds one client for the stdio session and **`forceStop`** drains the watcher on exit. Tool handlers reuse the existing engine entry-points: **`query`** / **`query_recipe`** call **`executeQuery`** in **`src/application/query-engine.ts`** (same `[...rows]` / `{count}` / `{group_by, groups}` envelope `--json` would print) unless **`baseline`** is set — then **`compareQueryBaseline`** in **`src/application/query-baseline.ts`** (incompatible with non-`json` **`format`** / **`group_by`**); **`ingest_coverage`** calls **`runIngestCoverageOnDb`** in **`src/application/ingest-coverage-run.ts`** (CLI twin: `codemap ingest-coverage --json`); **`query_batch`** loops per statement via **`handleQueryBatch`** → **`executeQuery`** (batch-wide defaults + per-item overrides; items are `string | {sql, summary?, changed_since?, group_by?}`); **`audit`** runs `resolveAuditBaselines` + `runAudit` from PR #33 unchanged; **`context`** / **`validate`** call `buildContextEnvelope` / `computeValidateRows` from **`src/application/context-engine.ts`** + **`src/application/validate-engine.ts`** (lifted out of `src/cli/cmd-*.ts` in PR #41 — see § Tool / resource handlers above). **`save_baseline`** is one polymorphic tool (`{name, sql? | recipe?}`) with a runtime exclusivity check — mirrors the CLI's single `--save-baseline=<name>` verb. **Tool naming**: snake_case throughout — Codemap convention matching the patterns in MCP spec examples and reference servers (GitHub MCP, Cursor built-ins); the spec itself doesn't mandate it. CLI stays kebab — translation lives at the MCP-arg layer. **Resources** split by freshness contract: `codemap://schema`, `codemap://skill`, `codemap://rule`, and `codemap://mcp-instructions` use **lazy memoisation** — first `read_resource` populates a per-server-instance cache; constant for the server-process lifetime so eager-vs-lazy produce identical observable behavior. `codemap://recipes`, `codemap://recipes/{id}`, `codemap://files/{+path}`, and `codemap://symbols/{name}` are **live read-per-call** (no cache) so inline recency fields and index mutations under `--watch` don't freeze at first-read. `codemap://schema` queries `sqlite_schema` live (on first read, then cached); `codemap://skill` / `codemap://rule` / `codemap://mcp-instructions` call `assembleAgentContent(kind)` from `application/agent-content.ts`, which concatenates section files under `templates/agent-content/<kind>/` and dispatches `*.gen.md` files through `RENDERERS` (live recipe catalog, live `createTables()` DDL) — see [agents.md § Section assembler](./agents.md#section-assembler-and-genmd). Output shape: each tool returns the JSON payload its CLI counterpart would print (`query batch`, `trace`, `explore`, `node`, `file`, `schema`, `context --include-snippets`, `ingest-coverage`); MCP wraps via `content: [{type: "text", text: JSON.stringify(payload)}]`. `--changed-since` git lookups are memoised per `(root, ref)` pair across batch items so a `query_batch` of N items sharing the same ref does one git invocation, not N. Per-statement errors in `query_batch` are isolated — failed statements return `{error}` in their slot while siblings still execute.
197
+
**MCP wiring:** **`src/cli/cmd-mcp.ts`** (argv — `--watch` / `--no-watch` / `--debounce` + `--help`; bootstrap absorbs `--root`/`--config`) + **`src/application/mcp-server.ts`** (transport — tool / resource registry, SDK glue). Mirrors the `cmd-audit.ts ↔ audit-engine.ts` seam — CLI parses + lifecycle; engine owns the SDK. **`runMcpServer`** bootstraps codemap once at server boot (config + resolver + DB access become module-level state), instantiates `McpServer` from **`@modelcontextprotocol/sdk`**, attaches a **`StdioServerTransport`**, and resolves on client disconnect via **`src/application/session-lifecycle.ts`** (`createStdioDisconnectMonitor` — stdin EOF, stdout EPIPE, parent-PID poll — plus SDK `transport.onclose` and SIGINT/SIGTERM). With `--watch`, **`createManagedWatchSession`** holds one client for the stdio session and **`forceStop`** drains the watcher on exit. Tool handlers reuse the existing engine entry-points: **`query`** / **`query_recipe`** call **`executeQuery`** in **`src/application/query-engine.ts`** (same `[...rows]` / `{count}` / `{group_by, groups}` envelope `--json` would print) unless **`baseline`** is set — then **`compareQueryBaseline`** in **`src/application/query-baseline.ts`** (incompatible with non-`json` **`format`** / **`group_by`**); **`ingest_coverage`** calls **`runIngestCoverageOnDb`** in **`src/application/ingest-coverage-run.ts`** (CLI twin: `codemap ingest-coverage --json`); **`query_batch`** loops per statement via **`handleQueryBatch`** → **`executeQuery`** (batch-wide defaults + per-item overrides; items are `string | {sql, summary?, changed_since?, group_by?}`); **`audit`** runs `resolveAuditBaselines` + `runAudit` from PR #33 unchanged; **`context`** / **`validate`** call `buildContextEnvelope` / `computeValidateRows` from **`src/application/context-engine.ts`** + **`src/application/validate-engine.ts`** (lifted out of `src/cli/cmd-*.ts` in PR #41 — see § Tool / resource handlers above). **`save_baseline`** is one polymorphic tool (`{name, sql? | recipe?}`) with a runtime exclusivity check — mirrors the CLI's single `--save-baseline=<name>` verb. **Tool naming**: snake_case throughout — Codemap convention matching the patterns in MCP spec examples and reference servers (GitHub MCP, Cursor built-ins); the spec itself doesn't mandate it. CLI stays kebab — translation lives at the MCP-arg layer. **Resources** split by freshness contract: `codemap://schema`, `codemap://skill`, `codemap://rule`, and `codemap://mcp-instructions` use **lazy memoisation** — first `read_resource` populates a per-server-instance cache; constant for the server-process lifetime so eager-vs-lazy produce identical observable behavior. `codemap://recipes`, `codemap://recipes/{id}`, `codemap://files/{+path}`, and `codemap://symbols/{name}` are **live read-per-call** (no cache) so inline recency fields and index mutations under `--watch` don't freeze at first-read. `codemap://schema` queries `sqlite_schema` live (on first read, then cached); `codemap://skill` / `codemap://rule` / `codemap://mcp-instructions` call `assembleAgentContent(kind)` from `application/agent-content.ts`, which concatenates section files under `templates/agent-content/<kind>/` and dispatches `*.gen.md` files through `RENDERERS` (live recipe catalog, live `createTables()` DDL) — see [agents.md § Section assembler](./agents.md#section-assembler-and-genmd). Output shape: each tool returns the JSON payload its CLI counterpart would print (`query batch`, `trace`, `explore`, `node`, `file`, `schema`, `context --include-snippets`, `ingest-coverage`); MCP wraps via `content: [{type: "text", text: JSON.stringify(payload)}]`. **`tools/list` ToolAnnotations** — advisory `readOnlyHint` / `destructiveHint` / `idempotentHint` per tool from **`src/application/mcp-tool-annotations.ts`** (central map beside **`mcp-tool-allowlist.ts`**); read paths (`query`, `show`, `audit`, …) → `readOnlyHint: true`; disk-write apply tools → `destructiveHint: true` (writes still require `yes: true`); index user-data mutators (`save_baseline`, `drop_baseline`, `ingest_coverage`) → `readOnlyHint: false` without `destructiveHint`. Omitted when an older `@modelcontextprotocol/sdk` lacks annotation fields (M.6 guard). `--changed-since` git lookups are memoised per `(root, ref)` pair across batch items so a `query_batch` of N items sharing the same ref does one git invocation, not N. Per-statement errors in `query_batch` are isolated — failed statements return `{error}` in their slot while siblings still execute.
198
198
199
-
**HTTP wiring:** **`src/cli/cmd-serve.ts`** (argv — `--host` / `--port` / `--token`; bootstrap absorbs `--root`/`--config`) + **`src/application/http-server.ts`** (transport — bare `node:http`; routes `POST /tool/{name}` to `tool-handlers`, `GET /resources/{encoded-uri}` to `resource-handlers`, plus `GET /health` / `GET /tools` / `GET /resources`). Default bind **`127.0.0.1:7878`** (loopback only — refuse `0.0.0.0` unless explicitly opted in via `--host 0.0.0.0`). Optional **`--token <secret>`** requires `Authorization: Bearer <secret>` on every request; `GET /health` is auth-exempt so liveness probes work without leaking the token. **CSRF + DNS-rebinding guard** (`csrfCheck`) runs before every route — rejects `Sec-Fetch-Site: cross-site` / `same-site` (modern-browser CSRF), any present `Origin` header (including the opaque string `null`; older-browser CSRF fallback), and `Host` header mismatch on loopback bind (DNS rebinding). Non-browser clients (curl, fetch from Node, MCP hosts, CI scripts) don't send those headers and pass through. The guard runs even on `/health` so a malicious local webpage can't probe for liveness. Output shape: HTTP returns each tool's native JSON payload directly (NOT MCP's `{content: [...]}` wrapper — HTTP doesn't need that transport artifact); `query` / `query_recipe` match `codemap query --json` row arrays (or `{count}` / `{group_by,groups}` when `summary` / `group_by` is set, or baseline diff when `baseline` is set — incompatible with non-`json` `format` / `group_by`; save/list/drop remain separate tools); other tools match their CLI `--json` envelopes; `format: "sarif"` payloads ship as `application/sarif+json`, `format: "annotations"` / `"mermaid"` / `"diff"` as `text/plain; charset=utf-8`, `format: "diff-json"` as `application/json; charset=utf-8`, JSON otherwise. Per-request DB lifecycle: open / `PRAGMA query_only = 1` / close per call (SQLite reader concurrency); 1 MiB request-body cap rejects trivial DoS. SIGINT / SIGTERM → graceful drain via `server.close()`. Every response carries **`X-Codemap-Version: <semver>`** so consumers can pin / detect upgrades.
199
+
**HTTP wiring:** **`src/cli/cmd-serve.ts`** (argv — `--host` / `--port` / `--token`; bootstrap absorbs `--root`/`--config`) + **`src/application/http-server.ts`** (transport — bare `node:http`; routes `POST /tool/{name}` to `tool-handlers`, `GET /resources/{encoded-uri}` to `resource-handlers`, plus `GET /health` / `GET /tools` / `GET /resources`). Default bind **`127.0.0.1:7878`** (loopback only — refuse `0.0.0.0` unless explicitly opted in via `--host 0.0.0.0`). Optional **`--token <secret>`** requires `Authorization: Bearer <secret>` on every request; `GET /health` is auth-exempt so liveness probes work without leaking the token. **CSRF + DNS-rebinding guard** (`csrfCheck`) runs before every route — rejects `Sec-Fetch-Site: cross-site` / `same-site` (modern-browser CSRF), any present `Origin` header (including the opaque string `null`; older-browser CSRF fallback), and `Host` header mismatch on loopback bind (DNS rebinding). Non-browser clients (curl, fetch from Node, MCP hosts, CI scripts) don't send those headers and pass through. The guard runs even on `/health` so a malicious local webpage can't probe for liveness. Output shape: HTTP returns each tool's native JSON payload directly (NOT MCP's `{content: [...]}` wrapper — HTTP doesn't need that transport artifact); `query` / `query_recipe` match `codemap query --json` row arrays (or `{count}` / `{group_by,groups}` when `summary` / `group_by` is set, or baseline diff when `baseline` is set — incompatible with non-`json` `format` / `group_by`; save/list/drop remain separate tools); other tools match their CLI `--json` envelopes; `format: "sarif"` payloads ship as `application/sarif+json`, `format: "annotations"` / `"mermaid"` / `"diff"` as `text/plain; charset=utf-8`, `format: "diff-json"` as `application/json; charset=utf-8`, JSON otherwise. Per-request DB lifecycle: open / `PRAGMA query_only = 1` / close per call (SQLite reader concurrency); 1 MiB request-body cap rejects trivial DoS. **`GET /tools`** returns the same advisory hint fields as MCP `tools/list` (`readOnlyHint` / `destructiveHint` / `idempotentHint` per entry via **`buildHttpToolCatalogEntry`**). SIGINT / SIGTERM → graceful drain via `server.close()`. Every response carries **`X-Codemap-Version: <semver>`** so consumers can pin / detect upgrades.
200
200
201
201
**Watch wiring:** **`src/cli/cmd-watch.ts`** (argv — `--debounce <ms>` / `--quiet`; bootstrap absorbs `--root`/`--config`) + **`src/application/watcher.ts`** (engine — pure debouncer + glob filter + injectable backend; production wires [chokidar v5](https://github.com/paulmillr/chokidar) selected via the 6-watcher audit in PR #46 — pure JS, runs identically on Bun + Node, ~30M repos use it). On every change/add/unlink event chokidar emits, the engine filters via `shouldIndexPath` (same indexed extensions as the indexer + project-local recipes; skips `node_modules` / `.git` / `dist`), debounces with a sliding window (default 250 ms), then calls `createReindexOnChange` which opens a DB, runs `runCodemapIndex({mode: 'files', files: [...changed]})`, closes the DB, and logs `reindex N file(s) in Mms` to stderr unless `--quiet`. SIGINT / SIGTERM drains pending edits via `flushNow()` before the watcher closes. **Default-ON for `mcp` / `serve` since 2026-05:** both transports embed the watcher via **`createManagedWatchSession`** in **`session-lifecycle.ts`** — MCP holds one client for the stdio session; HTTP acquires per request (excluding `/health`) and stops the watcher after the last client plus a 5s release grace (not an MCP idle shutdown). Opt out with `--no-watch`, `CODEMAP_WATCH=0`, or `CODEMAP_NO_WATCH=1`. **`src/application/watch-policy.ts`** disables the watcher on WSL2 Windows drive mounts (`/mnt/*`) unless `CODEMAP_FORCE_WATCH=1`; stderr points at `codemap agents init --git-hooks` for git-triggered freshness. Standalone `codemap watch` runs the watcher decoupled from a transport for users wiring it next to a separate MCP / HTTP process. **Audit prelude optimization:** module-level `watchActive` flag; `handleAudit` skips its incremental-index prelude when active (and marks the close as readonly to avoid a wasted checkpoint). Explicit `no_index: false` still forces the prelude.
0 commit comments