
Commit 7460b46

feat(show + snippet): targeted-read CLI verbs + MCP tools (#39)
* docs(plans): draft targeted-read-cli (codemap show) One-step CLI verb for 'where is this symbol' — codemap show <symbol> returns file_path:line_start-line_end + signature. Pure ergonomic affordance over SELECT … FROM symbols WHERE name = ?; no schema change. Plan covers surface (show + --all + --kind + --in flags), wiring (cmd-show.ts + show-engine.ts mirroring cmd-context/cmd-validate), MCP integration via the plan §35 pattern, and a 4-commit tracer-bullet sequence (~half day). 5 open questions worth a grill round before code: MCP tool registration, multiple-match UX (error vs list), exact vs fuzzy matching, file-scope filter, snippet-sibling timing. Status: design pass; not yet implemented. * docs(plans): settle Q-1 — show ships as a dedicated MCP tool Mirrors the every-verb-becomes-a-tool pattern from PR #35. Discoverability win matters for agents that don't know the symbols schema; token savings compound. ~25 LOC registration; reuses the engine helper. * docs(plans): settle Q-2 — always-wrap {matches, disambiguation?} envelope Agent-first reframing: 'error by default' was 2023-era reasoning; today's frontier models reason fine over 2-5 candidates given context. Always-wrap gives a single shape to learn / document / test, plus forward extensibility for future disambiguation aids (nearest_to_cursor, most_recently_modified, caller_count) without breaking the contract. Single match: {matches: [{...}]}. Multi-match: {matches: [...], disambiguation: {n, by_kind, files, hint}}. Agent reads result.matches[0] either way. * docs(plans): settle Q-3 — exact match only; fuzzy stays in query show contract is sharp: 'I know the name → I want to know where it lives.' Agents have the exact name 95% of the time (stack traces, import statements, prior query results). Error message points at query+LIKE for fuzzy so the agent's next move is explicit. Avoids burning a flag on a feature query already does. 
* docs(plans): settle Q-4 — ship --in <path> file-scope filter Closes the loop with the Q-2 disambiguation envelope: agent sees candidate files in disambiguation.files, narrows with --in via parameter add (not tool-switch to query). --kind handles 'function vs const' ambiguity; --in handles 'this folder vs that folder' (the common case). ~5 LOC. Match rule: prefix if ends with / or names a directory, else exact file. * docs(plans): expand to show + snippet, settle Q-5, open Q-6, fold fact-check refinements After fact-checking against the refreshed codemap index, snippet's marginal cost is smaller than initially framed: - findSymbolsByName (Q-1 helper) is shared with show — free reuse - readFileSync + toProjectRelative + hashContent + files.content_hash IS the literal pattern cmd-validate.ts already uses for stale detection — pure copy-paste - ~2-3 hours marginal cost on top of show; splitting into a follow-up PR would duplicate docs / changeset / Rule-10 mirror overhead Q-5 settled: ship snippet alongside show in v1. Output is {matches: [{...metadata, source, stale?}]} — additive on Q-2's envelope, no shape divergence. Q-2 updated: explicit requirement that BOTH the CLI's --json mode AND the MCP tool wrap in {matches, disambiguation?} — required to preserve plan §4 uniformity (CLI prints array AND MCP returns envelope = uniformity broken). Q-4 updated: --in <path> normalization via existing toProjectRelative(projectRoot, p) helper (verified — already handles leading ./, trailing /, Windows backslash → POSIX). No reinventing. Q-6 opened: stale-file behavior for snippet — read+flag (1) vs refuse (2) vs auto-reindex (3). Bias toward (1) per agent-first lens (no hostile round-trip, no hidden side-effects). Tracer-bullet sequence expanded from 4 → 6 commits (~1 day total). 
Non-goals updated: snippet no longer deferred; --with-source flag explicitly rejected per Q-5; auto-reindex on stale explicitly rejected pending Q-6 confirmation; glob characters in --in explicitly out of scope. * docs(plans): settle Q-6 — read + flag stale snippets Agent-first: gives data + structured warning; preserves agent autonomy (e.g. 'I want stale to compare with what changed'). Refuse + auto-reindex both rejected — refuse forces 3 round-trips for content already on disk; auto-reindex hides side-effects from a read tool and breaks the read/write separation we kept clean across PRs #33 / #35 / #37. All 6 grill questions now settled — ready for tracer 1. * feat(show): show-engine.ts findSymbolsByName + tests (Tracer 1 of 6) Pure transport-agnostic lookup engine — same shape audit-engine.ts / query-engine.ts use (PRs #33 / #35). findSymbolsByName({db, name, kind?, inPath?}) returns SymbolMatch[] with deterministic order (file_path ASC, line_start ASC) so callers slice for stable disambiguation output. Per Q-3 settled: name match is case-sensitive (exact). Per Q-4 settled: inPath uses a directory-vs-file heuristic — trailing slash OR no extension in trailing segment treats as prefix (LIKE 'src/cli/%'); else exact file match (file_path = ?). Caller normalizes via toProjectRelative before passing. 12 unit tests cover: single match, unknown name, ambiguous (3-match deterministic order), kind filter narrowing, inPath as directory (no slash + with slash), inPath as file (exact + miss), kind+inPath compose AND, returned columns, case-sensitivity. Reuses the symbols table directly. No schema change. Tracer 2 wires the CLI verb on top. * feat(show): codemap show <name> CLI verb (Tracer 2 of 6) Implements the show CLI verb per the settled grill round: - parseShowRest — argv parser supporting <name> + --kind + --in + --json (+ --help / -h). Errors on missing name, extra positional, unknown flags, and missing flag values. 
- buildShowResult — wraps engine output in the {matches, disambiguation?} envelope (Q-2 settled). Single-match → {matches}; multi-match adds n / by_kind / files / hint structured aids. - runShowCmd — bootstraps codemap, normalizes --in via toProjectRelative (Q-4), runs findSymbolsByName, renders. JSON mode prints the envelope verbatim; terminal mode prints path:line-line + signature per row + a stderr disambiguation hint on multi-match. - Error UX (Q-3): unknown name → routed-error message pointing at `codemap query --json "SELECT … LIKE '%name%'"` so the agent's next step is explicit. Wired into main.ts dispatch + bootstrap.ts validateIndexModeArgs known-verbs list + help text. toProjectRelative exported from cmd-validate.ts (was private). 13 unit tests cover parser (help/missing/extra/unknown-flag/--kind/--in/order-independence/throws-if-not-show) + buildShowResult envelope (single / zero / multi / file dedup). Smoke tested: show runQueryCmd / --json / --in / unknown-name all behave per spec. * feat(show): readSymbolSource + getIndexedContentHash with stale detection (Tracer 3 of 6) Adds the snippet-side engine helpers per Q-5 (ship snippet alongside show) + Q-6 (read + flag stale, never refuse + never auto-reindex): - readSymbolSource({match, projectRoot, indexedContentHash?}) returns {source, stale, missing}. Reuses readFileSync + hashContent + the same FS pattern cmd-validate.ts uses (verified during fact-check). Line slicing is 1-indexed inclusive matching symbols.line_start/line_end. Clamps line_end past EOF instead of throwing. - getIndexedContentHash(db, filePath) — convenience helper for the same SELECT cmd-validate.ts uses. Stale semantics (Q-6): source is ALWAYS returned when the file exists; stale: true is just a metadata flag the agent reads. Missing file → {source: undefined, stale: true, missing: true}. indexedContentHash undefined → never marks stale (caller opts out of staleness checks). 
7 new unit tests cover line slicing happy path, missing file, hash-match (stale: false), hash-mismatch (stale: true + source still returned), EOF clamping, opt-out via undefined hash, and getIndexedContentHash lookup. Total now 19 pass on show-engine. Tracer 4 next: cmd-snippet.ts CLI verb on top of these helpers. * feat(snippet): codemap snippet <name> CLI verb (Tracer 4 of 6) Sibling to show: same lookup contract (name + kind + in + json) but returns source text from disk per match. Output envelope: {matches: [{...metadata, source, stale, missing}], disambiguation?: {...}} — additive on Q-2's envelope (one source/stale/missing field per row, never a shape divergence). - parseSnippetRest mirrors parseShowRest's parser (same flags, same errors). - buildSnippetResult enriches each SymbolMatch with source/stale/missing via getIndexedContentHash + readSymbolSource (Tracer 3 helpers). Per Q-6: source ALWAYS returned when file exists; stale/missing are pure metadata flags the agent reads. - runSnippetCmd mirrors runShowCmd's bootstrap + lookup + render. Terminal mode prints path:line-line[STALE/MISSING flags] + source; --json mode emits the envelope verbatim. Stderr hint when any row is stale points at codemap / codemap --files <path> for refresh. Wired into main.ts dispatch + bootstrap.ts known-verbs + help text. 11 unit tests cover parser (help/missing/extra/unknown/--kind/--in/order/throws-not-snippet) + buildSnippetResult (single match w/ source, stale flag on hash drift, missing flag on rm'd file, multi-match disambiguation envelope). Smoke tested: bun src/index.ts snippet runQueryCmd --json returns the function source + metadata + stale: false. * feat(mcp): show + snippet MCP tools (Tracer 5 of 6) Wires the show + snippet CLI verbs as MCP tools per Q-1 settled. 
Both follow the established cmd-* ↔ register*Tool pattern from PR #35; both reuse the same engine helpers (findSymbolsByName, buildShowResult, buildSnippetResult) so output shape is verbatim from each tool's CLI counterpart's --json envelope. - registerShowTool — args {name, kind?, in?}, returns the {matches, disambiguation?} envelope. Tool description teaches: 'Use snippet for source text; use query with LIKE for fuzzy lookup' so agents know when to reach for which tool. - registerSnippetTool — args {name, kind?, in?}, returns the same envelope with source/stale/missing on each match. Description spells out the stale semantics (read + flag, agent decides) since that's the one non-obvious bit. Both tools route the in arg through toProjectRelative(opts.root, args.in) so MCP callers get the same path-shape leniency as the CLI (--in ./src/cli/, --in src/cli, --in src/cli/cmd-show.ts all work identically). 8 new in-process MCP tests via @modelcontextprotocol/sdk's InMemoryTransport: tools/list lists both, single-match envelope, multi-match disambiguation, in-filter narrows, unknown-name returns empty, snippet source on fresh file (stale: false), stale flag on hash drift, missing flag on rm'd file. Total now 38 MCP tests pass. * docs(show + snippet): architecture / glossary / README / agent rule + skill (Tracer 6 of 6) Lifts canonical bits out of docs/plans/targeted-read-cli.md per docs/README.md Rule 2 (delete plans on ship). Surfaces touched: - architecture.md § CLI usage gains a 'Show / snippet wiring' paragraph documenting the cmd-show ↔ cmd-snippet ↔ show-engine seam, the {matches, disambiguation?} envelope, the toProjectRelative + hashContent primitive reuse from cmd-validate.ts, and the stale-file behavior (read + flag, no auto-reindex). - glossary.md § S: new entries 'show' and 'snippet' with disambiguation envelope reference + cross-link to architecture.md. - roadmap.md: removed the targeted-read-cli backlog entry (now shipped). 
- README.md CLI block: added show + snippet examples covering the metadata vs source-text distinction and the disambiguation envelope shape. - .agents/rules/codemap.md + templates/agents/rules/codemap.md (mirrored per Rule 10): added two CLI table rows (Targeted read metadata, Targeted read source text) + a 'Targeted reads' section documenting the envelope, --kind / --in flags, exact-match semantics, and snippet stale-file behavior. - .agents/skills/codemap/SKILL.md + templates/agents/skills/codemap/SKILL.md (mirrored): MCP tools list extended with show + snippet entries describing args, envelope shape, and stale semantics. Tools list in agent rule extended too. - docs/plans/targeted-read-cli.md DELETED (Rule 2 — plan content fully lifted into architecture / glossary / agent files). - Minor changeset added (additive features, no schema breaks). * chore(security): defence-in-depth fixes from PR self-audit Three small hygiene fixes from the security audit on PR #39: 1. agents-init.ts relPathToAbsSegments — now rejects '..' and '.' segments instead of just filtering empty strings. Defence in depth: today's callers source rel from listRegularFilesRecursive (package-controlled, never produces '..'), but a future caller passing user-provided relative paths would otherwise allow join(destRoot, '..', 'etc', 'passwd') to write outside destRoot. Throws loud instead of silently writing somewhere unexpected. 5 new unit tests cover happy path, empty-segment filter, '..' at start, '..' in middle, and '.' rejection. 2. cmd-show.ts + cmd-snippet.ts unknown-name error — escapes single-quotes (SQLite '' convention) before embedding the user-provided name into the suggested SQL hint. No execution risk (the message is just text), but the previous version emitted SQL like LIKE '%'; DROP TABLE symbols; --%' which looks injection-y in agent traces and breaks if the agent copy-pastes the hint. Now safe for names like O'Brien. 3. 
.github/workflows/ci.yml — added an audit job running 'bun audit' on every PR. Marked continue-on-error: true (non-blocking) so transient registry issues or low-severity transitive CVEs don't gate merges. Promote to a hard gate once the team agrees on a vulnerability budget. Verified bun audit works locally + reports zero vulnerabilities today. All three are tiny, additive, and follow defence-in-depth rather than fixing live exploits — the original audit found no exploitable vulnerabilities in the codebase. * fix(show): escape SQL LIKE wildcards in --in path (PR #39 CodeRabbit feedback, Major) Real bug verified against actual SQLite semantics: when --in src/__tests__ became LIKE 'src/__tests__/%', the underscores matched ANY single char so the query also matched src/aatestsZZ/foo.ts. Underscores are ubiquitous in TS layouts (__tests__, __mocks__, _utils, _helpers). Fix: new escapeLikeLiteral helper escapes _, %, and \ (the escape char itself); the LIKE clause now uses ESCAPE '\'. Trailing % we append stays an unescaped wildcard. Symmetric handling so paths with literal '%' (rare but possible in OS file names) also match exactly. Tests: 1 integration test seeds both src/__tests__/setup.ts and a same-shape decoy src/aatestsZZ/decoy.ts; --in src/__tests__ now returns only the real one. 4 unit tests cover the escape helper (underscore, percent, backslash, identity).
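
The two escaping fixes and the `--in` directory-vs-file heuristic described in the commit message can be sketched as below. This is a minimal illustration, not the code from `show-engine.ts`: `escapeLikeLiteral` is named in the message, but its body here is an assumption, and `inPathCondition` and `escapeSqlSingleQuotes` are hypothetical names introduced for this example.

```typescript
// Sketch only: helper bodies are assumptions based on the commit message.

// Escape the LIKE wildcards (_ and %) plus the escape character itself,
// so a path such as src/__tests__ is matched literally.
function escapeLikeLiteral(s: string): string {
  return s.replace(/[\\_%]/g, (ch) => "\\" + ch);
}

// Hypothetical helper: build the WHERE fragment for an --in path.
// Trailing slash OR no extension in the last segment => directory prefix;
// otherwise an exact file match.
function inPathCondition(inPath: string): { sql: string; param: string } {
  const trimmed = inPath.replace(/\/+$/, "");
  const lastSegment = trimmed.split("/").pop() ?? "";
  const isDirectory = inPath.endsWith("/") || !lastSegment.includes(".");
  if (!isDirectory) {
    return { sql: "file_path = ?", param: inPath };
  }
  // The appended "/%" stays an unescaped wildcard; everything before it is literal.
  return {
    sql: "file_path LIKE ? ESCAPE '\\'",
    param: escapeLikeLiteral(trimmed) + "/%",
  };
}

// Hypothetical helper: double single quotes (SQLite convention) before
// embedding a user-provided name in a suggested SQL hint.
function escapeSqlSingleQuotes(name: string): string {
  return name.replace(/'/g, "''");
}
```

With this sketch, `--in src/__tests__` binds the pattern `src/\_\_tests\_\_/%`, so a decoy directory such as `src/aatestsZZ` no longer matches.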
1 parent abee731 commit 7460b46

23 files changed

Lines changed: 1824 additions & 7 deletions

.agents/rules/codemap.md

Lines changed: 5 additions & 1 deletion
@@ -26,6 +26,8 @@ A local database (default **`.codemap.db`**) indexes structure: symbols, imports
 | List / drop baselines || `bun src/index.ts query --baselines` · `bun src/index.ts query --drop-baseline <name>` |
 | Per-delta audit || `bun src/index.ts audit --json --baseline base` (auto-resolves `base-files` / `base-dependencies` / `base-deprecated`) |
 | MCP server (for agent hosts) || `bun src/index.ts mcp` — JSON-RPC on stdio; one tool per CLI verb. See **MCP** section below. |
+| Targeted read (metadata) || `bun src/index.ts show <name> [--kind <k>] [--in <path>] [--json]` — file:line + signature |
+| Targeted read (source text) || `bun src/index.ts snippet <name> [--kind <k>] [--in <path>] [--json]` — same lookup + source from disk + stale flag |

 **Recipe `actions`:** with **`--json`**, recipes that define an `actions` template append it to every row (kebab-case verb + description — e.g. `fan-out``review-coupling`). Under `--baseline`, actions attach to the **`added`** rows only. Inspect via **`--recipes-json`**. Ad-hoc SQL never carries actions.

@@ -48,9 +50,11 @@ Validation: SQL is rejected at load time if it starts with DML/DDL (DELETE/DROP/

 **Audit (`bun src/index.ts audit`)**: structural-drift command; emits `{head, deltas: {files, dependencies, deprecated}}` (each delta carries its own `base` metadata). Reuses B.6 baselines as the snapshot source. Two CLI shapes — `--baseline <prefix>` auto-resolves `<prefix>-files` / `<prefix>-dependencies` / `<prefix>-deprecated`; `--<delta>-baseline <name>` is the explicit per-delta override. v1 ships no `verdict` / threshold config — consumers compose `--json` + `jq` for CI exit codes. Auto-runs an incremental index before the diff (use `--no-index` to skip for frozen-DB CI).

+**Targeted reads (`show` / `snippet`)**: precise lookup by exact symbol name without composing SQL. `show` returns metadata (`file_path:line_start-line_end` + `signature`); `snippet` returns the source text from disk plus `stale` / `missing` flags. Both share the same flag set (`--kind <k>` to filter by `symbols.kind`, `--in <path>` for file-scope filter — directory prefix or exact file). Output envelope is `{matches, disambiguation?}` — single match → `{matches: [{...}]}`; multi-match adds `disambiguation: {n, by_kind, files, hint}` so agents narrow without re-scanning. Name match is exact / case-sensitive — for fuzzy use `query` with `LIKE '%name%'`. Snippet stale-file behavior: `source` is always returned when the file exists; `stale: true` means the line range may have shifted (re-index with `bun src/index.ts` or `--files <path>` before acting on the source).
+
 **MCP server (`bun src/index.ts mcp`)**: stdio MCP (Model Context Protocol) server — agents call codemap as JSON-RPC tools instead of shelling out to the CLI on every read. v1 ships one tool per CLI verb plus four lazy-cached resources:

-- **Tools:** `query` / `query_batch` / `query_recipe` / `audit` / `save_baseline` / `list_baselines` / `drop_baseline` / `context` / `validate`. Snake_case keys (Codemap convention matching MCP spec examples + reference servers — spec is convention-agnostic; CLI stays kebab).
+- **Tools:** `query` / `query_batch` / `query_recipe` / `audit` / `save_baseline` / `list_baselines` / `drop_baseline` / `context` / `validate` / `show` / `snippet`. Snake_case keys (Codemap convention matching MCP spec examples + reference servers — spec is convention-agnostic; CLI stays kebab).
 - **`query_batch` (MCP-only):** N statements in one round-trip. Items are `string | {sql, summary?, changed_since?, group_by?}` — string form inherits batch-wide flag defaults, object form overrides on a per-key basis. Per-statement errors are isolated.
 - **`save_baseline` (polymorphic):** one tool, `{name, sql? | recipe?}` with runtime exclusivity check (mirrors the CLI's single `--save-baseline=<name>` verb).
 - **Resources:** `codemap://recipes` (catalog), `codemap://recipes/{id}` (one recipe), `codemap://schema` (live DDL from `sqlite_schema`), `codemap://skill` (bundled SKILL.md text). Lazy-cached on first `read_resource`.
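
The `{matches, disambiguation?}` envelope described in the "Targeted reads" paragraph can be sketched roughly as follows. This is an illustration under stated assumptions, not the shipped `buildShowResult`: the function name `buildShowEnvelope`, the `SymbolMatch` field list, the shape of `by_kind` (a count per kind), and the hint wording are all hypothetical.

```typescript
// Sketch only: field names follow the envelope described in the diff;
// by_kind as "count per kind" and the hint text are assumptions.
interface SymbolMatch {
  name: string;
  kind: string;
  file_path: string;
  line_start: number;
  line_end: number;
  signature: string;
}

interface Disambiguation {
  n: number;
  by_kind: Record<string, number>;
  files: string[];
  hint: string;
}

interface ShowEnvelope {
  matches: SymbolMatch[];
  disambiguation?: Disambiguation;
}

// Hypothetical builder: always wrap; add disambiguation only on multi-match.
function buildShowEnvelope(matches: SymbolMatch[]): ShowEnvelope {
  if (matches.length <= 1) {
    return { matches };
  }
  const byKind: Record<string, number> = {};
  for (const m of matches) {
    byKind[m.kind] = (byKind[m.kind] ?? 0) + 1;
  }
  // Deduplicate file paths while keeping first-seen order.
  const files = Array.from(new Set(matches.map((m) => m.file_path)));
  return {
    matches,
    disambiguation: {
      n: matches.length,
      by_kind: byKind,
      files,
      hint: "Narrow with --kind <kind> or --in <path>.",
    },
  };
}
```

A caller can read `result.matches[0]` in both the single-match and multi-match cases, which is the point of always wrapping.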

.agents/skills/codemap/SKILL.md

Lines changed: 2 additions & 0 deletions
@@ -67,6 +67,8 @@ Each emitted delta carries its own `base` metadata so mixed-baseline audits are
 - **`drop_baseline`**`{name}`. Returns `{dropped: <name>}` on success or `isError` if the name doesn't exist.
 - **`context`**`{compact?, intent?}`. Returns the project-bootstrap envelope (codemap version, schema version, file count, language breakdown, hubs, sample markers). Designed for agent session-start — one call replaces 4-5 `query` calls.
 - **`validate`**`{paths?: string[]}`. Compares on-disk SHA-256 to indexed `files.content_hash`; empty `paths` validates everything. Returns rows with status (`ok`/`stale`/`missing`/`unindexed`).
+- **`show`**`{name, kind?, in?}`. Exact, case-sensitive symbol name lookup. Returns `{matches: [{name, kind, file_path, line_start, line_end, signature, ...}], disambiguation?: {n, by_kind, files, hint}}`. Single match → `{matches: [{...}]}`; multi-match adds the disambiguation envelope so you narrow without re-scanning. Fuzzy lookup belongs in `query` with `LIKE`.
+- **`snippet`**`{name, kind?, in?}`. Same lookup as `show` but each match also carries `source` (file lines from disk at `line_start..line_end`), `stale` (true when content_hash drifted since indexing — line range may have shifted), `missing` (true when file is gone). Per Q-6 (settled): `source` is always returned when the file exists; agent decides whether to act on stale content or run `codemap` / `codemap --files <path>` to re-index first. No auto-reindex side-effects from this read tool.

 **Resources (lazy-cached on first `read_resource`; constant for server-process lifetime):**
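
The `snippet` stale semantics above (read and flag, never refuse) can be illustrated with a small pure function. This is a sketch, not `readSymbolSource` itself: `readSourceSketch` is a hypothetical name, the file content and hashes are passed in directly instead of being read from disk, and the parameter layout is an assumption.

```typescript
// Sketch only: a pure stand-in for the snippet read path.
interface SourceRead {
  source: string | undefined;
  stale: boolean;
  missing: boolean;
}

function readSourceSketch(
  content: string | undefined, // undefined models a file that is gone
  lineStart: number, // 1-indexed, inclusive
  lineEnd: number, // 1-indexed, inclusive
  onDiskHash: string, // hash of the current file content
  indexedHash?: string, // files.content_hash; undefined opts out of the check
): SourceRead {
  if (content === undefined) {
    return { source: undefined, stale: true, missing: true };
  }
  const lines = content.split("\n");
  // Clamp a line_end past EOF instead of throwing.
  const end = Math.min(lineEnd, lines.length);
  const source = lines.slice(lineStart - 1, end).join("\n");
  // Source is always returned; stale is only a metadata flag.
  const stale = indexedHash !== undefined && indexedHash !== onDiskHash;
  return { source, stale, missing: false };
}
```

The design choice shown here is that a hash mismatch never suppresses `source`; the caller decides whether to re-index before acting on it.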

.changeset/targeted-read-cli.md

Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
+---
+"@stainless-code/codemap": minor
+---
+
+feat(show + snippet): targeted-read CLI verbs + MCP tools
+
+Two sibling verbs that close the "agent wants to read this thing" loop
+without composing SQL:
+
+- **`codemap show <name>`** — returns metadata
+(`file_path:line_start-line_end` + `signature` + `kind`) for the
+symbol(s) matching the exact name (case-sensitive).
+- **`codemap snippet <name>`** — same lookup; each match also carries
+`source` (file lines from disk), `stale` (true when content_hash
+drifted since indexing), `missing` (true when file is gone).
+
+Both share the same flag set (`--kind <k>` filter, `--in <path>` file
+scope — directory prefix or exact file, normalized via the existing
+`toProjectRelative` helper for cross-platform consistency).
+
+Output is the agent-friendly `{matches, disambiguation?}` envelope on
+both CLI `--json` and MCP responses (uniformity contract per the MCP
+plan). Single match → `{matches: [{...}]}`; multi-match adds
+`disambiguation: {n, by_kind, files, hint}` — structured aids so the
+agent narrows without scanning every row. Forward-extensible (future
+`nearest_to_cursor` / `most_recently_modified` / `caller_count` fields
+land as additive keys).
+
+MCP tools `show` and `snippet` register parallel to the CLI verbs and
+auto-inherit the same envelope shape.
+
+Stale-file behavior on snippet: `source` is always returned when the
+file exists; `stale: true` is metadata the agent reads. No refusal,
+no auto-reindex side-effects — read tool stays read-only.
+
+Architecturally: pure transport-agnostic engine in
+`src/application/show-engine.ts` (mirrors the cmd-__-engine seam
+from PRs #33 / #35 / #37); thin CLI verbs in `src/cli/cmd-show.ts`
+
+- `src/cli/cmd-snippet.ts`. Reuses `findSymbolsByName`, `hashContent`
+(from `src/hash.ts`), `toProjectRelative` (now exported from
+`cmd-validate.ts`), and `files.content_hash` — same primitives the
+existing `validate` command already uses for stale detection. No
+schema change.
+
+Test coverage: 19 engine tests (lookup variants, line slicing, stale
+detection, missing files), 13 cmd-show parser/envelope tests, 11
+cmd-snippet parser/envelope/stale tests, 8 in-process MCP integration
+tests via `@modelcontextprotocol/sdk`'s `InMemoryTransport`.

.github/workflows/ci.yml

Lines changed: 18 additions & 0 deletions
@@ -141,6 +141,24 @@ jobs:
           bun run dev --full
           bun run benchmark

+  audit:
+    # Non-blocking — visibility into transitive-dep CVEs without gating PRs.
+    # Promote to a hard gate once the team agrees on a vulnerability budget.
+    name: 🛡 Audit (non-blocking)
+    needs: skip-ci
+    if: needs['skip-ci'].outputs.skip != 'true'
+    runs-on: ubuntu-latest
+    continue-on-error: true
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+
+      - name: Setup
+        uses: ./.github/actions/setup
+
+      - name: bun audit
+        run: bun audit
+
   ci-complete:
     name: CI complete
     needs: [skip-ci, format, lint, typecheck, test, build, benchmark]

README.md

Lines changed: 8 additions & 0 deletions
@@ -118,6 +118,14 @@ echo "SELECT path FROM files WHERE language IN ('ts', 'tsx') AND line_count > 50
 > .codemap/recipes/big-ts-files.sql
 codemap query --recipe big-ts-files # auto-discovered alongside bundled

+# Targeted reads — precise lookup by symbol name without composing SQL
+codemap show runQueryCmd # metadata: file:line + signature
+codemap show foo --kind function --in src/cli # narrow ambiguous matches
+codemap snippet runQueryCmd # same lookup + source text from disk
+codemap snippet foo --json # {matches: [{...metadata, source, stale, missing}]}
+# Output envelope is always {matches, disambiguation?} — single match → {matches: [{...}]};
+# multi-match adds disambiguation: {n, by_kind, files, hint} for agent-friendly narrowing.
+
 # MCP server (Model Context Protocol) — for agent hosts (Claude Code, Cursor, Codex, generic MCP clients)
 codemap mcp # JSON-RPC on stdio; one tool per CLI verb plus query_batch
 # Tools: query, query_batch (MCP-only — N statements in one round-trip), query_recipe, audit,
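
The flag surface shown in the README lines above (`<name>`, `--kind`, `--in`, `--json`) could be parsed along these lines. This is a sketch under assumptions, not the repository's `parseShowRest`: the name `parseShowArgsSketch`, the return shape, and the error wording are hypothetical, and `--help` handling is left out.

```typescript
// Sketch only: a hypothetical argv parser for the show / snippet flag set.
interface ShowArgs {
  name: string;
  kind: string | undefined;
  inPath: string | undefined;
  json: boolean;
}

function parseShowArgsSketch(rest: string[]): ShowArgs {
  let name: string | undefined;
  let kind: string | undefined;
  let inPath: string | undefined;
  let json = false;

  for (let i = 0; i < rest.length; i++) {
    const arg = rest[i];
    if (arg === undefined) continue;
    if (arg === "--json") {
      json = true;
    } else if (arg === "--kind" || arg === "--in") {
      // Both flags take exactly one value.
      const value = rest[i + 1];
      if (value === undefined || value.startsWith("--")) {
        throw new Error(`missing value for ${arg}`);
      }
      if (arg === "--kind") kind = value;
      else inPath = value;
      i++; // skip the consumed value
    } else if (arg.startsWith("-")) {
      throw new Error(`unknown flag: ${arg}`);
    } else if (name !== undefined) {
      throw new Error(`unexpected extra argument: ${arg}`);
    } else {
      name = arg;
    }
  }

  if (name === undefined) {
    throw new Error("missing <name>");
  }
  return { name, kind, inPath, json };
}
```

Flags are order-independent in this sketch, so `foo --json --in src/cli` and `--in src/cli foo --json` parse to the same result.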

docs/architecture.md

Lines changed: 2 additions & 0 deletions
@@ -125,6 +125,8 @@ A local SQLite database (`.codemap.db`) indexes the project tree and stores stru

 **Context wiring:** **`src/cli/cmd-context.ts`****`buildContextEnvelope`** composes the JSON envelope from existing recipes (`fan-in` for `hubs`, `markers` SELECT for `sample_markers`, `QUERY_RECIPES` map for the catalog). **`classifyIntent`** maps `--for "<text>"` to one of `refactor | debug | test | feature | explore | other` via regex against the trimmed input; whitespace-only intents are rejected. `--compact` drops `hubs` + `sample_markers` and emits one-line JSON; otherwise pretty-prints with 2-space indent.

+**Show / snippet wiring:** **`src/cli/cmd-show.ts`** + **`src/cli/cmd-snippet.ts`** — sibling CLI verbs sharing the same parser shape (`<name>` + `--kind` + `--in <path>` + `--json`) and the pure engine **`src/application/show-engine.ts`** (`findSymbolsByName({db, name, kind?, inPath?})` for the lookup; `readSymbolSource({match, projectRoot, indexedContentHash?})` + `getIndexedContentHash(db, filePath)` for the snippet-side FS read). Both verbs return the same `{matches, disambiguation?}` envelope per plan § 4 uniformity — single match → `{matches: [{...}]}`; multi-match adds `{n, by_kind, files, hint}`. Snippet matches add `source` / `stale` / `missing` fields (additive — no shape divergence). **`--in <path>`** is normalized through `toProjectRelative(projectRoot, p)` (exported from **`src/cli/cmd-validate.ts`**) so `--in ./src/cli/`, `--in src/cli`, and `--in src/cli/cmd-show.ts` all resolve identically. Stale-file behavior on `snippet`: `hashContent` (from **`src/hash.ts`** — same primitive `cmd-validate.ts` uses) compares the on-disk content_hash against `files.content_hash`; mismatch sets `stale: true` but the source IS still returned (read tool, no auto-reindex side-effects). MCP tools `show` and `snippet` register parallel to the CLI surface (see [§ MCP wiring](#cli-usage)).
+
**Recipes wiring:** **`src/application/recipes-loader.ts`** (pure transport-agnostic loader) + **`src/cli/query-recipes.ts`** (shim — caches the loader output, exposes `getQueryRecipeSql` / `getQueryRecipeActions` / `listQueryRecipeIds` / `listQueryRecipeCatalog` / `getQueryRecipeCatalogEntry`). Recipes live as file pairs: **`<id>.sql`** + optional **`<id>.md`**. The loader reads `templates/recipes/` (bundled, ships in npm package next to `templates/agents/`) and `<projectRoot>/.codemap/recipes/` (project-local — root-only resolution per the registry plan, no walk-up). Project recipes win on id collision; entries that override a bundled id carry **`shadows: true`** in the catalog so agents reading `codemap://recipes` at session start see when a recipe behaves differently from the documented bundled version. Per-row **`actions`** templates (kebab-case verb + description) live in YAML frontmatter on each `<id>.md` — uniform shape across bundled + project. Hand-rolled YAML parser scoped to `actions: [{type, auto_fixable?, description?}]` only (no `js-yaml` dep). Load-time validation rejects empty SQL and DML / DDL keywords (`INSERT` / `UPDATE` / `DELETE` / `DROP` / `CREATE` / `ALTER` / `ATTACH` / `DETACH` / `REPLACE` / `TRUNCATE` / `VACUUM` / `PRAGMA`) with recipe-aware error messages — defence in depth alongside the runtime `PRAGMA query_only=1` backstop in `query-engine.ts` (PR #35). `.codemap.db` is gitignored; `.codemap/recipes/` is NOT (verified via `git check-ignore`) — recipes are git-tracked source code authored for human review.

**MCP wiring:** **`src/cli/cmd-mcp.ts`** (argv — `--help` only; bootstrap absorbs `--root`/`--config`) + **`src/application/mcp-server.ts`** (engine — tool registry, resource handlers, response composition). Mirrors the `cmd-audit.ts ↔ audit-engine.ts` seam — CLI parses + lifecycle; engine owns the SDK. **`runMcpServer`** bootstraps codemap once at server boot (config + resolver + DB access become module-level state), instantiates `McpServer` from **`@modelcontextprotocol/sdk`**, attaches a **`StdioServerTransport`**, and resolves when stdin closes (clean shutdown). Tool handlers reuse the existing engine entry-points: **`query`** + **`query_recipe`** call **`executeQuery`** in **`src/application/query-engine.ts`** (a pure transport-agnostic engine extracted from `printQueryResult`'s JSON branch — same `[...rows]` / `{count}` / `{group_by, groups}` envelope `--json` would print); **`query_batch`** loops via **`executeQueryBatch`** with batch-wide-defaults + per-statement-overrides (items are `string | {sql, summary?, changed_since?, group_by?}`); **`audit`** runs `resolveAuditBaselines` + `runAudit` from PR #33 unchanged; **`context`** / **`validate`** call `buildContextEnvelope` / `computeValidateRows` (pure functions in `src/cli/cmd-*.ts` — same layer-reversal allowance as `query-recipes`). **`save_baseline`** is one polymorphic tool (`{name, sql? | recipe?}`) with a runtime exclusivity check — mirrors the CLI's single `--save-baseline=<name>` verb. **Tool naming**: snake_case throughout — Codemap convention matching the patterns in MCP spec examples and reference servers (GitHub MCP, Cursor built-ins); the spec itself doesn't mandate it. CLI stays kebab — translation lives at the MCP-arg layer. 
**Resources** (`codemap://recipes`, `codemap://recipes/{id}`, `codemap://schema`, `codemap://skill`) use **lazy memoisation** — first `read_resource` populates a per-server-instance cache; constant for the server-process lifetime so eager-vs-lazy produce identical observable behavior. `codemap://schema` queries `sqlite_schema` live; `codemap://skill` reads from `resolveAgentsTemplateDir() + skills/codemap/SKILL.md`. Output shape uniformity (plan § 4): every tool returns the JSON envelope its CLI counterpart's `--json` flag prints, surfaced via `content: [{type: "text", text: JSON.stringify(payload)}]`. `--changed-since` git lookups are memoised per `(root, ref)` pair across batch items so a `query_batch` of N items sharing the same ref does one git invocation, not N. Per-statement errors in `query_batch` are isolated — failed statements return `{error}` in their slot while siblings still execute.

docs/glossary.md

Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -368,6 +368,14 @@ Conceptually, the structure of the SQLite database — every table, column, cons
368368

369369
Integer constant in `src/db.ts`. Bumped whenever the DDL changes. `createSchema()` reads `meta.schema_version` and triggers a full rebuild on mismatch.
370370

371+
### show
372+
373+
`codemap show <name>` — one-step lookup that returns metadata (`file_path:line_start-line_end` + `signature` + `kind`) for the symbol(s) matching `<name>` (exact, case-sensitive). Output is the `{matches, disambiguation?}` envelope (single match → `{matches: [{...}]}`; multi-match adds `disambiguation: {n, by_kind, files, hint}` so agents narrow without scanning every row). Flags: `--kind <kind>` (filter by `symbols.kind`), `--in <path>` (file-scope filter — directory prefix or exact file). Distinct from **snippet** (returns source text, not just metadata) and from `query` with `WHERE name = ?` (one verb vs SQL composition; see [`architecture.md` § Show / snippet wiring](./architecture.md#cli-usage)).
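A rough sketch of the `{matches, disambiguation?}` envelope from the entry above, written against an in-memory array rather than the real `symbols` table. The field types inside `disambiguation`, the `hint` wording, the `lookup` and `buildEnvelope` helpers, and the way `--in` is interpreted (exact file or directory prefix) are assumptions for illustration; the source only names the fields. Zero-match handling is not modelled here.

```typescript
// Field types are assumptions; the glossary entry only names the fields.
interface SymbolMatch {
  name: string;
  kind: string;
  file_path: string;
  line_start: number;
  line_end: number;
  signature: string;
}
interface Disambiguation {
  n: number; // total number of matches
  by_kind: Record<string, number>; // match count per symbols.kind
  files: string[]; // distinct files containing a match
  hint: string;
}
interface ShowEnvelope {
  matches: SymbolMatch[];
  disambiguation?: Disambiguation;
}
interface ShowFilters {
  kind?: string; // stands in for --kind <kind>
  inPath?: string; // stands in for --in <path>
}

// Exact, case-sensitive name match, then the optional filters.
function lookup(symbols: SymbolMatch[], name: string, filters: ShowFilters = {}): SymbolMatch[] {
  const { kind, inPath } = filters;
  return symbols.filter((s) => {
    if (s.name !== name) return false;
    if (kind !== undefined && s.kind !== kind) return false;
    if (inPath !== undefined) {
      const dir = inPath.endsWith("/") ? inPath : `${inPath}/`;
      if (s.file_path !== inPath && !s.file_path.startsWith(dir)) return false;
    }
    return true;
  });
}

// Always wrap: one match yields {matches}; several add a disambiguation summary.
function buildEnvelope(matches: SymbolMatch[]): ShowEnvelope {
  if (matches.length <= 1) return { matches };
  const by_kind: Record<string, number> = {};
  for (const m of matches) by_kind[m.kind] = (by_kind[m.kind] ?? 0) + 1;
  const files = Array.from(new Set(matches.map((m) => m.file_path)));
  return {
    matches,
    disambiguation: {
      n: matches.length,
      by_kind,
      files,
      hint: "Narrow with --kind <kind> or --in <path>.",
    },
  };
}

// Invented sample rows.
const sampleSymbols: SymbolMatch[] = [
  { name: "foo", kind: "function", file_path: "src/a.ts", line_start: 3, line_end: 9, signature: "function foo(): void" },
  { name: "foo", kind: "method", file_path: "src/b.ts", line_start: 12, line_end: 20, signature: "foo(): number" },
  { name: "bar", kind: "function", file_path: "src/a.ts", line_start: 11, line_end: 14, signature: "function bar(): void" },
];

const ambiguous = buildEnvelope(lookup(sampleSymbols, "foo"));
const narrowed = buildEnvelope(lookup(sampleSymbols, "foo", { kind: "method" }));
const scoped = buildEnvelope(lookup(sampleSymbols, "foo", { inPath: "src/a.ts" }));
```

Because the envelope is always wrapped, a caller can read `matches` in the same way whether the lookup returned one row or several.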
374+
375+
### snippet
376+
377+
`codemap snippet <name>` — same lookup as **show**, but each match also carries `source` (file lines from disk at `line_start..line_end`), `stale` (true when `content_hash` drifted since last index — line range may have shifted), and `missing` (true when the file is gone). Per-execution shape mirrors `show`'s envelope; `source`/`stale`/`missing` are additive fields. Stale-file behavior: `source` is ALWAYS returned when the file exists; `stale: true` is metadata the agent reads (no refusal, no auto-reindex side-effects from a read tool — agent decides whether to act on possibly-shifted lines or run `codemap` first). See [`architecture.md` § Show / snippet wiring](./architecture.md#cli-usage).
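The stale-file rule above can be illustrated with a small sketch. Disk access and hashing are injected as plain functions so the example stays self-contained: `readSnippet`, the `readFile` callback, and `toyHash` are hypothetical stand-ins, not codemap's reader or hash. The 1-based inclusive line range and reporting `stale: false` for a missing file are also assumptions.

```typescript
interface IndexedSymbol {
  file_path: string;
  line_start: number; // assumed 1-based, inclusive
  line_end: number;
  content_hash: string; // hash of the file content recorded at index time
}
interface SnippetFields {
  source?: string;
  stale: boolean;
  missing: boolean;
}

// Toy string hash (djb2 variant), only so the example has something to compare.
const toyHash = (s: string): string => {
  let h = 5381;
  for (let i = 0; i < s.length; i++) h = ((h * 33) ^ s.charCodeAt(i)) >>> 0;
  return h.toString(16);
};

// readFile returns undefined when the file no longer exists.
function readSnippet(
  sym: IndexedSymbol,
  readFile: (path: string) => string | undefined,
  hash: (content: string) => string,
): SnippetFields {
  const content = readFile(sym.file_path);
  if (content === undefined) return { stale: false, missing: true };
  const lines = content.split("\n");
  // Source is returned whenever the file exists, even if the hash drifted.
  const source = lines.slice(sym.line_start - 1, sym.line_end).join("\n");
  return { source, stale: hash(content) !== sym.content_hash, missing: false };
}

// Usage against an in-memory "disk".
const indexedContent = "// header\nfunction foo() {\n  return 1;\n}\n// footer";
const sym: IndexedSymbol = {
  file_path: "src/a.ts",
  line_start: 2,
  line_end: 4,
  content_hash: toyHash(indexedContent),
};
const disk = new Map<string, string>([["src/a.ts", indexedContent]]);
const reader = (path: string) => disk.get(path);

const fresh = readSnippet(sym, reader, toyHash);

// Edit the file after indexing: lines shift, the recorded hash no longer matches.
disk.set("src/a.ts", `// new first line\n${indexedContent}`);
const drifted = readSnippet(sym, reader, toyHash);

// Delete the file.
disk.delete("src/a.ts");
const gone = readSnippet(sym, reader, toyHash);
```

In the drifted case the sketch still returns the lines at the recorded range together with `stale: true`, leaving the decision to re-index with the caller, as the entry describes.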
378+
371379
### skill
372380

373381
A `.agents/skills/<name>/SKILL.md` file with YAML frontmatter. Longer than a rule; describes a complete agent workflow. Distinct from a **rule** (shorter, normative).

docs/roadmap.md

Lines changed: 0 additions & 1 deletion
@@ -39,7 +39,6 @@ Codemap stays a structural-index primitive that other tools can consume. Out of
3939
- [ ] **`codemap audit --base <ref>`** (v1.x) — worktree+reindex snapshot strategy. v1 shipped `--baseline <prefix>` / `--<delta>-baseline <name>` (B.6 reuse) — see [`architecture.md` § Audit wiring](./architecture.md#cli-usage). v1.x adds `--base <ref>` for "audit against an arbitrary ref I haven't pre-baselined" (defers worktree spawn + cache decision until a real consumer asks).
4040
- [ ] **`codemap audit` verdict + thresholds** (v1.x) — `verdict: "pass" | "warn" | "fail"` driven by `codemap.config.audit.deltas[<key>].{added_max, action}`. Triggers: two consumers ship `jq`-based threshold scripts with similar shapes, OR one consumer asks with a concrete config sketch. Until then, raw deltas + consumer-side `jq` is the CI exit-code idiom.
4141
- [ ] **`codemap serve` (HTTP API, v1.x)** — same tool taxonomy + output shape as `codemap mcp` (shipped in v1), exposed over `POST /tool/{name}` with loopback default and optional `--token`. Defer until a concrete non-MCP consumer asks; design points are reserved in [`architecture.md` § MCP wiring](./architecture.md#cli-usage) so HTTP inherits them when its turn comes.
42-
- [ ] **Targeted-read CLI** — `codemap show <symbol>` / `codemap snippet <name>` returns `file_path:line_start-line_end` + `signature` for one symbol. Same data as `SELECT … FROM symbols WHERE name = ?`, but a one-step CLI keeps agents from composing SQL for trivial precise reads
4342
- [ ] **Watch mode** for dev — `node:fs.watch` recursive + `--files` re-index loop; Linux `recursive` requires Node 19.1+
4443
- [ ] **Monorepo / workspace awareness** — discover workspaces from `pnpm-workspace.yaml` / `package.json` and index per-workspace dependency graphs
4544
- [ ] **Cross-agent handoff artifact** — _speculative_; layered prefix/delta JSON written on session-stop, read on session-start. Complementary to indexing rather than core to it; revisit if user demand emerges
