Skip to content

Commit b509e4c

Browse files
feat: codebase map bootstrap (map_id + routing hints) (#183)
* feat: add map_id and codebase_map to context bootstrap Agents get a hash-stable routing card on context (CLI/MCP/HTTP) plus MCP initialize instructions appendix; opt out via --no-codebase-map or compact. * harden: parity and watch-ordering for codebase map bootstrap Align skill shard, MCP tool description, and HTTP tests with map_id fields; prime watch before baking initialize instructions appendix. * fix: honor opts.instructions in runMcpServer Only assemble codebase-map appendix when caller did not supply instructions. * harden: test resolveMcpInitializeInstructions override path Extract instructions resolver from runMcpServer so CodeRabbit fix has regression coverage without stdio integration harness.
1 parent e4f6087 commit b509e4c

21 files changed

Lines changed: 509 additions & 164 deletions
Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,5 @@
1+
---
2+
"@stainless-code/codemap": patch
3+
---
4+
5+
Add hash-stable `map_id` and `codebase_map` routing hints to `context` responses (CLI, MCP, HTTP). MCP initialize instructions now include `map_id` and top hub paths. Opt out with `--no-codebase-map` or `include_codebase_map: false`; omitted when `compact`.

docs/agents.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -131,7 +131,7 @@ See [architecture.md § Session lifecycle wiring](./architecture.md#session-life
131131

132132
## MCP tool allowlist
133133

134-
**`context.index_freshness`** — session bootstrap includes index-level freshness metadata: `commit_drift` (HEAD ≠ `last_indexed_commit`), `pending_sync` (watcher debounce queue or in-flight reindex), optional disk-drift counts when watch is off, and a single `warning` string when agents should pause or re-index. **`context.start_here`** (non-compact) adds inline index summary, intent-ranked `query_recipe` cards, and top hub files with export signatures (adaptive caps by file count; optional MCP/HTTP `include_snippets` for one-line previews). Debug intent biases `sample_markers` toward FIXME/TODO. **MCP:** array-shaped JSON tools (`query`, …) keep row payloads verbatim and append a second `content` block prefixed `@codemap/index_freshness`; object-shaped tools merge `index_freshness` inline. **HTTP:** `POST /tool/*` adds `X-Codemap-Pending-Sync`, `X-Codemap-Commit-Drift`, and `X-Codemap-Warning` headers without changing JSON bodies; **`GET /health`** includes full cheap `index_freshness` when the DB is readable. Complements per-file `validate` / snippet `stale`. See [architecture.md § Context wiring](./architecture.md#context-wiring).
134+
**`context.index_freshness`** — session bootstrap includes index-level freshness metadata: `commit_drift` (HEAD ≠ `last_indexed_commit`), `pending_sync` (watcher debounce queue or in-flight reindex), optional disk-drift counts when watch is off, and a single `warning` string when agents should pause or re-index. **`context.start_here`** (non-compact) adds inline index summary, intent-ranked `query_recipe` cards, and top hub files with export signatures (adaptive caps by file count; optional MCP/HTTP `include_snippets` for one-line previews). **`context.map_id`** + **`context.codebase_map`** (non-compact, default on) add a hash-stable routing card: top hub paths plus codemap CLI outcome aliases and session-start MCP tools — omit with `--no-codebase-map` / `include_codebase_map: false` or `--compact`. MCP initialize **`instructions`** also carry `map_id` and top three hubs; call **`context`** for the full map. Debug intent biases `sample_markers` toward FIXME/TODO. **MCP:** array-shaped JSON tools (`query`, …) keep row payloads verbatim and append a second `content` block prefixed `@codemap/index_freshness`; object-shaped tools merge `index_freshness` inline. **HTTP:** `POST /tool/*` adds `X-Codemap-Pending-Sync`, `X-Codemap-Commit-Drift`, and `X-Codemap-Warning` headers without changing JSON bodies; **`GET /health`** includes full cheap `index_freshness` when the DB is readable. Complements per-file `validate` / snippet `stale`. See [architecture.md § Context wiring](./architecture.md#context-wiring).
135135

136136
**MCP ToolAnnotations**`tools/list` (and HTTP `GET /tools`) expose advisory `readOnlyHint` / `destructiveHint` / `idempotentHint` per tool so clients can gate auto-approval. Read paths (`query`, `show`, `audit`, …) → `readOnlyHint: true`; disk-write apply tools → `destructiveHint: true` (writes still require `yes: true`); index mutators (`save_baseline`, `drop_baseline`, `ingest_coverage`, `ingest_churn`) → `readOnlyHint: false` without `destructiveHint`.
137137

docs/architecture.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -134,7 +134,7 @@ A local SQLite database (`.codemap/index.db`) indexes the project tree and store
134134

135135
#### Context wiring
136136

137-
**`src/cli/cmd-context.ts`** (argv + render) → **`handleContext`** in **`tool-handlers.ts`** → **`src/application/context-engine.ts`** (engine — **`buildContextEnvelope`**, **`classifyIntent`**, **`composeStartHere`**, **`resolveContextBudget`**, `ContextEnvelope` type). `buildContextEnvelope` composes the JSON envelope from existing recipes (legacy **`hubs`** at the bundled `fan-in` recipe default limit; budget-capped **`start_here.hub_leaders`** via **`resolveContextBudget(file_count)`**), intent-scoped `sample_markers`, `QUERY_RECIPES` catalog, **`start_here`** (inline `index-summary`, intent-ranked `query_recipe` cards, hub leaders with exported-symbol signatures — optional one-line **`include_snippets`** via CLI `--include-snippets` or MCP/HTTP `include_snippets`, path-contained disk reads with `stale`/`missing` flags), and **`index_freshness`** via **`src/application/index-freshness.ts`**. Debug `--for` / MCP `intent` biases markers toward FIXME/TODO kinds; whitespace-only intent is treated as no intent on all transports. **`classifyIntent`** maps free text to `refactor | debug | test | feature | explore | other`; **`start_here.classified_as`** is `"default"` when no intent is supplied. Hub-leader **`include_snippets`** one-liners share the adaptive **`signature_max_chars`** cap. **`--compact`** drops `hubs`, `sample_markers`, and `start_here` and emits minified JSON (non-compact pretty-prints with 2-space indent). Whitespace-only `--for` values are rejected at CLI parse time. **`include_snippets`** is a no-op when **`compact: true`**. Product-shape constraint: [No split-brain incremental index](./roadmap.md#floors-v1-product-shape).
137+
**`src/cli/cmd-context.ts`** (argv + render) → **`handleContext`** in **`tool-handlers.ts`** → **`src/application/context-engine.ts`** (engine — **`buildContextEnvelope`**, **`classifyIntent`**, **`composeStartHere`**, **`resolveContextBudget`**, **`buildCodebaseMap`**, **`computeMapId`**, **`buildCliEntryHints`**, `ContextEnvelope` type). `buildContextEnvelope` composes the JSON envelope from existing recipes (legacy **`hubs`** at the bundled `fan-in` recipe default limit; budget-capped **`start_here.hub_leaders`** via **`resolveContextBudget(file_count)`**), intent-scoped `sample_markers`, `QUERY_RECIPES` catalog, **`start_here`** (inline `index-summary`, intent-ranked `query_recipe` cards, hub leaders with exported-symbol signatures — optional one-line **`include_snippets`** via CLI `--include-snippets` or MCP/HTTP `include_snippets`, path-contained disk reads with `stale`/`missing` flags), optional **`map_id`** + **`codebase_map`** (hub paths from `start_here.hub_leaders` + static codemap CLI/MCP routing hints from **`src/outcome-aliases.ts`** and session-start MCP tools — not app runtime entry files), and **`index_freshness`** via **`src/application/index-freshness.ts`**. **`map_id`** is the first 16 hex chars of `hashContent(JSON.stringify(canonical))` over sorted `hub_paths`, `index_summary`, `schema_version`, `file_count`, and `last_indexed_commit` — agents compare ids across sessions without re-fetching full **`start_here`**. Debug `--for` / MCP `intent` biases markers toward FIXME/TODO kinds; whitespace-only intent is treated as no intent on all transports. **`classifyIntent`** maps free text to `refactor | debug | test | feature | explore | other`; **`start_here.classified_as`** is `"default"` when no intent is supplied. Hub-leader **`include_snippets`** one-liners share the adaptive **`signature_max_chars`** cap. **`--compact`** drops `hubs`, `sample_markers`, `start_here`, `map_id`, and `codebase_map` and emits minified JSON (non-compact pretty-prints with 2-space indent). **`--no-codebase-map`** / MCP/HTTP `include_codebase_map: false` omits `map_id` and `codebase_map` while keeping `start_here`. Whitespace-only `--for` values are rejected at CLI parse time. **`include_snippets`** is a no-op when **`compact: true`**. MCP **`runMcpServer`** appends a short auto-generated codebase-map block (`map_id` + top three hub paths) to initialize **`instructions`** after bootstrap. Product-shape constraint: [No split-brain incremental index](./roadmap.md#floors-v1-product-shape).
138138

139139
**Impact wiring:** **`src/cli/cmd-impact.ts`** (argv — `<target>` + `--in <path>` + `--direction up|down|both` + `--depth N` + `--via dependencies|calls|imports|all` + `--limit N` + `--summary` + `--json`; bootstrap absorbs `--root`/`--config`) + **`src/application/impact-engine.ts`** (engine — `findImpact({db, target, direction?, via?, depth?, limit?, inPath?})`). Pure transport-agnostic walker over the calls + dependencies + imports graphs; **`--via calls`** walks only parse-resolved `calls` rows (`CALLS_AST_ONLY_SQL` — excludes callback-synthesis heuristics unless consumers query `calls-including-heuristic`); CLI / MCP / HTTP all dispatch the same engine function via `tool-handlers.ts`'s `handleImpact` (MCP/HTTP `in` arg). Target auto-resolves: contains `/` or matches `files.path` → file target; otherwise symbol (case-sensitive). **Homonym symbols** (`matched_in.length > 1`): unscoped walks union per-defining-file call graphs (first hop scoped to each definition's call sites); `--in` / MCP `in` filters `matched_in` via show-engine prefix/exact rules — no match → empty `matches` + `skipped_scope`. Walks compatible backends per resolved kind: **symbol** → `calls` (callers / callees by `caller_name` / `callee_name`); **file** → `dependencies` (`from_path` / `to_path`) + `imports` (`file_path` / `resolved_path`, `IS NOT NULL` filter). `--via <b>` overrides; mismatched explicit choices land in `skipped_backends` (no error — agents see why their backend selection yielded fewer rows than expected). One `WITH RECURSIVE` per (direction, backend) combo with cycle detection via path-string `instr` check (SQLite has no native cycle predicate); JS-side merge + dedup by `(direction, kind, name?, file_path)` keeping the shallowest depth. `--depth 0` uses an unbounded sentinel (`UNBOUNDED_DEPTH_SENTINEL = 1_000_000`); cycle detection + `LIMIT` keep cyclic graphs cheap regardless. Termination reason classification: `limit` (truncated) > `depth` (any node sat at the cap) > `exhausted`. Result envelope: `{target, direction, via, depth_limit, matches: [{depth, direction, edge, kind, name?, file_path}], summary: {nodes, max_depth_reached, by_kind, terminated_by}, skipped_backends?, skipped_scope?}`. `--summary` blanks `matches` (transport bandwidth saver) but preserves `summary.nodes` so CI gates (`jq '.summary.nodes'`) still see the count. SARIF / annotations not supported (graph traversal, not findings — the parser accepts the flag combos but the engine only emits JSON).
140140

0 commit comments

Comments
 (0)