Merged
14 commits
15 changes: 15 additions & 0 deletions .agents/rules/codemap.md
@@ -29,6 +29,21 @@ A local database (default **`.codemap.db`**) indexes structure: symbols, imports

**Recipe `actions`:** with **`--json`**, recipes that define an `actions` template append it to every row (kebab-case verb + description — e.g. `fan-out` → `review-coupling`). Under `--baseline`, actions attach to the **`added`** rows only. Inspect via **`--recipes-json`**. Ad-hoc SQL never carries actions.

**Project-local recipes:** drop `<id>.sql` (and optional `<id>.md` for description + actions) into **`<projectRoot>/.codemap/recipes/`** — auto-discovered, runs via `--recipe <id>` like bundled. Project recipes win on id collision; check `--recipes-json` for **`shadows: true`** entries to know when a project recipe overrides the documented bundled version. `<id>.md` supports YAML frontmatter for the per-row action template — block-list shape only (the loader's hand-rolled parser doesn't accept inline-flow `[{...}]`):

```markdown
---
actions:
  - type: review-coupling
    auto_fixable: false
    description: "High fan-out usually means orchestrator role."
---

(Markdown body — first non-empty line becomes the catalog description.)
```

Validation: SQL is rejected at load time if it starts with DML/DDL (DELETE/DROP/UPDATE/etc.); the runtime `PRAGMA query_only=1` is the parser-proof backstop.

**Baselines** (`query_baselines` table inside `.codemap.db`, no parallel JSON files): `--save-baseline[=<name>]` snapshots a result set; `--baseline[=<name>]` diffs the current result against it (added / removed rows; identity = `JSON.stringify(row)`). Name defaults to the `--recipe` id; ad-hoc SQL needs an explicit `=<name>`. Survives `--full` and SCHEMA bumps.

**Audit (`bun src/index.ts audit`)**: structural-drift command; emits `{head, deltas: {files, dependencies, deprecated}}` (each delta carries its own `base` metadata). Reuses B.6 baselines as the snapshot source. Two CLI shapes — `--baseline <prefix>` auto-resolves `<prefix>-files` / `<prefix>-dependencies` / `<prefix>-deprecated`; `--<delta>-baseline <name>` is the explicit per-delta override. v1 ships no `verdict` / threshold config — consumers compose `--json` + `jq` for CI exit codes. Auto-runs an incremental index before the diff (use `--no-index` to skip for frozen-DB CI).
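
Since v1 ships no `verdict`, a consumer has to turn the audit envelope into an exit code itself. The following is a minimal sketch of that idea in TypeScript rather than `jq`. Only `head`, `deltas`, and the three delta keys come from the text above; the `added` array on each delta and the `addedMax` thresholds are assumptions made for illustration.

```typescript
// Assumed shape: each delta exposes an `added` array. The text above only
// confirms `head`, `deltas`, the delta keys, and per-delta `base` metadata.
interface Delta {
  base?: unknown;
  added?: unknown[];
  removed?: unknown[];
}

interface AuditEnvelope {
  head?: unknown;
  deltas: Record<string, Delta | undefined>;
}

// Returns 1 when any delta has more added rows than its threshold, else 0.
// `addedMax` is a hypothetical consumer-side config, not a codemap option.
function auditExitCode(
  envelope: AuditEnvelope,
  addedMax: Record<string, number>,
): number {
  for (const [key, max] of Object.entries(addedMax)) {
    const added = envelope.deltas[key]?.added?.length ?? 0;
    if (added > max) return 1;
  }
  return 0;
}
```

A CI script could parse the `--json` output, call `auditExitCode`, and pass the result to `process.exit`.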
5 changes: 3 additions & 2 deletions .agents/skills/codemap/SKILL.md
@@ -45,6 +45,7 @@ Replace placeholders (`'...'`) with your module path, file glob, or symbol name.
- **`--baseline[=<name>]`** — diff the current result against the saved baseline. Output `{baseline:{...}, current_row_count, added: [...], removed: [...]}` (with `--json`) or a two-section terminal dump. Identity = per-row multiset equality (canonical `JSON.stringify` keyed frequency map; duplicates preserved). Pair with `--summary` for `{baseline:{...}, current_row_count, added: N, removed: N}`. **Mutually exclusive with `--group-by`.**
- **`--baselines`** lists saved baselines (no `rows_json` payload); **`--drop-baseline <name>`** deletes one. Both reject every other flag — they're list-only / drop-only operations.
- **Per-row recipe `actions`** — recipes that define an **`actions: [{type, auto_fixable?, description?}]`** template append it to every row in **`--json`** output (recipe-only; ad-hoc SQL never carries actions). Under `--baseline`, actions attach to the **`added`** rows only (the rows the agent should act on). Inspect via **`--recipes-json`**.
- **Project-local recipes** — drop **`<id>.sql`** (and optional **`<id>.md`** for description body + actions) into **`<projectRoot>/.codemap/recipes/`** to make team-internal SQL a first-class CLI verb. `--recipes-json` and the `codemap://recipes` MCP resource list project recipes alongside bundled ones with **`source: "bundled" | "project"`** discriminating them. Project recipes win on id collision; entries that override a bundled id carry **`shadows: true`** so agents reading the catalog at session start know when a recipe behaves differently from the documented bundled version. `<id>.md` supports YAML frontmatter for the per-row action template — **block-list shape only** (loader's hand-rolled parser; no inline-flow `[{...}]`): `---\nactions:\n - type: my-verb\n auto_fixable: false\n description: "..."\n---`. Validation: SQL is rejected at load time if it starts with DML/DDL (DELETE/DROP/UPDATE/etc.); the runtime `PRAGMA query_only=1` is the parser-proof backstop. `.codemap.db` is gitignored; **`.codemap/recipes/` is NOT** — recipes are git-tracked source code authored for human review.
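
The `--baseline` identity rule in the list above (per-row multiset equality over a `JSON.stringify` keyed frequency map) can be sketched as follows. This is an illustration, not the shipped implementation: it assumes "canonical" means top-level keys are sorted before stringifying, and the names `rowKey` and `diffRows` are hypothetical.

```typescript
type Row = Record<string, unknown>;

// Canonical key: stringify with top-level keys sorted so column order does
// not affect identity. (Assumption: this is what "canonical" means here.)
function rowKey(row: Row): string {
  const sorted: Row = {};
  for (const k of Object.keys(row).sort()) sorted[k] = row[k];
  return JSON.stringify(sorted);
}

function diffRows(
  baseline: Row[],
  current: Row[],
): { added: Row[]; removed: Row[] } {
  // Frequency map of baseline rows, so duplicates are counted, not collapsed.
  const counts = new Map<string, number>();
  for (const row of baseline) {
    const k = rowKey(row);
    counts.set(k, (counts.get(k) ?? 0) + 1);
  }

  const added: Row[] = [];
  for (const row of current) {
    const k = rowKey(row);
    const n = counts.get(k) ?? 0;
    if (n > 0) counts.set(k, n - 1); // consumes one baseline occurrence
    else added.push(row); // no occurrence left to match: row is new
  }

  // Counts still above zero are baseline rows with no match in `current`.
  const removed: Row[] = [];
  for (const row of baseline) {
    const k = rowKey(row);
    const n = counts.get(k) ?? 0;
    if (n > 0) {
      removed.push(row);
      counts.set(k, n - 1);
    }
  }
  return { added, removed };
}
```

With a baseline holding the same row twice and a current result holding it once, one copy lands in `removed`, which is the behaviour a set-based diff would lose.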

**Audit (`bun src/index.ts audit`)** — separate top-level command for structural-drift verdicts. Composes B.6 baselines into a per-delta `{head, deltas}` envelope; v1 ships `files` / `dependencies` / `deprecated`. Two snapshot-source shapes:

@@ -69,8 +70,8 @@ Each emitted delta carries its own `base` metadata so mixed-baseline audits are

**Resources (lazy-cached on first `read_resource`; constant for server-process lifetime):**

- **`codemap://recipes`** — full catalog JSON (same as `--recipes-json`).
- **`codemap://recipes/{id}`** — single recipe `{id, description, sql, actions?}`. Replaces `--print-sql <id>`.
- **`codemap://recipes`** — full catalog JSON (same as `--recipes-json`). Each entry carries `source: "bundled" | "project"` and `shadows: true` on project entries that override a bundled recipe id. Read this at session start so you know when a `--recipe foo` call will run a project override instead of the documented bundled version.
- **`codemap://recipes/{id}`** — single recipe `{id, description, body?, sql, actions?, source, shadows?}`. Replaces `--print-sql <id>`.
- **`codemap://schema`** — DDL of every table in `.codemap.db` (queried live from `sqlite_schema`).
- **`codemap://skill`** — full text of bundled `templates/agents/skills/codemap/SKILL.md`. Agents that don't preload the skill at session start can fetch it here.

49 changes: 49 additions & 0 deletions .changeset/recipes-content-registry.md
@@ -0,0 +1,49 @@
---
"@stainless-code/codemap": minor
---

feat(recipes): recipes-as-content registry — bundled .md siblings + project-local recipes

Two complementary capabilities:

1. **Bundled recipes get richer descriptions.** Every bundled recipe in
`templates/recipes/` is now a `<id>.sql` file paired with an optional
`<id>.md` description body (replaces the inline TypeScript map in
`src/cli/query-recipes.ts`). Per-row `actions` templates live in YAML
frontmatter on the `.md` instead of code. Same surface for end users
(`--recipe <id>` / `--recipes-json` / `codemap://recipes`); single
storage shape across bundled + project recipes.

2. **Project-local recipes** — drop `<id>.{sql,md}` files into
`<projectRoot>/.codemap/recipes/` to ship team-internal SQL as
first-class recipes. Auto-discovered via `--recipe <id>`, surfaced in
`--recipes-json` and the `codemap://recipes` MCP resource alongside
bundled. Project recipes win on id collision; the catalog entry
carries `shadows: true` on overrides so agents reading the catalog
at session start see when a recipe behaves differently from the
documented bundled version (per-execution response shape stays
unchanged — uniformity contract preserved).

Catalog entries (`--recipes-json` output, `codemap://recipes`
payload) gain three additive fields: `body` (full Markdown body),
`source` (`"bundled" | "project"`), and `shadows?` (true on
project entries that override a bundled id). Existing consumers
that destructure `{id, description, sql, actions?}` keep working.
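
To make the catalog shape and the collision rule concrete, here is a minimal sketch of merging the two sources. The field names follow the text above; the function name `mergeCatalog`, the `LoadedRecipe` helper type, and the sort order are assumptions for illustration and may differ from the real loader.

```typescript
type RecipeSource = "bundled" | "project";

interface RecipeAction {
  type: string;
  auto_fixable?: boolean;
  description?: string;
}

interface CatalogEntry {
  id: string;
  description: string;
  body?: string;
  sql: string;
  actions?: RecipeAction[];
  source: RecipeSource;
  shadows?: true; // present only on project entries that override a bundled id
}

// Hypothetical pre-merge shape: a recipe before its source is known.
type LoadedRecipe = Omit<CatalogEntry, "source" | "shadows">;

function mergeCatalog(
  bundled: LoadedRecipe[],
  project: LoadedRecipe[],
): CatalogEntry[] {
  const byId = new Map<string, CatalogEntry>();
  for (const r of bundled) byId.set(r.id, { ...r, source: "bundled" });
  for (const r of project) {
    const entry: CatalogEntry = { ...r, source: "project" };
    if (byId.has(r.id)) entry.shadows = true; // project wins on id collision
    byId.set(r.id, entry);
  }
  // Sorting by id is an assumption, made here for stable output.
  return Array.from(byId.values()).sort((a, b) => a.id.localeCompare(b.id));
}
```

A consumer that still destructures `{id, description, sql, actions?}` from each entry is unaffected by the extra fields.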

Validation: load-time lexical scan rejects DML / DDL keywords
(`INSERT` / `UPDATE` / `DELETE` / `DROP` / `CREATE` / `ALTER` /
`ATTACH` / `DETACH` / `REPLACE` / `TRUNCATE` / `VACUUM` / `PRAGMA`)
in recipe SQL with recipe-aware error messages — defence in depth
alongside the runtime `PRAGMA query_only=1` backstop in
`query-engine.ts` shipped in the previous release.
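
A load-time check of this kind might look like the sketch below. It assumes the scan only inspects the leading keyword once comments are stripped; the real loader may scan more of the statement. The names `validateRecipeSql` and `stripComments` and the error wording are hypothetical. The keyword list is the one given above.

```typescript
const FORBIDDEN_KEYWORDS = new Set([
  "INSERT", "UPDATE", "DELETE", "DROP", "CREATE", "ALTER",
  "ATTACH", "DETACH", "REPLACE", "TRUNCATE", "VACUUM", "PRAGMA",
]);

// Naive comment stripping: it does not understand string literals, so a
// "--" inside a quoted string would be cut too. Good enough for a sketch.
function stripComments(sql: string): string {
  return sql
    .replace(/\/\*[\s\S]*?\*\//g, " ") // block comments
    .replace(/--[^\n]*/g, " "); // line comments
}

// Throws with a recipe-aware message when the SQL is empty or when its
// first keyword is a DML / DDL verb.
function validateRecipeSql(id: string, sql: string): void {
  const body = stripComments(sql).trim();
  if (body === "") {
    throw new Error(`recipe "${id}": SQL is empty`);
  }
  const match = /^[A-Za-z]+/.exec(body);
  const first = match ? match[0].toUpperCase() : "";
  if (FORBIDDEN_KEYWORDS.has(first)) {
    throw new Error(`recipe "${id}": ${first} is not allowed in recipe SQL`);
  }
}
```

A lexical check like this is easy to get around with unusual syntax, which is why the runtime `PRAGMA query_only=1` setting remains the actual guarantee.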

Implementation: pure transport-agnostic loader in
`src/application/recipes-loader.ts`; thin shim in
`src/cli/query-recipes.ts` preserves backwards-compat exports
(`QUERY_RECIPES`, `getQueryRecipeSql`, etc.). Hand-rolled YAML
frontmatter parser scoped to the `actions` shape (no `js-yaml`
dependency).
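
For orientation, a parser scoped to the block-list `actions` shape can be quite small. The sketch below is an assumption about how such a parser could work, not the code in `recipes-loader.ts`; `parseRecipeMd`, `toAction`, and `unquote` are hypothetical names, and any top-level key other than `actions` is ignored.

```typescript
interface RecipeAction {
  type: string;
  auto_fixable?: boolean;
  description?: string;
}

// Strip one pair of matching surrounding quotes, if present.
function unquote(value: string): string {
  const v = value.trim();
  const quoted =
    v.length >= 2 && (v[0] === '"' || v[0] === "'") && v[v.length - 1] === v[0];
  return quoted ? v.slice(1, -1) : v;
}

function toAction(fields: Map<string, string>): RecipeAction {
  const type = fields.get("type");
  if (!type) throw new Error("action is missing `type`");
  const action: RecipeAction = { type };
  const fixable = fields.get("auto_fixable");
  if (fixable !== undefined) action.auto_fixable = fixable === "true";
  const description = fields.get("description");
  if (description !== undefined) action.description = description;
  return action;
}

function parseRecipeMd(md: string): { actions: RecipeAction[]; body: string } {
  const lines = md.split(/\r?\n/);
  if (lines[0].trim() !== "---") return { actions: [], body: md.trim() };
  const end = lines.findIndex((l, i) => i > 0 && l.trim() === "---");
  if (end === -1) throw new Error("unterminated frontmatter");

  const items: Map<string, string>[] = [];
  let inActions = false;
  for (const raw of lines.slice(1, end)) {
    if (raw.trim() === "") continue;
    if (/^[^\s-]/.test(raw)) {
      // Top-level key: only a bare `actions:` opens the block list.
      if (/^actions:\s*\S/.test(raw)) {
        throw new Error("inline-flow `actions` is not supported");
      }
      inActions = /^actions:\s*$/.test(raw);
      continue;
    }
    if (!inActions) continue;
    const m = /^\s*(-\s+)?([A-Za-z_]+):\s*(.*)$/.exec(raw);
    if (!m) throw new Error(`unparseable frontmatter line: ${raw}`);
    if (m[1]) items.push(new Map()); // "- key: value" starts a new item
    const current = items[items.length - 1];
    if (!current) throw new Error("field found before the first list item");
    current.set(m[2], unquote(m[3]));
  }
  const body = lines.slice(end + 1).join("\n").trim();
  return { actions: items.map(toAction), body };
}
```

Rejecting inline-flow input outright, instead of half-parsing it, keeps the failure visible to the recipe author at load time.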

`.codemap.db` is gitignored as before; `.codemap/recipes/` is NOT
(verified via `git check-ignore`) — recipes are git-tracked source
code authored for human review.
8 changes: 8 additions & 0 deletions README.md
@@ -110,6 +110,14 @@ codemap query --recipes-json
codemap query --print-sql fan-out
# `components-by-hooks` ranks by hook count without SQLite JSON1 (comma-based count on the stored JSON array).

# Project-local recipes — drop SQL files into .codemap/recipes/ to make them discoverable across the team
# Bundled recipes live in templates/recipes/ in the npm package; project recipes win on id collision
# (shadowing is signalled via a `shadows: true` field in --recipes-json so agents notice the override)
mkdir -p .codemap/recipes
echo "SELECT path FROM files WHERE language IN ('ts', 'tsx') AND line_count > 500" \
> .codemap/recipes/big-ts-files.sql
codemap query --recipe big-ts-files # auto-discovered alongside bundled

# MCP server (Model Context Protocol) — for agent hosts (Claude Code, Cursor, Codex, generic MCP clients)
codemap mcp # JSON-RPC on stdio; one tool per CLI verb plus query_batch
# Tools: query, query_batch (MCP-only — N statements in one round-trip), query_recipe, audit,
2 changes: 2 additions & 0 deletions docs/architecture.md
@@ -125,6 +125,8 @@ A local SQLite database (`.codemap.db`) indexes the project tree and stores stru

**Context wiring:** **`src/cli/cmd-context.ts`** — **`buildContextEnvelope`** composes the JSON envelope from existing recipes (`fan-in` for `hubs`, `markers` SELECT for `sample_markers`, `QUERY_RECIPES` map for the catalog). **`classifyIntent`** maps `--for "<text>"` to one of `refactor | debug | test | feature | explore | other` via regex against the trimmed input; whitespace-only intents are rejected. `--compact` drops `hubs` + `sample_markers` and emits one-line JSON; otherwise pretty-prints with 2-space indent.
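
The intent classification described above (regex against the trimmed input, whitespace-only rejected) could be sketched like this. The regex patterns, their order, and the error message are all hypothetical; only the six intent names and the trim-then-match behaviour come from the text.

```typescript
type Intent = "refactor" | "debug" | "test" | "feature" | "explore" | "other";

// Hypothetical patterns, first match wins. The real patterns in
// `cmd-context.ts` are not shown here and will differ.
const INTENT_PATTERNS: [Intent, RegExp][] = [
  ["refactor", /\b(refactor|rename|extract)\b/i],
  ["debug", /\b(debug|bug|fix|crash|error)\b/i],
  ["test", /\b(tests?|coverage|spec)\b/i],
  ["feature", /\b(add|implement|feature|build)\b/i],
  ["explore", /\b(explore|understand|where|how)\b/i],
];

function classifyIntentSketch(text: string): Intent {
  const trimmed = text.trim();
  if (trimmed === "") {
    throw new Error("--for requires a non-empty intent");
  }
  for (const [intent, pattern] of INTENT_PATTERNS) {
    if (pattern.test(trimmed)) return intent;
  }
  return "other"; // nothing matched
}
```

Because the first match wins, the order of the pattern table decides how a mixed phrase such as "fix the tests" is classified.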

**Recipes wiring:** **`src/application/recipes-loader.ts`** (pure transport-agnostic loader) + **`src/cli/query-recipes.ts`** (shim — caches the loader output, exposes `getQueryRecipeSql` / `getQueryRecipeActions` / `listQueryRecipeIds` / `listQueryRecipeCatalog` / `getQueryRecipeCatalogEntry`). Recipes live as file pairs: **`<id>.sql`** + optional **`<id>.md`**. The loader reads `templates/recipes/` (bundled, ships in npm package next to `templates/agents/`) and `<projectRoot>/.codemap/recipes/` (project-local — root-only resolution per the registry plan, no walk-up). Project recipes win on id collision; entries that override a bundled id carry **`shadows: true`** in the catalog so agents reading `codemap://recipes` at session start see when a recipe behaves differently from the documented bundled version. Per-row **`actions`** templates (kebab-case verb + description) live in YAML frontmatter on each `<id>.md` — uniform shape across bundled + project. Hand-rolled YAML parser scoped to `actions: [{type, auto_fixable?, description?}]` only (no `js-yaml` dep). Load-time validation rejects empty SQL and DML / DDL keywords (`INSERT` / `UPDATE` / `DELETE` / `DROP` / `CREATE` / `ALTER` / `ATTACH` / `DETACH` / `REPLACE` / `TRUNCATE` / `VACUUM` / `PRAGMA`) with recipe-aware error messages — defence in depth alongside the runtime `PRAGMA query_only=1` backstop in `query-engine.ts` (PR #35). `.codemap.db` is gitignored; `.codemap/recipes/` is NOT (verified via `git check-ignore`) — recipes are git-tracked source code authored for human review.
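
The file-pair convention (`<id>.sql` plus an optional `<id>.md`) can be illustrated with a small pure function over a directory listing. This is a sketch under assumptions: the listing is obtained elsewhere (for example via `fs.readdirSync`), a stray `.md` without a matching `.sql` is ignored, and the name `pairRecipeFiles` is hypothetical.

```typescript
interface RecipeFiles {
  id: string;
  sqlFile: string;
  mdFile?: string; // present only when a sibling <id>.md exists
}

// Groups a flat list of file names from one recipes directory into
// <id>.sql + optional <id>.md pairs. Other files are skipped.
function pairRecipeFiles(fileNames: string[]): RecipeFiles[] {
  const names = new Set(fileNames);
  const pairs: RecipeFiles[] = [];
  for (const name of fileNames) {
    if (!name.endsWith(".sql")) continue;
    const id = name.slice(0, -".sql".length);
    if (id === "") continue; // a file literally named ".sql" has no id
    const md = `${id}.md`;
    pairs.push(
      names.has(md)
        ? { id, sqlFile: name, mdFile: md }
        : { id, sqlFile: name },
    );
  }
  return pairs.sort((a, b) => a.id.localeCompare(b.id));
}
```

Running the same function once over the bundled directory and once over the project directory would match the "uniform shape across bundled + project" description.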

**MCP wiring:** **`src/cli/cmd-mcp.ts`** (argv — `--help` only; bootstrap absorbs `--root`/`--config`) + **`src/application/mcp-server.ts`** (engine — tool registry, resource handlers, response composition). Mirrors the `cmd-audit.ts ↔ audit-engine.ts` seam — CLI parses + lifecycle; engine owns the SDK. **`runMcpServer`** bootstraps codemap once at server boot (config + resolver + DB access become module-level state), instantiates `McpServer` from **`@modelcontextprotocol/sdk`**, attaches a **`StdioServerTransport`**, and resolves when stdin closes (clean shutdown). Tool handlers reuse the existing engine entry-points: **`query`** + **`query_recipe`** call **`executeQuery`** in **`src/application/query-engine.ts`** (a pure transport-agnostic engine extracted from `printQueryResult`'s JSON branch — same `[...rows]` / `{count}` / `{group_by, groups}` envelope `--json` would print); **`query_batch`** loops via **`executeQueryBatch`** with batch-wide-defaults + per-statement-overrides (items are `string | {sql, summary?, changed_since?, group_by?}`); **`audit`** runs `resolveAuditBaselines` + `runAudit` from PR #33 unchanged; **`context`** / **`validate`** call `buildContextEnvelope` / `computeValidateRows` (pure functions in `src/cli/cmd-*.ts` — same layer-reversal allowance as `query-recipes`). **`save_baseline`** is one polymorphic tool (`{name, sql? | recipe?}`) with a runtime exclusivity check — mirrors the CLI's single `--save-baseline=<name>` verb. **Tool naming**: snake_case throughout — Codemap convention matching the patterns in MCP spec examples and reference servers (GitHub MCP, Cursor built-ins); the spec itself doesn't mandate it. CLI stays kebab — translation lives at the MCP-arg layer. 
**Resources** (`codemap://recipes`, `codemap://recipes/{id}`, `codemap://schema`, `codemap://skill`) use **lazy memoisation** — first `read_resource` populates a per-server-instance cache; constant for the server-process lifetime so eager-vs-lazy produce identical observable behavior. `codemap://schema` queries `sqlite_schema` live; `codemap://skill` reads from `resolveAgentsTemplateDir() + skills/codemap/SKILL.md`. Output shape uniformity (plan § 4): every tool returns the JSON envelope its CLI counterpart's `--json` flag prints, surfaced via `content: [{type: "text", text: JSON.stringify(payload)}]`. `--changed-since` git lookups are memoised per `(root, ref)` pair across batch items so a `query_batch` of N items sharing the same ref does one git invocation, not N. Per-statement errors in `query_batch` are isolated — failed statements return `{error}` in their slot while siblings still execute.
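
The lazy memoisation used for resources can be illustrated with a generic helper. This is a sketch, not the code in `mcp-server.ts`: `lazy` is a hypothetical name, and the payload below is a placeholder standing in for the real catalog.

```typescript
// Wraps a producer so it runs at most once; later calls return the cached
// value. Wrapping the value in an object lets `undefined` be cached too.
function lazy<T>(compute: () => T): () => T {
  let cached: { value: T } | undefined;
  return () => {
    if (cached === undefined) cached = { value: compute() };
    return cached.value;
  };
}

// Usage: the first read builds the payload, the second read reuses it.
let buildCount = 0;
const readRecipesResource = lazy(() => {
  buildCount += 1;
  return JSON.stringify([{ id: "fan-out", source: "bundled" }]);
});

const firstRead = readRecipesResource();
const secondRead = readRecipesResource();
```

Creating one such closure per server instance would give the per-instance cache lifetime described above, since nothing is shared at module level.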

**Performance wiring:** **`--performance`** plumbs through **`RunIndexOptions.performance`** → **`indexFiles({ performance, collectMs })`**. `parse-worker-core.ts` records per-file **`parseMs`** on each `ParsedFile`; main thread times the four phases (`collect`, `parse`, `insert`, `index_create`) and assembles **`IndexPerformanceReport`** under `IndexRunStats.performance`. Note: `total_ms` is `indexFiles` wall-clock, **not** end-to-end run wall — `collect_ms` happens before `indexFiles` and is reported separately.
13 changes: 11 additions & 2 deletions docs/glossary.md
@@ -12,7 +12,7 @@ Alphabetical, lowercase. Disambiguation pairs link to each other.

- **TS shape** = a TypeScript interface or type alias.
- **SQLite table** = an actual on-disk table in `.codemap.db`.
- **Recipe** = a bundled SQL string in `src/cli/query-recipes.ts`, exposed via `codemap query --recipe <id>`.
- **Recipe** = a cataloged SQL recipe loaded by `src/application/recipes-loader.ts` from `templates/recipes/<id>.{sql,md}` (bundled) or `<projectRoot>/.codemap/recipes/<id>.{sql,md}` (project-local). Exposed via `codemap query --recipe <id>` and the `codemap://recipes` MCP resource. See [§ R recipe](#recipe).
- **Query** = any SQL run against the index (recipe or ad-hoc).

---
@@ -325,7 +325,16 @@ See **recipe**.

### recipe

A bundled SQL string in `src/cli/query-recipes.ts`, identified by id (e.g. `fan-in`, `deprecated-symbols`, `files-hashes`). Run via `codemap query --recipe <id>` (alias `-r`). Distinct from an ad-hoc **query** (which is any SQL string the agent composes itself).
A SQL file (plus optional sibling `.md` description) loaded into the catalog by `src/application/recipes-loader.ts`. Two sources, same shape:

- **Bundled** — ships in the npm package as `templates/recipes/<id>.{sql,md}`. Examples: `fan-in`, `deprecated-symbols`, `files-hashes`.
- **Project-local** — loaded from `<projectRoot>/.codemap/recipes/<id>.{sql,md}` (root-only resolution; not gitignored — meant to be checked in for team review).

Run via `codemap query --recipe <id>` (alias `-r`). Project recipes win on id collision with bundled ones (entries carry `shadows: true` in the catalog so agents reading `codemap://recipes` at session start see when a recipe behaves differently from the documented bundled version). Per-row `actions` templates (kebab-case verb + description) live in YAML frontmatter on each `<id>.md` — uniform between bundled and project. Load-time validation rejects empty SQL and DML / DDL keywords; runtime `PRAGMA query_only=1` (PR #35) is the parser-proof backstop. Distinct from an ad-hoc **query** (any SQL string the agent composes itself; ad-hoc SQL never carries actions).

### `recipe shadows`

Boolean flag on a project-local recipe entry that has the same `id` as a bundled recipe — `shadows: true` means "this project recipe overrides what the bundled version would have done." Surfaces in `--recipes-json`, `codemap://recipes`, and `codemap://recipes/{id}` so agents can see overrides without parsing per-execution responses (per-execution shape stays unchanged for plan § 4 uniformity). Silent at runtime — the agent-facing skill prompt is the channel that tells agents to check the flag at session start.

### research

1 change: 0 additions & 1 deletion docs/roadmap.md
@@ -39,7 +39,6 @@ Codemap stays a structural-index primitive that other tools can consume. Out of
- [ ] **`codemap audit --base <ref>`** (v1.x) — worktree+reindex snapshot strategy. v1 shipped `--baseline <prefix>` / `--<delta>-baseline <name>` (B.6 reuse) — see [`architecture.md` § Audit wiring](./architecture.md#cli-usage). v1.x adds `--base <ref>` for "audit against an arbitrary ref I haven't pre-baselined" (defers worktree spawn + cache decision until a real consumer asks).
- [ ] **`codemap audit` verdict + thresholds** (v1.x) — `verdict: "pass" | "warn" | "fail"` driven by `codemap.config.audit.deltas[<key>].{added_max, action}`. Triggers: two consumers ship `jq`-based threshold scripts with similar shapes, OR one consumer asks with a concrete config sketch. Until then, raw deltas + consumer-side `jq` is the CI exit-code idiom.
- [ ] **`codemap serve` (HTTP API, v1.x)** — same tool taxonomy + output shape as `codemap mcp` (shipped in v1), exposed over `POST /tool/{name}` with loopback default and optional `--token`. Defer until a concrete non-MCP consumer asks; design points are reserved in [`architecture.md` § MCP wiring](./architecture.md#cli-usage) so HTTP inherits them when its turn comes.
- [ ] **Recipes-as-content registry** — pair every bundled recipe in `src/cli/query-recipes.ts` with a sibling `.md` (or YAML frontmatter) describing _when to use, follow-up SQL_; surface in `--recipes-json`. Plus **project-local recipes** loaded from `.codemap/recipes/*.{sql,md}` so teams can ship internal SQL without an adapter API
- [ ] **Targeted-read CLI** — `codemap show <symbol>` / `codemap snippet <name>` returns `file_path:line_start-line_end` + `signature` for one symbol. Same data as `SELECT … FROM symbols WHERE name = ?`, but a one-step CLI keeps agents from composing SQL for trivial precise reads
- [ ] **Watch mode** for dev — `node:fs.watch` recursive + `--files` re-index loop; Linux `recursive` requires Node 19.1+
- [ ] **Monorepo / workspace awareness** — discover workspaces from `pnpm-workspace.yaml` / `package.json` and index per-workspace dependency graphs