diff --git a/.agents/rules/codemap.md b/.agents/rules/codemap.md index e155cfb8..68d8b23b 100644 --- a/.agents/rules/codemap.md +++ b/.agents/rules/codemap.md @@ -29,6 +29,21 @@ A local database (default **`.codemap.db`**) indexes structure: symbols, imports **Recipe `actions`:** with **`--json`**, recipes that define an `actions` template append it to every row (kebab-case verb + description — e.g. `fan-out` → `review-coupling`). Under `--baseline`, actions attach to the **`added`** rows only. Inspect via **`--recipes-json`**. Ad-hoc SQL never carries actions. +**Project-local recipes:** drop `.sql` (and optional `.md` for description + actions) into **`/.codemap/recipes/`** — auto-discovered, runs via `--recipe ` like bundled. Project recipes win on id collision; check `--recipes-json` for **`shadows: true`** entries to know when a project recipe overrides the documented bundled version. `.md` supports YAML frontmatter for the per-row action template — block-list shape only (the loader's hand-rolled parser doesn't accept inline-flow `[{...}]`): + +```markdown +--- +actions: + - type: review-coupling + auto_fixable: false + description: "High fan-out usually means orchestrator role." +--- + +(Markdown body — first non-empty line becomes the catalog description.) +``` + +Validation: SQL is rejected at load time if it starts with DML/DDL (DELETE/DROP/UPDATE/etc.); the runtime `PRAGMA query_only=1` is the parser-proof backstop. + **Baselines** (`query_baselines` table inside `.codemap.db`, no parallel JSON files): `--save-baseline[=]` snapshots a result set; `--baseline[=]` diffs the current result against it (added / removed rows; identity = `JSON.stringify(row)`). Name defaults to the `--recipe` id; ad-hoc SQL needs an explicit `=`. Survives `--full` and SCHEMA bumps. **Audit (`bun src/index.ts audit`)**: structural-drift command; emits `{head, deltas: {files, dependencies, deprecated}}` (each delta carries its own `base` metadata). 
Reuses B.6 baselines as the snapshot source. Two CLI shapes — `--baseline ` auto-resolves `-files` / `-dependencies` / `-deprecated`; `---baseline ` is the explicit per-delta override. v1 ships no `verdict` / threshold config — consumers compose `--json` + `jq` for CI exit codes. Auto-runs an incremental index before the diff (use `--no-index` to skip for frozen-DB CI). diff --git a/.agents/skills/codemap/SKILL.md b/.agents/skills/codemap/SKILL.md index 16006caa..b0ff4cb1 100644 --- a/.agents/skills/codemap/SKILL.md +++ b/.agents/skills/codemap/SKILL.md @@ -45,6 +45,7 @@ Replace placeholders (`'...'`) with your module path, file glob, or symbol name. - **`--baseline[=]`** — diff the current result against the saved baseline. Output `{baseline:{...}, current_row_count, added: [...], removed: [...]}` (with `--json`) or a two-section terminal dump. Identity = per-row multiset equality (canonical `JSON.stringify` keyed frequency map; duplicates preserved). Pair with `--summary` for `{baseline:{...}, current_row_count, added: N, removed: N}`. **Mutually exclusive with `--group-by`.** - **`--baselines`** lists saved baselines (no `rows_json` payload); **`--drop-baseline `** deletes one. Both reject every other flag — they're list-only / drop-only operations. - **Per-row recipe `actions`** — recipes that define an **`actions: [{type, auto_fixable?, description?}]`** template append it to every row in **`--json`** output (recipe-only; ad-hoc SQL never carries actions). Under `--baseline`, actions attach to the **`added`** rows only (the rows the agent should act on). Inspect via **`--recipes-json`**. +- **Project-local recipes** — drop **`.sql`** (and optional **`.md`** for description body + actions) into **`/.codemap/recipes/`** to make team-internal SQL a first-class CLI verb. `--recipes-json` and the `codemap://recipes` MCP resource list project recipes alongside bundled ones with **`source: "bundled" | "project"`** discriminating them. 
Project recipes win on id collision; entries that override a bundled id carry **`shadows: true`** so agents reading the catalog at session start know when a recipe behaves differently from the documented bundled version. `.md` supports YAML frontmatter for the per-row action template — **block-list shape only** (loader's hand-rolled parser; no inline-flow `[{...}]`): `---\nactions:\n - type: my-verb\n auto_fixable: false\n description: "..."\n---`. Validation: SQL is rejected at load time if it starts with DML/DDL (DELETE/DROP/UPDATE/etc.); the runtime `PRAGMA query_only=1` is the parser-proof backstop. `.codemap.db` is gitignored; **`.codemap/recipes/` is NOT** — recipes are git-tracked source code authored for human review. **Audit (`bun src/index.ts audit`)** — separate top-level command for structural-drift verdicts. Composes B.6 baselines into a per-delta `{head, deltas}` envelope; v1 ships `files` / `dependencies` / `deprecated`. Two snapshot-source shapes: @@ -69,8 +70,8 @@ Each emitted delta carries its own `base` metadata so mixed-baseline audits are **Resources (lazy-cached on first `read_resource`; constant for server-process lifetime):** -- **`codemap://recipes`** — full catalog JSON (same as `--recipes-json`). -- **`codemap://recipes/{id}`** — single recipe `{id, description, sql, actions?}`. Replaces `--print-sql `. +- **`codemap://recipes`** — full catalog JSON (same as `--recipes-json`). Each entry carries `source: "bundled" | "project"` and `shadows: true` on project entries that override a bundled recipe id. Read this at session start so you know when a `--recipe foo` call will run a project override instead of the documented bundled version. +- **`codemap://recipes/{id}`** — single recipe `{id, description, body?, sql, actions?, source, shadows?}`. Replaces `--print-sql `. - **`codemap://schema`** — DDL of every table in `.codemap.db` (queried live from `sqlite_schema`). 
- **`codemap://skill`** — full text of bundled `templates/agents/skills/codemap/SKILL.md`. Agents that don't preload the skill at session start can fetch it here. diff --git a/.changeset/recipes-content-registry.md b/.changeset/recipes-content-registry.md new file mode 100644 index 00000000..3df4ba00 --- /dev/null +++ b/.changeset/recipes-content-registry.md @@ -0,0 +1,49 @@ +--- +"@stainless-code/codemap": minor +--- + +feat(recipes): recipes-as-content registry — bundled .md siblings + project-local recipes + +Two complementary capabilities: + +1. **Bundled recipes get richer descriptions.** Every bundled recipe in + `templates/recipes/` is now a `.sql` file paired with an optional + `.md` description body (replaces the inline TypeScript map in + `src/cli/query-recipes.ts`). Per-row `actions` templates live in YAML + frontmatter on the `.md` instead of code. Same surface for end users + (`--recipe ` / `--recipes-json` / `codemap://recipes`); single + storage shape across bundled + project recipes. + +2. **Project-local recipes** — drop `.{sql,md}` files into + `/.codemap/recipes/` to ship team-internal SQL as first- + class recipes. Auto-discovered via `--recipe `, surfaced in + `--recipes-json` and the `codemap://recipes` MCP resource alongside + bundled. Project recipes win on id collision; the catalog entry + carries `shadows: true` on overrides so agents reading the catalog + at session start see when a recipe behaves differently from the + documented bundled version (per-execution response shape stays + unchanged — uniformity contract preserved). + +Catalog entries (`--recipes-json` output, `codemap://recipes` +payload) gain three additive fields: `body` (full Markdown body), +`source` (`"bundled" | "project"`), and `shadows?` (true on +project entries that override a bundled id). Existing consumers +that destructure `{id, description, sql, actions?}` keep working. 
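For consumers of the catalog, the additive fields described above could be read roughly as follows. This is a minimal sketch rather than code from the package: it assumes the `--recipes-json` / `codemap://recipes` payload is a JSON array of entries, and the `CatalogEntry` type and the `findShadowingRecipes` helper are hypothetical names introduced only for illustration.

```typescript
// Hypothetical consumer-side types. Field names follow the changeset text;
// the array-shaped payload is an assumption, not a documented contract.
interface RecipeAction {
  type: string;
  auto_fixable?: boolean;
  description?: string;
}

interface CatalogEntry {
  id: string;
  description?: string;
  body?: string;
  sql: string;
  actions?: RecipeAction[];
  source: "bundled" | "project";
  shadows?: boolean;
}

// Returns the ids of project recipes that override a bundled recipe.
// Comparing against `true` treats an absent flag and `false` the same way.
function findShadowingRecipes(catalogJson: string): string[] {
  const parsed: unknown = JSON.parse(catalogJson);
  if (!Array.isArray(parsed)) {
    throw new Error("expected the catalog to be a JSON array");
  }
  return (parsed as CatalogEntry[])
    .filter((entry) => entry.source === "project" && entry.shadows === true)
    .map((entry) => entry.id);
}

// Made-up sample data, not real catalog output.
const sampleCatalog = JSON.stringify([
  { id: "fan-in", sql: "SELECT 1", source: "bundled" },
  { id: "fan-out", sql: "SELECT 2", source: "project", shadows: true },
  { id: "internal-flaky-tests", sql: "SELECT 3", source: "project" },
]);

// Only fan-out is reported: it is the one project entry flagged as shadowing.
const shadowing = findShadowingRecipes(sampleCatalog);
console.log(shadowing);
```

An agent could run a check like this once at session start and note which `--recipe` ids resolve to a project override, since the per-execution response shape does not carry the flag.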
+ +Validation: load-time lexical scan rejects DML / DDL keywords +(`INSERT` / `UPDATE` / `DELETE` / `DROP` / `CREATE` / `ALTER` / +`ATTACH` / `DETACH` / `REPLACE` / `TRUNCATE` / `VACUUM` / `PRAGMA`) +in recipe SQL with recipe-aware error messages — defence in depth +alongside the runtime `PRAGMA query_only=1` backstop in +`query-engine.ts` shipped in the previous release. + +Implementation: pure transport-agnostic loader in +`src/application/recipes-loader.ts`; thin shim in +`src/cli/query-recipes.ts` preserves backwards-compat exports +(`QUERY_RECIPES`, `getQueryRecipeSql`, etc.). Hand-rolled YAML +frontmatter parser scoped to the `actions` shape (no `js-yaml` +dependency). + +`.codemap.db` is gitignored as before; `.codemap/recipes/` is NOT +(verified via `git check-ignore`) — recipes are git-tracked source +code authored for human review. diff --git a/README.md b/README.md index fe43c409..483b3178 100644 --- a/README.md +++ b/README.md @@ -110,6 +110,14 @@ codemap query --recipes-json codemap query --print-sql fan-out # `components-by-hooks` ranks by hook count without SQLite JSON1 (comma-based count on the stored JSON array). 
+# Project-local recipes — drop SQL files into .codemap/recipes/ to make them discoverable across the team +# Bundled recipes live in templates/recipes/ in the npm package; project recipes win on id collision +# (shadowing is signalled via a `shadows: true` field in --recipes-json so agents notice the override) +mkdir -p .codemap/recipes +echo "SELECT path FROM files WHERE language IN ('ts', 'tsx') AND line_count > 500" \ + > .codemap/recipes/big-ts-files.sql +codemap query --recipe big-ts-files # auto-discovered alongside bundled + # MCP server (Model Context Protocol) — for agent hosts (Claude Code, Cursor, Codex, generic MCP clients) codemap mcp # JSON-RPC on stdio; one tool per CLI verb plus query_batch # Tools: query, query_batch (MCP-only — N statements in one round-trip), query_recipe, audit, diff --git a/docs/architecture.md b/docs/architecture.md index 8689b727..f059bb51 100644 --- a/docs/architecture.md +++ b/docs/architecture.md @@ -125,6 +125,8 @@ A local SQLite database (`.codemap.db`) indexes the project tree and stores stru **Context wiring:** **`src/cli/cmd-context.ts`** — **`buildContextEnvelope`** composes the JSON envelope from existing recipes (`fan-in` for `hubs`, `markers` SELECT for `sample_markers`, `QUERY_RECIPES` map for the catalog). **`classifyIntent`** maps `--for ""` to one of `refactor | debug | test | feature | explore | other` via regex against the trimmed input; whitespace-only intents are rejected. `--compact` drops `hubs` + `sample_markers` and emits one-line JSON; otherwise pretty-prints with 2-space indent. +**Recipes wiring:** **`src/application/recipes-loader.ts`** (pure transport-agnostic loader) + **`src/cli/query-recipes.ts`** (shim — caches the loader output, exposes `getQueryRecipeSql` / `getQueryRecipeActions` / `listQueryRecipeIds` / `listQueryRecipeCatalog` / `getQueryRecipeCatalogEntry`). Recipes live as file pairs: **`.sql`** + optional **`.md`**. 
The loader reads `templates/recipes/` (bundled, ships in npm package next to `templates/agents/`) and `/.codemap/recipes/` (project-local — root-only resolution per the registry plan, no walk-up). Project recipes win on id collision; entries that override a bundled id carry **`shadows: true`** in the catalog so agents reading `codemap://recipes` at session start see when a recipe behaves differently from the documented bundled version. Per-row **`actions`** templates (kebab-case verb + description) live in YAML frontmatter on each `.md` — uniform shape across bundled + project. Hand-rolled YAML parser scoped to `actions: [{type, auto_fixable?, description?}]` only (no `js-yaml` dep). Load-time validation rejects empty SQL and DML / DDL keywords (`INSERT` / `UPDATE` / `DELETE` / `DROP` / `CREATE` / `ALTER` / `ATTACH` / `DETACH` / `REPLACE` / `TRUNCATE` / `VACUUM` / `PRAGMA`) with recipe-aware error messages — defence in depth alongside the runtime `PRAGMA query_only=1` backstop in `query-engine.ts` (PR #35). `.codemap.db` is gitignored; `.codemap/recipes/` is NOT (verified via `git check-ignore`) — recipes are git-tracked source code authored for human review. + **MCP wiring:** **`src/cli/cmd-mcp.ts`** (argv — `--help` only; bootstrap absorbs `--root`/`--config`) + **`src/application/mcp-server.ts`** (engine — tool registry, resource handlers, response composition). Mirrors the `cmd-audit.ts ↔ audit-engine.ts` seam — CLI parses + lifecycle; engine owns the SDK. **`runMcpServer`** bootstraps codemap once at server boot (config + resolver + DB access become module-level state), instantiates `McpServer` from **`@modelcontextprotocol/sdk`**, attaches a **`StdioServerTransport`**, and resolves when stdin closes (clean shutdown). 
Tool handlers reuse the existing engine entry-points: **`query`** + **`query_recipe`** call **`executeQuery`** in **`src/application/query-engine.ts`** (a pure transport-agnostic engine extracted from `printQueryResult`'s JSON branch — same `[...rows]` / `{count}` / `{group_by, groups}` envelope `--json` would print); **`query_batch`** loops via **`executeQueryBatch`** with batch-wide-defaults + per-statement-overrides (items are `string | {sql, summary?, changed_since?, group_by?}`); **`audit`** runs `resolveAuditBaselines` + `runAudit` from PR #33 unchanged; **`context`** / **`validate`** call `buildContextEnvelope` / `computeValidateRows` (pure functions in `src/cli/cmd-*.ts` — same layer-reversal allowance as `query-recipes`). **`save_baseline`** is one polymorphic tool (`{name, sql? | recipe?}`) with a runtime exclusivity check — mirrors the CLI's single `--save-baseline=` verb. **Tool naming**: snake_case throughout — Codemap convention matching the patterns in MCP spec examples and reference servers (GitHub MCP, Cursor built-ins); the spec itself doesn't mandate it. CLI stays kebab — translation lives at the MCP-arg layer. **Resources** (`codemap://recipes`, `codemap://recipes/{id}`, `codemap://schema`, `codemap://skill`) use **lazy memoisation** — first `read_resource` populates a per-server-instance cache; constant for the server-process lifetime so eager-vs-lazy produce identical observable behavior. `codemap://schema` queries `sqlite_schema` live; `codemap://skill` reads from `resolveAgentsTemplateDir() + skills/codemap/SKILL.md`. Output shape uniformity (plan § 4): every tool returns the JSON envelope its CLI counterpart's `--json` flag prints, surfaced via `content: [{type: "text", text: JSON.stringify(payload)}]`. `--changed-since` git lookups are memoised per `(root, ref)` pair across batch items so a `query_batch` of N items sharing the same ref does one git invocation, not N. 
Per-statement errors in `query_batch` are isolated — failed statements return `{error}` in their slot while siblings still execute. **Performance wiring:** **`--performance`** plumbs through **`RunIndexOptions.performance`** → **`indexFiles({ performance, collectMs })`**. `parse-worker-core.ts` records per-file **`parseMs`** on each `ParsedFile`; main thread times the four phases (`collect`, `parse`, `insert`, `index_create`) and assembles **`IndexPerformanceReport`** under `IndexRunStats.performance`. Note: `total_ms` is `indexFiles` wall-clock, **not** end-to-end run wall — `collect_ms` happens before `indexFiles` and is reported separately. diff --git a/docs/glossary.md b/docs/glossary.md index df9baff2..3f650ef9 100644 --- a/docs/glossary.md +++ b/docs/glossary.md @@ -12,7 +12,7 @@ Alphabetical, lowercase. Disambiguation pairs link to each other. - **TS shape** = a TypeScript interface or type alias. - **SQLite table** = an actual on-disk table in `.codemap.db`. -- **Recipe** = a bundled SQL string in `src/cli/query-recipes.ts`, exposed via `codemap query --recipe `. +- **Recipe** = a cataloged SQL recipe loaded by `src/application/recipes-loader.ts` from `templates/recipes/.{sql,md}` (bundled) or `/.codemap/recipes/.{sql,md}` (project-local). Exposed via `codemap query --recipe ` and the `codemap://recipes` MCP resource. See [§ R recipe](#recipe). - **Query** = any SQL run against the index (recipe or ad-hoc). --- @@ -325,7 +325,16 @@ See **recipe**. ### recipe -A bundled SQL string in `src/cli/query-recipes.ts`, identified by id (e.g. `fan-in`, `deprecated-symbols`, `files-hashes`). Run via `codemap query --recipe ` (alias `-r`). Distinct from an ad-hoc **query** (which is any SQL string the agent composes itself). +A SQL file (plus optional sibling `.md` description) loaded into the catalog by `src/application/recipes-loader.ts`. Two sources, same shape: + +- **Bundled** — ships in the npm package as `templates/recipes/.{sql,md}`. 
Examples: `fan-in`, `deprecated-symbols`, `files-hashes`. +- **Project-local** — loaded from `/.codemap/recipes/.{sql,md}` (root-only resolution; not gitignored — meant to be checked in for team review). + +Run via `codemap query --recipe ` (alias `-r`). Project recipes win on id collision with bundled ones (entries carry `shadows: true` in the catalog so agents reading `codemap://recipes` at session start see when a recipe behaves differently from the documented bundled version). Per-row `actions` templates (kebab-case verb + description) live in YAML frontmatter on each `.md` — uniform between bundled and project. Load-time validation rejects empty SQL and DML / DDL keywords; runtime `PRAGMA query_only=1` (PR #35) is the parser-proof backstop. Distinct from an ad-hoc **query** (any SQL string the agent composes itself; ad-hoc SQL never carries actions). + +### `recipe shadows` + +Boolean flag on a project-local recipe entry that has the same `id` as a bundled recipe — `shadows: true` means "this project recipe overrides what the bundled version would have done." Surfaces in `--recipes-json`, `codemap://recipes`, and `codemap://recipes/{id}` so agents can see overrides without parsing per-execution responses (per-execution shape stays unchanged for plan § 4 uniformity). Silent at runtime — the agent-facing skill prompt is the channel that tells agents to check the flag at session start. ### research diff --git a/docs/roadmap.md b/docs/roadmap.md index 034e5874..f4911311 100644 --- a/docs/roadmap.md +++ b/docs/roadmap.md @@ -39,7 +39,6 @@ Codemap stays a structural-index primitive that other tools can consume. Out of - [ ] **`codemap audit --base `** (v1.x) — worktree+reindex snapshot strategy. v1 shipped `--baseline ` / `---baseline ` (B.6 reuse) — see [`architecture.md` § Audit wiring](./architecture.md#cli-usage). v1.x adds `--base ` for "audit against an arbitrary ref I haven't pre-baselined" (defers worktree spawn + cache decision until a real consumer asks). 
- [ ] **`codemap audit` verdict + thresholds** (v1.x) — `verdict: "pass" | "warn" | "fail"` driven by `codemap.config.audit.deltas[].{added_max, action}`. Triggers: two consumers ship `jq`-based threshold scripts with similar shapes, OR one consumer asks with a concrete config sketch. Until then, raw deltas + consumer-side `jq` is the CI exit-code idiom. - [ ] **`codemap serve` (HTTP API, v1.x)** — same tool taxonomy + output shape as `codemap mcp` (shipped in v1), exposed over `POST /tool/{name}` with loopback default and optional `--token`. Defer until a concrete non-MCP consumer asks; design points are reserved in [`architecture.md` § MCP wiring](./architecture.md#cli-usage) so HTTP inherits them when its turn comes. -- [ ] **Recipes-as-content registry** — pair every bundled recipe in `src/cli/query-recipes.ts` with a sibling `.md` (or YAML frontmatter) describing _when to use, follow-up SQL_; surface in `--recipes-json`. Plus **project-local recipes** loaded from `.codemap/recipes/*.{sql,md}` so teams can ship internal SQL without an adapter API - [ ] **Targeted-read CLI** — `codemap show ` / `codemap snippet ` returns `file_path:line_start-line_end` + `signature` for one symbol. 
Same data as `SELECT … FROM symbols WHERE name = ?`, but a one-step CLI keeps agents from composing SQL for trivial precise reads - [ ] **Watch mode** for dev — `node:fs.watch` recursive + `--files` re-index loop; Linux `recursive` requires Node 19.1+ - [ ] **Monorepo / workspace awareness** — discover workspaces from `pnpm-workspace.yaml` / `package.json` and index per-workspace dependency graphs diff --git a/src/application/mcp-server.ts b/src/application/mcp-server.ts index 418de73a..402425ea 100644 --- a/src/application/mcp-server.ts +++ b/src/application/mcp-server.ts @@ -20,10 +20,9 @@ import { buildContextEnvelope } from "../cli/cmd-context"; import { computeValidateRows } from "../cli/cmd-validate"; import { getQueryRecipeActions, + getQueryRecipeCatalogEntry, getQueryRecipeSql, listQueryRecipeCatalog, - listQueryRecipeIds, - QUERY_RECIPES, } from "../cli/query-recipes"; import { loadUserConfig, resolveCodemapConfig } from "../config"; import { @@ -605,23 +604,26 @@ function registerResources(server: McpServer): void { }, ); - // codemap://recipes/{id} — one recipe (template form) + // codemap://recipes/{id} — one recipe (template form). Per Tracer 4 the + // payload includes `body` / `source` / `shadows` from the catalog entry — + // session-start agents check `shadows` to know when a project recipe + // overrides the documented bundled version. const oneRecipeCache = new Map(); server.registerResource( "recipe", new ResourceTemplate("codemap://recipes/{id}", { list: () => ({ - resources: listQueryRecipeIds().map((id) => ({ - uri: `codemap://recipes/${id}`, - name: id, - description: QUERY_RECIPES[id]!.description, + resources: listQueryRecipeCatalog().map((entry) => ({ + uri: `codemap://recipes/${entry.id}`, + name: entry.id, + description: entry.description, mimeType: "application/json", })), }), }), { description: - "Single recipe by id: {id, description, sql, actions?}. 
Replaces `codemap query --print-sql ` for agents.", + "Single recipe by id: {id, description, body?, sql, actions?, source, shadows?}. Replaces `codemap query --print-sql ` for agents; carries provenance fields so agents see when a project-local recipe overrides a bundled one.", mimeType: "application/json", }, (uri, variables) => { @@ -635,20 +637,15 @@ function registerResources(server: McpServer): void { ], }; } - const meta = QUERY_RECIPES[id]; - if (meta === undefined) { + const entry = getQueryRecipeCatalogEntry(id); + if (entry === undefined) { // Resources can't return structured errors the way tools do; throw so // the SDK surfaces a JSON-RPC error to the host. throw new Error( `codemap: unknown recipe "${id}". Read codemap://recipes for the catalog.`, ); } - const payload = JSON.stringify({ - id, - description: meta.description, - sql: meta.sql, - ...(meta.actions !== undefined ? { actions: meta.actions } : {}), - }); + const payload = JSON.stringify(entry); oneRecipeCache.set(id, payload); return { contents: [ diff --git a/src/application/recipes-loader.test.ts b/src/application/recipes-loader.test.ts new file mode 100644 index 00000000..c3be2ef1 --- /dev/null +++ b/src/application/recipes-loader.test.ts @@ -0,0 +1,446 @@ +import { afterEach, beforeEach, describe, expect, it } from "bun:test"; +import { mkdirSync, mkdtempSync, rmSync, writeFileSync } from "node:fs"; +import { tmpdir } from "node:os"; +import { join } from "node:path"; + +import { + extractFrontmatterAndBody, + loadAllRecipes, + mergeRecipes, + readRecipesFromDir, + validateRecipeSql, +} from "./recipes-loader"; +import type { LoadedRecipe } from "./recipes-loader"; + +let workDir: string; + +beforeEach(() => { + workDir = mkdtempSync(join(tmpdir(), "recipes-loader-")); +}); + +afterEach(() => { + rmSync(workDir, { recursive: true, force: true }); +}); + +function makeRecipeDir(name: string): string { + const dir = join(workDir, name); + mkdirSync(dir, { recursive: true }); + return dir; 
+} + +describe("readRecipesFromDir", () => { + it("returns [] when directory doesn't exist (project-recipes case)", () => { + expect(readRecipesFromDir(join(workDir, "missing"), "project")).toEqual([]); + }); + + it("ignores non-.sql files", () => { + const dir = makeRecipeDir("ignore-noise"); + writeFileSync(join(dir, "fan-out.sql"), "SELECT 1\n"); + writeFileSync(join(dir, "README.md"), "# unrelated\n"); + writeFileSync(join(dir, ".DS_Store"), ""); + const r = readRecipesFromDir(dir, "bundled"); + expect(r.map((x) => x.id)).toEqual(["fan-out"]); + }); + + it("loads SQL only — no sibling .md → description/body/actions undefined", () => { + const dir = makeRecipeDir("sql-only"); + writeFileSync(join(dir, "fan-out.sql"), "SELECT 1\n"); + const r = readRecipesFromDir(dir, "bundled"); + expect(r).toHaveLength(1); + const recipe = r[0]!; + expect(recipe).toMatchObject({ + id: "fan-out", + sql: "SELECT 1\n", + description: undefined, + body: undefined, + actions: undefined, + source: "bundled", + shadows: false, + }); + }); + + it("pairs sibling .md — description = first non-empty line, body = full text", () => { + const dir = makeRecipeDir("with-md"); + writeFileSync(join(dir, "fan-out.sql"), "SELECT 1\n"); + writeFileSync( + join(dir, "fan-out.md"), + "# Fan-out\n\nWhen to use: …\n\nFollow-up SQL: …\n", + ); + const r = readRecipesFromDir(dir, "bundled"); + expect(r[0]!.description).toBe("Fan-out"); + expect(r[0]!.body).toContain("When to use"); + }); + + it("description strips leading `# ` heading marker", () => { + const dir = makeRecipeDir("md-headers"); + writeFileSync(join(dir, "x.sql"), "SELECT 1\n"); + writeFileSync(join(dir, "x.md"), "## Heading two\n\ncontent\n"); + expect(readRecipesFromDir(dir, "bundled")[0]!.description).toBe( + "Heading two", + ); + }); + + it("returns recipes sorted by id (deterministic order)", () => { + const dir = makeRecipeDir("ordering"); + writeFileSync(join(dir, "zebra.sql"), "SELECT 1\n"); + writeFileSync(join(dir, "alpha.sql"), 
"SELECT 2\n"); + writeFileSync(join(dir, "monkey.sql"), "SELECT 3\n"); + const r = readRecipesFromDir(dir, "project"); + expect(r.map((x) => x.id)).toEqual(["alpha", "monkey", "zebra"]); + }); + + it("throws on empty SQL (just whitespace + comments)", () => { + const dir = makeRecipeDir("empty"); + writeFileSync( + join(dir, "blank.sql"), + "-- this is just a comment\n \n-- and another\n", + ); + expect(() => readRecipesFromDir(dir, "project")).toThrow(/empty/); + }); + + it("counts SQL with content as non-empty even with leading comments", () => { + const dir = makeRecipeDir("comments-then-sql"); + writeFileSync( + join(dir, "x.sql"), + "-- doc comment line\nSELECT path FROM files\n", + ); + expect(readRecipesFromDir(dir, "bundled")).toHaveLength(1); + }); + + it("returns [] for a non-directory path (not an error)", () => { + const filePath = join(workDir, "actually-a-file.txt"); + writeFileSync(filePath, ""); + expect(readRecipesFromDir(filePath, "bundled")).toEqual([]); + }); +}); + +describe("mergeRecipes", () => { + function recipe(id: string, source: LoadedRecipe["source"]): LoadedRecipe { + return { + id, + sql: `SELECT '${id}'`, + description: undefined, + body: undefined, + actions: undefined, + source, + shadows: false, + }; + } + + it("project-only — no shadows, no merging", () => { + const r = mergeRecipes( + [], + [recipe("a", "project"), recipe("b", "project")], + ); + expect(r.map((x) => `${x.id}:${x.source}:${x.shadows}`)).toEqual([ + "a:project:false", + "b:project:false", + ]); + }); + + it("bundled-only — passes through, sorted by id", () => { + const r = mergeRecipes( + [recipe("zebra", "bundled"), recipe("alpha", "bundled")], + [], + ); + expect(r.map((x) => x.id)).toEqual(["alpha", "zebra"]); + }); + + it("project shadows bundled — project wins, shadows: true", () => { + const r = mergeRecipes( + [recipe("fan-out", "bundled"), recipe("fan-in", "bundled")], + [recipe("fan-out", "project")], + ); + const fanOut = r.find((x) => x.id === 
"fan-out")!; + expect(fanOut.source).toBe("project"); + expect(fanOut.shadows).toBe(true); + // bundled fan-out is filtered out — only one entry per id. + expect(r.filter((x) => x.id === "fan-out")).toHaveLength(1); + // unrelated bundled recipe still present. + const fanIn = r.find((x) => x.id === "fan-in")!; + expect(fanIn.source).toBe("bundled"); + expect(fanIn.shadows).toBe(false); + }); + + it("project recipe with no bundled match — shadows: false", () => { + const r = mergeRecipes( + [recipe("fan-out", "bundled")], + [recipe("internal-flaky-tests", "project")], + ); + const internal = r.find((x) => x.id === "internal-flaky-tests")!; + expect(internal.shadows).toBe(false); + }); +}); + +describe("loadAllRecipes — bundled + project composition", () => { + it("loads bundled-only when projectDir is undefined", () => { + const dir = makeRecipeDir("bundled-only"); + writeFileSync(join(dir, "fan-out.sql"), "SELECT 1\n"); + const r = loadAllRecipes({ bundledDir: dir, projectDir: undefined }); + expect(r).toHaveLength(1); + expect(r[0]!.source).toBe("bundled"); + }); + + it("loads bundled + project, sorted by id", () => { + const bundledDir = makeRecipeDir("bundled"); + const projectDir = makeRecipeDir("project"); + writeFileSync(join(bundledDir, "fan-out.sql"), "SELECT 1\n"); + writeFileSync( + join(projectDir, "internal-flaky-tests.sql"), + "SELECT path FROM files\n", + ); + const r = loadAllRecipes({ bundledDir, projectDir }); + expect(r.map((x) => `${x.id}:${x.source}`)).toEqual([ + "fan-out:bundled", + "internal-flaky-tests:project", + ]); + }); + + it("project recipe shadows bundled with same id (project wins, shadows: true)", () => { + const bundledDir = makeRecipeDir("bundled-shadowed"); + const projectDir = makeRecipeDir("project-shadowing"); + writeFileSync(join(bundledDir, "fan-out.sql"), "SELECT 1\n"); + writeFileSync( + join(projectDir, "fan-out.sql"), + "SELECT 'project version'\n", + ); + const r = loadAllRecipes({ bundledDir, projectDir }); + 
expect(r).toHaveLength(1); + const recipe = r[0]!; + expect(recipe.source).toBe("project"); + expect(recipe.shadows).toBe(true); + expect(recipe.sql).toContain("project version"); + }); + + it("missing .codemap/recipes/ directory is not an error", () => { + const bundledDir = makeRecipeDir("bundled"); + writeFileSync(join(bundledDir, "x.sql"), "SELECT 1\n"); + const r = loadAllRecipes({ + bundledDir, + projectDir: join(workDir, "does-not-exist"), + }); + expect(r).toHaveLength(1); + expect(r[0]!.source).toBe("bundled"); + }); +}); + +describe("validateRecipeSql — load-time DML/DDL deny-list", () => { + it("accepts SELECT (the common case)", () => { + expect(() => + validateRecipeSql("ok", "/tmp/ok.sql", "SELECT 1\n"), + ).not.toThrow(); + }); + + it("accepts WITH-prefixed CTEs", () => { + expect(() => + validateRecipeSql( + "cte", + "/tmp/cte.sql", + "WITH x AS (SELECT 1) SELECT * FROM x\n", + ), + ).not.toThrow(); + }); + + it("rejects DELETE with recipe-aware error", () => { + expect(() => + validateRecipeSql("bad", "/tmp/bad.sql", "DELETE FROM files\n"), + ).toThrow(/recipes must be read-only/); + }); + + for (const verb of [ + "INSERT", + "UPDATE", + "DROP", + "CREATE", + "ALTER", + "ATTACH", + "DETACH", + "REPLACE", + "TRUNCATE", + "VACUUM", + "PRAGMA", + ]) { + it(`rejects ${verb} at load time`, () => { + expect(() => + validateRecipeSql( + "bad", + "/tmp/bad.sql", + `${verb} something arbitrary\n`, + ), + ).toThrow(/read-only/); + }); + } + + it("ignores leading -- comments before the keyword", () => { + expect(() => + validateRecipeSql( + "ok", + "/tmp/ok.sql", + "-- doc line\n-- another doc\nSELECT 1\n", + ), + ).not.toThrow(); + }); + + it("rejects lowercase deny-list keywords (case-insensitive)", () => { + expect(() => + validateRecipeSql("bad", "/tmp/bad.sql", "drop table x\n"), + ).toThrow(/read-only/); + }); + + it("strips block /* */ comments before deciding the first keyword", () => { + // Without block-comment stripping, this would mis-detect 
'INSERT' from the + // comment text and reject a legitimate SELECT recipe. + expect(() => + validateRecipeSql( + "ok", + "/tmp/ok.sql", + "/* notes about INSERT semantics — see issue #42 */\nSELECT 1\n", + ), + ).not.toThrow(); + }); + + it("rejects DELETE smuggled after a leading block comment (defence in depth)", () => { + // A bare `/* SELECT */ DELETE FROM x` would have slipped past a + // comment-blind first-keyword scan; block-comment stripping makes the + // deny-list see the real first keyword. + expect(() => + validateRecipeSql( + "bad", + "/tmp/bad.sql", + "/* SELECT */ DELETE FROM files\n", + ), + ).toThrow(/read-only/); + }); + + it("rejects pure-block-comment files as empty (no SQL after stripping)", () => { + expect(() => + validateRecipeSql( + "blank", + "/tmp/blank.sql", + "/* placeholder, no SQL yet */\n", + ), + ).toThrow(/empty/); + }); +}); + +describe("extractFrontmatterAndBody — YAML actions parser", () => { + it("returns body as full text when no frontmatter delimiter present", () => { + const md = "Just some plain markdown.\n"; + const r = extractFrontmatterAndBody(md); + expect(r.actions).toBeUndefined(); + expect(r.body).toBe(md); + }); + + it("parses a single action with type only", () => { + const md = `--- +actions: + - type: review-coupling +--- +Body line one +Body line two +`; + const r = extractFrontmatterAndBody(md); + expect(r.actions).toEqual([{ type: "review-coupling" }]); + expect(r.body.startsWith("Body line one")).toBe(true); + }); + + it("parses action with type + description (double-quoted)", () => { + const md = `--- +actions: + - type: split-barrel + description: "Confirm intent before splitting." +--- +body +`; + const r = extractFrontmatterAndBody(md); + expect(r.actions).toEqual([ + { type: "split-barrel", description: "Confirm intent before splitting." 
}, + ]); + }); + + it("parses action with auto_fixable: true (boolean scalar)", () => { + const md = `--- +actions: + - type: delete-file + auto_fixable: true + description: bare unquoted text is fine +--- +body +`; + const r = extractFrontmatterAndBody(md); + expect(r.actions).toEqual([ + { + type: "delete-file", + auto_fixable: true, + description: "bare unquoted text is fine", + }, + ]); + }); + + it("parses multiple action items", () => { + const md = `--- +actions: + - type: a + - type: b + description: second +--- +body +`; + const r = extractFrontmatterAndBody(md); + expect(r.actions).toEqual([ + { type: "a" }, + { type: "b", description: "second" }, + ]); + }); + + it("returns undefined actions when no actions key in frontmatter", () => { + const md = `--- +some_other_key: value +--- +body +`; + const r = extractFrontmatterAndBody(md); + expect(r.actions).toBeUndefined(); + expect(r.body.startsWith("body")).toBe(true); + }); + + it("treats malformed frontmatter (no closing ---) as no frontmatter", () => { + const md = `--- +actions: + - type: foo +this never closes +`; + const r = extractFrontmatterAndBody(md); + expect(r.actions).toBeUndefined(); + expect(r.body).toBe(md); + }); +}); + +describe("readRecipesFromDir — frontmatter integration", () => { + it("populates actions from sibling .md frontmatter", () => { + const dir = makeRecipeDir("with-frontmatter"); + writeFileSync(join(dir, "fan-out.sql"), "SELECT 1\n"); + writeFileSync( + join(dir, "fan-out.md"), + `--- +actions: + - type: review-coupling + description: "High fan-out usually means orchestrator role." 
+--- + +Top 10 files by dependency fan-out (edge count) +`, + ); + const r = readRecipesFromDir(dir, "bundled"); + expect(r).toHaveLength(1); + expect(r[0]!.actions).toEqual([ + { + type: "review-coupling", + description: "High fan-out usually means orchestrator role.", + }, + ]); + expect(r[0]!.description).toBe( + "Top 10 files by dependency fan-out (edge count)", + ); + }); +}); diff --git a/src/application/recipes-loader.ts b/src/application/recipes-loader.ts new file mode 100644 index 00000000..484cb0bc --- /dev/null +++ b/src/application/recipes-loader.ts @@ -0,0 +1,359 @@ +import { existsSync, readdirSync, readFileSync, statSync } from "node:fs"; +import { join } from "node:path"; + +/** + * One agent-facing follow-up suggested for every row of a recipe's result. + * Recipe authors hand-write this alongside the SQL (predictable: every row gets + * the same template). Ad-hoc SQL never carries actions — recipe-only feature. + * + * `auto_fixable` defaults to `false` when omitted. `description` is human prose + * for the agent to surface; `type` is a stable kebab-case verb the agent can + * key off (`delete-file`, `split-barrel`, `flag-caller`, …). + */ +export interface RecipeAction { + type: string; + auto_fixable?: boolean; + description?: string; +} + +/** + * One loaded recipe — the canonical shape the loader returns. Bundled and + * project recipes share this shape; `source` discriminates them. `shadows` + * is true when a project recipe overrides a bundled recipe of the same id + * (see plan §9 Q-E — agents read this at session start to know when a + * recipe behaves differently from the documented bundled version). + */ +export interface LoadedRecipe { + id: string; + sql: string; + description: string | undefined; + body: string | undefined; + actions: RecipeAction[] | undefined; + source: "bundled" | "project"; + shadows: boolean; +} + +export interface LoadRecipesOpts { + /** + * Absolute path to the directory containing bundled recipe `.sql` files. 
+ * Resolved by the caller via `resolveBundledRecipesDir()` (npm package + * layout — `templates/recipes/` next to `templates/agents/`). + */ + bundledDir: string; + /** + * Absolute path to the project's `.codemap/recipes/` directory, or + * `undefined` if it doesn't exist. When defined, `loadAllRecipes` + * reads project recipes from it. + */ + projectDir: string | undefined; +} + +/** + * Eager loader — reads every `.sql` from `bundledDir` (and `projectDir` + * when defined), pairs each with optional `.md`, applies + * load-time validation (non-empty SQL after stripping comments; + * lexical DML/DDL deny-list — Tracer 5), and returns the merged list. + * + * Project recipes win on id collision (`shadows: true` flag; see plan + * §9 Q-E). Per plan §9 Q-B (eager startup load), this is called on + * first registry access in `cli/query-recipes.ts`'s shim layer; the result + * is module-cached for the process lifetime. + */ +export function loadAllRecipes(opts: LoadRecipesOpts): LoadedRecipe[] { + const bundled = readRecipesFromDir(opts.bundledDir, "bundled"); + const project = + opts.projectDir !== undefined + ? readRecipesFromDir(opts.projectDir, "project") + : []; + return mergeRecipes(bundled, project); +} + +/** + * Project recipes win on id collision; matching bundled entries are filtered + * out and the project entry's `shadows` flag is flipped to `true`. Order: + * the merged list (project and bundled together) is sorted by id, so the + * catalog surface stays deterministic regardless of directory listing order.
+ */ +export function mergeRecipes( + bundled: LoadedRecipe[], + project: LoadedRecipe[], +): LoadedRecipe[] { + const projectIds = new Set(project.map((r) => r.id)); + const flaggedProject = project.map((r) => ({ + ...r, + shadows: projectIds.has(r.id) && bundled.some((b) => b.id === r.id), + })); + const filteredBundled = bundled.filter((r) => !projectIds.has(r.id)); + return [...flaggedProject, ...filteredBundled].sort((a, b) => + a.id.localeCompare(b.id), + ); +} + +/** + * Read every `.sql` from `dir`, pair with optional `.md`. Returns + * `[]` if the directory doesn't exist (project-recipes case — absence of + * `.codemap/recipes/` is not an error). Throws with recipe-aware error + * messages if a `.sql` fails load-time validation (empty after + * comment-stripping, or starts with a DML / DDL keyword). + * + * The runtime `PRAGMA query_only=1` backstop in `executeQuery` (PR #35) + * stays as the parser-proof safety net for anything this lexical scan + * can't catch (multi-statement payloads, `WITH foo AS (DELETE …) SELECT` + * sub-queries, attached databases). Different jobs: lexical = good UX + * for common mistakes; backstop = correctness no matter what. + */ +export function readRecipesFromDir( + dir: string, + source: "bundled" | "project", +): LoadedRecipe[] { + if (!existsSync(dir)) return []; + const stat = statSync(dir); + if (!stat.isDirectory()) return []; + + const entries = readdirSync(dir); + const recipes: LoadedRecipe[] = []; + + for (const entry of entries) { + if (!entry.endsWith(".sql")) continue; + const id = entry.slice(0, -".sql".length); + if (id.length === 0) continue; + const sqlPath = join(dir, entry); + const sql = readFileSync(sqlPath, "utf8"); + validateRecipeSql(id, sqlPath, sql); + + const mdPath = join(dir, `${id}.md`); + const md = existsSync(mdPath) ? readFileSync(mdPath, "utf8") : undefined; + const { actions, body } = + md !== undefined + ? 
extractFrontmatterAndBody(md) + : { actions: undefined, body: undefined }; + const description = + body !== undefined ? firstNonEmptyLine(body) : undefined; + + recipes.push({ + id, + sql, + description, + body, + actions, + source, + shadows: false, + }); + } + + return recipes.sort((a, b) => a.id.localeCompare(b.id)); +} + +/** + * Throws with a recipe-aware message if `sql` is empty (after stripping + * block and `--` line comments) or starts with a DML / DDL keyword. Caller + * keeps the path for the error message; the parser-proof runtime backstop in + * `executeQuery` is the safety net beyond this. + */ +export function validateRecipeSql( + id: string, + sqlPath: string, + sql: string, +): void { + if (isEffectivelyEmpty(sql)) { + throw new Error( + `Recipe "${id}" at ${sqlPath} is empty (no SQL after stripping comments and whitespace).`, + ); + } + const firstKeyword = firstSqlKeyword(sql); + if (firstKeyword !== undefined && DML_DDL_DENY.has(firstKeyword)) { + throw new Error( + `Recipe "${id}" at ${sqlPath} starts with "${firstKeyword}" — recipes must be read-only. Use \`codemap query --save-baseline\` for capturing rows; the runtime PRAGMA query_only=1 guard would also reject this at execution time.`, + ); + } +} + +const DML_DDL_DENY = new Set([ + "INSERT", + "UPDATE", + "DELETE", + "DROP", + "CREATE", + "ALTER", + "ATTACH", + "DETACH", + "REPLACE", + "TRUNCATE", + "VACUUM", + "PRAGMA", +]); + +/** + * First identifier-shaped token in `sql` after stripping block comments, + * `--` line comments, and leading whitespace. Returns the upper-cased keyword + * (SQLite is case-insensitive for keywords) or `undefined` if no token + * exists. Doesn't try to be clever about comment markers inside string + * literals (those are rare in recipes; the runtime backstop catches what + * slips by). + */ +function firstSqlKeyword(sql: string): string | undefined { + const stripped = stripLineComments(sql); + const match = stripped.match(/[A-Za-z_][A-Za-z0-9_]*/); + return match === null ? 
undefined : match[0].toUpperCase(); +} + +function stripLineComments(sql: string): string { + // Strip block comments first so that a leading `/* SELECT */ DELETE FROM x` + // can't smuggle a deny-listed keyword past the first-keyword scan, so that a + // deny-listed word inside a comment doesn't reject a legitimate SELECT, and so + // that pure-comment recipes (block-comment only, no actual SQL) trip the + // empty-recipe check. + // Lazy, non-overlapping match; doesn't try to track nested comments + // (SQLite doesn't support them) or comment markers inside string literals + // (recipes mixing comments with string literals are vanishingly rare and the + // runtime PRAGMA query_only=1 backstop catches anything that slips by). + const noBlock = sql.replace(/\/\*[\s\S]*?\*\//g, ""); + return noBlock + .split("\n") + .map((line) => { + const idx = line.indexOf("--"); + return idx === -1 ? line : line.slice(0, idx); + }) + .join("\n"); +} + +/** + * Strip block and `--` line comments plus surrounding whitespace; return true + * if nothing meaningful remains. + */ +function isEffectivelyEmpty(sql: string): boolean { + return stripLineComments(sql).trim().length === 0; +} + +function firstNonEmptyLine(text: string): string | undefined { + for (const raw of text.split("\n")) { + const trimmed = raw.trim(); + if (trimmed.length === 0) continue; + // Strip leading Markdown header markers so "# Fan-out" → "Fan-out". + return trimmed.replace(/^#+\s+/, ""); + } + return undefined; +} + +/** + * Hand-rolled YAML frontmatter parser scoped to codemap's recipe needs. + * Reads one optional `actions` list of RecipeAction-shaped items between + * `---` delimiters at the top of the file. Per plan §9 Q-D: recipe-specific + * shallow shape only; reject anything weirder so authors get clear errors + * instead of half-parsed YAML edge cases. + * + * Returns the parsed actions (or undefined when the file has no + * frontmatter / no actions key) plus the body — file content with the + * frontmatter block stripped, used as the description body downstream. 
+ */ +export function extractFrontmatterAndBody(md: string): { + actions: RecipeAction[] | undefined; + body: string; +} { + // Frontmatter must start at byte 0 with three dashes + newline (LF or + // CRLF); anything else is treated as plain Markdown. + const startMatch = md.match(/^---\r?\n/); + if (startMatch === null) { + return { actions: undefined, body: md }; + } + const afterStart = md.slice(startMatch[0].length); + const endMatch = afterStart.match(/\n---\r?\n/); + if (endMatch === null) { + return { actions: undefined, body: md }; + } + const fmText = afterStart.slice(0, endMatch.index); + const body = afterStart.slice(endMatch.index! + endMatch[0].length); + const actions = parseActionsFromFrontmatter(fmText); + return { actions, body }; +} + +// Parses the actions block from the frontmatter text. Strict shape — one +// top-level "actions" key whose value is a list of items with a required +// "type" field plus optional "auto_fixable" (boolean) and "description" +// (string). Returns undefined when no actions key is found. Other top-level +// keys are tolerated (forward-compat for future recipe metadata). +function parseActionsFromFrontmatter(fm: string): RecipeAction[] | undefined { + const lines = fm.split(/\r?\n/); + let i = 0; + while (i < lines.length) { + const line = lines[i]!; + if (/^\s*$/.test(line)) { + i++; + continue; + } + const keyMatch = line.match(/^([A-Za-z_][A-Za-z0-9_]*)\s*:\s*$/); + if (keyMatch !== null && keyMatch[1] === "actions") { + return parseActionList(lines, i + 1); + } + i++; + } + return undefined; +} + +function parseActionList(lines: string[], startIdx: number): RecipeAction[] { + const out: RecipeAction[] = []; + let i = startIdx; + let current: RecipeAction | undefined; + + while (i < lines.length) { + const line = lines[i]!; + // Stop at the next top-level YAML key (no leading whitespace + colon). + if (/^[A-Za-z_]/.test(line)) break; + + // List-item start (e.g. " - type: foo"). 
+ const itemMatch = line.match(/^\s*-\s+(\w+)\s*:\s*(.*)$/); + if (itemMatch !== null) { + if (current !== undefined) out.push(current); + const [, key, raw] = itemMatch; + const value = parseScalar(raw!); + current = applyKey({ type: "" }, key!, value); + i++; + continue; + } + + // Continuation key on the same item (e.g. " description: foo"). + const contMatch = line.match(/^\s+(\w+)\s*:\s*(.*)$/); + if (contMatch !== null && current !== undefined) { + const [, key, raw] = contMatch; + current = applyKey(current, key!, parseScalar(raw!)); + i++; + continue; + } + + // Anything else is unrecognised — stop parsing this list and let + // downstream surface it as "actions block had unexpected content" + // if we ever need stricter errors. For now, terminate cleanly. + break; + } + + if (current !== undefined) out.push(current); + // Filter out items missing required `type` field (defensive — strict + // YAML would error here, but we fail open on malformed entries). + return out.filter((a) => a.type.length > 0); +} + +function parseScalar(raw: string): string | boolean { + const trimmed = raw.trim(); + if (trimmed === "true") return true; + if (trimmed === "false") return false; + // Strip surrounding quotes (single or double). + if ( + (trimmed.startsWith('"') && trimmed.endsWith('"')) || + (trimmed.startsWith("'") && trimmed.endsWith("'")) + ) { + return trimmed.slice(1, -1); + } + return trimmed; +} + +function applyKey( + action: RecipeAction, + key: string, + value: string | boolean, +): RecipeAction { + const next = { ...action }; + if (key === "type" && typeof value === "string") next.type = value; + else if (key === "auto_fixable" && typeof value === "boolean") + next.auto_fixable = value; + else if (key === "description" && typeof value === "string") + next.description = value; + // Unknown keys silently ignored (forward-compat). 
+ return next; +} diff --git a/src/cli/query-recipes.test.ts b/src/cli/query-recipes.test.ts new file mode 100644 index 00000000..326e8661 --- /dev/null +++ b/src/cli/query-recipes.test.ts @@ -0,0 +1,151 @@ +import { afterEach, beforeEach, describe, expect, it } from "bun:test"; +import { mkdirSync, mkdtempSync, rmSync, writeFileSync } from "node:fs"; +import { tmpdir } from "node:os"; +import { join } from "node:path"; + +import { resolveCodemapConfig } from "../config"; +import { initCodemap } from "../runtime"; +import { + _resetRecipesCacheForTests, + getQueryRecipeActions, + getQueryRecipeCatalogEntry, + getQueryRecipeSql, + listQueryRecipeCatalog, + listQueryRecipeIds, + resolveProjectRecipesDir, +} from "./query-recipes"; + +let projectRoot: string; + +beforeEach(() => { + projectRoot = mkdtempSync(join(tmpdir(), "query-recipes-")); + initCodemap(resolveCodemapConfig(projectRoot, undefined)); + _resetRecipesCacheForTests(); +}); + +afterEach(() => { + rmSync(projectRoot, { recursive: true, force: true }); + _resetRecipesCacheForTests(); +}); + +describe("resolveProjectRecipesDir", () => { + it("returns undefined when .codemap/recipes/ is absent", () => { + expect(resolveProjectRecipesDir(projectRoot)).toBeUndefined(); + }); + + it("returns the directory path when present", () => { + const recipesDir = join(projectRoot, ".codemap", "recipes"); + mkdirSync(recipesDir, { recursive: true }); + expect(resolveProjectRecipesDir(projectRoot)).toBe(recipesDir); + }); + + it("returns undefined when .codemap/recipes is a file (not directory)", () => { + mkdirSync(join(projectRoot, ".codemap"), { recursive: true }); + writeFileSync(join(projectRoot, ".codemap", "recipes"), "not a dir"); + expect(resolveProjectRecipesDir(projectRoot)).toBeUndefined(); + }); +}); + +describe("query-recipes shim — project recipes via runtime root", () => { + it("bundled-only when no .codemap/recipes/ exists", () => { + const ids = listQueryRecipeIds(); + expect(ids).toContain("fan-out"); + 
expect(ids).toContain("deprecated-symbols"); + // No project recipes; every entry in the catalog has source: "bundled". + // (catalog shape is the legacy QueryRecipeCatalogEntry through Tracer 4 + // — Tracer 4 adds source/body/shadows fields. For now confirm presence.) + expect(ids.length).toBeGreaterThan(0); + }); + + it("loads project-local recipes from .codemap/recipes/.sql", () => { + const recipesDir = join(projectRoot, ".codemap", "recipes"); + mkdirSync(recipesDir, { recursive: true }); + writeFileSync( + join(recipesDir, "internal-flaky-tests.sql"), + "SELECT path FROM files WHERE 1=0\n", + ); + _resetRecipesCacheForTests(); + + expect(listQueryRecipeIds()).toContain("internal-flaky-tests"); + expect(getQueryRecipeSql("internal-flaky-tests")).toContain("WHERE 1=0"); + }); + + it("project recipe shadows bundled — getQueryRecipeSql returns project version", () => { + const recipesDir = join(projectRoot, ".codemap", "recipes"); + mkdirSync(recipesDir, { recursive: true }); + writeFileSync( + join(recipesDir, "fan-out.sql"), + "SELECT 'project override' AS marker\n", + ); + _resetRecipesCacheForTests(); + + const sql = getQueryRecipeSql("fan-out"); + expect(sql).toContain("project override"); + // The bundled fan-out had `actions` (review-coupling) — project version + // doesn't carry actions until Tracer 5 wires YAML frontmatter. 
+ expect(getQueryRecipeActions("fan-out")).toBeUndefined(); + }); + + it("listQueryRecipeCatalog includes project recipes alongside bundled", () => { + const recipesDir = join(projectRoot, ".codemap", "recipes"); + mkdirSync(recipesDir, { recursive: true }); + writeFileSync(join(recipesDir, "owner-fanout.sql"), "SELECT 1 AS x\n"); + _resetRecipesCacheForTests(); + + const catalog = listQueryRecipeCatalog(); + const ids = catalog.map((c) => c.id); + expect(ids).toContain("owner-fanout"); + expect(ids).toContain("fan-out"); + }); +}); + +describe("query-recipes shim — catalog source / shadows / body fields (Tracer 4)", () => { + it("bundled entries carry source: 'bundled' and no shadows flag", () => { + const fanOut = listQueryRecipeCatalog().find((c) => c.id === "fan-out"); + expect(fanOut?.source).toBe("bundled"); + expect(fanOut?.shadows).toBeUndefined(); + }); + + it("bundled entries carry body when sibling .md exists", () => { + const fanOut = listQueryRecipeCatalog().find((c) => c.id === "fan-out"); + expect(fanOut?.body).toBeDefined(); + expect(fanOut?.body).toContain("Top 10 files by dependency fan-out"); + }); + + it("project entries carry source: 'project' (no bundled clash → no shadows)", () => { + const recipesDir = join(projectRoot, ".codemap", "recipes"); + mkdirSync(recipesDir, { recursive: true }); + writeFileSync(join(recipesDir, "internal-fizz.sql"), "SELECT 1\n"); + _resetRecipesCacheForTests(); + + const fizz = listQueryRecipeCatalog().find((c) => c.id === "internal-fizz"); + expect(fizz?.source).toBe("project"); + expect(fizz?.shadows).toBeUndefined(); + }); + + it("project recipe shadowing bundled carries shadows: true", () => { + const recipesDir = join(projectRoot, ".codemap", "recipes"); + mkdirSync(recipesDir, { recursive: true }); + writeFileSync( + join(recipesDir, "fan-out.sql"), + "SELECT 'project override' AS marker\n", + ); + _resetRecipesCacheForTests(); + + const fanOut = listQueryRecipeCatalog().find((c) => c.id === "fan-out"); + 
expect(fanOut?.source).toBe("project"); + expect(fanOut?.shadows).toBe(true); + }); +}); + +describe("getQueryRecipeCatalogEntry (single-id lookup)", () => { + it("returns the same entry shape as listQueryRecipeCatalog for known id", () => { + const fromList = listQueryRecipeCatalog().find((c) => c.id === "fan-out"); + const fromGet = getQueryRecipeCatalogEntry("fan-out"); + expect(fromGet).toEqual(fromList); + }); + + it("returns undefined for unknown id", () => { + expect(getQueryRecipeCatalogEntry("no-such-recipe")).toBeUndefined(); + }); +}); diff --git a/src/cli/query-recipes.ts b/src/cli/query-recipes.ts index d258c950..4fdf87e6 100644 --- a/src/cli/query-recipes.ts +++ b/src/cli/query-recipes.ts @@ -1,256 +1,201 @@ -/** - * One agent-facing follow-up suggested for every row of a recipe's result. - * Recipe authors hand-write this alongside the SQL (predictable: every row gets - * the same template). Ad-hoc SQL never carries actions — recipe-only feature. - * - * `auto_fixable` defaults to `false` when omitted. `description` is human prose - * for the agent to surface; `type` is a stable kebab-case verb the agent can - * key off (`delete-file`, `split-barrel`, `flag-caller`, …). - */ -export interface RecipeAction { - type: string; - auto_fixable?: boolean; - description?: string; -} +import { existsSync, statSync } from "node:fs"; +import { dirname, join } from "node:path"; +import { fileURLToPath } from "node:url"; + +import { loadAllRecipes } from "../application/recipes-loader"; +import type { LoadedRecipe } from "../application/recipes-loader"; +import { getProjectRoot } from "../runtime"; + +export type { RecipeAction } from "../application/recipes-loader"; +import type { RecipeAction } from "../application/recipes-loader"; /** - * One bundled recipe: id, human description, SQL, and optional per-row actions - * (canonical source for CLI, `--recipes-json`, and the JSON output enrichment). 
+ * Catalog entry surfaced to `--recipes-json`, the `codemap://recipes` MCP + * resource, and the per-id `codemap://recipes/{id}` lookup. Backwards-compat + * shape with three extensions added in Tracer 4: + * + * - **`body`** — full Markdown body of the sibling `.md` (when present); + * description is the first non-empty line of that body. + * - **`source`** — `"bundled"` (ships with the npm package) or `"project"` + * (loaded from `/.codemap/recipes/`). + * - **`shadows`** — `true` when a project recipe overrides a bundled recipe + * of the same id (per plan §9 Q-E — agents read this at session start to + * know when a recipe behaves differently from the documented bundled + * version). Absent / `false` for non-shadowing entries. */ export interface QueryRecipeCatalogEntry { id: string; description: string; + body?: string; sql: string; actions?: RecipeAction[]; + source: "bundled" | "project"; + shadows?: boolean; +} + +/** + * Directory containing the bundled recipe `.sql` + `.md` files (next to + * `dist/` and `templates/agents/` in the published npm artifact). Mirrors + * `resolveAgentsTemplateDir()`'s layout — see [`docs/architecture.md` + * § Recipes wiring]. + */ +export function resolveBundledRecipesDir(): string { + return join( + dirname(fileURLToPath(import.meta.url)), + "..", + "..", + "templates", + "recipes", + ); } /** - * Bundled read-only SQL for `codemap query --recipe `. Keys match **`codemap query --help`**. + * Returns `/.codemap/recipes/` if it exists as a directory, + * else `undefined`. Per plan §9 Q-C, root-only — no walk-up; same root + * the CLI's `--root` / `CODEMAP_ROOT` resolves to. 
+ */ +export function resolveProjectRecipesDir( + projectRoot: string, +): string | undefined { + const dir = join(projectRoot, ".codemap", "recipes"); + if (!existsSync(dir)) return undefined; + if (!statSync(dir).isDirectory()) return undefined; + return dir; +} + +/** + * Module-cached registry — populated lazily on first access (loader is pure; + * the cache means we pay the filesystem read once per process lifetime per + * plan §9 Q-B). Cache key includes `projectDir` so that a process running + * against multiple roots (test fixtures, multi-root MCP sessions later) + * re-resolves when the root changes. * - * `actions` (optional) is appended to each row in `--json` output so agents see - * the recommended follow-up alongside the data. Add an `actions` array on a - * recipe only when there's a concrete next step the agent should consider for - * every row — counts-by-kind and similar aggregates intentionally omit it. + * Per Tracer 5: `actions` come from YAML frontmatter on each `.md` for + * BOTH bundled and project recipes — uniform shape, no special-casing. + */ +let cachedRegistry: LoadedRecipe[] | undefined; +let cachedRegistryProjectDir: string | undefined; + +function getRegistry(): LoadedRecipe[] { + // `getProjectRoot()` throws if `initCodemap()` hasn't run; that only + // happens for direct unit tests of this module pre-bootstrap. Treat + // that as "no project recipes" — bundled-only registry. + let projectDir: string | undefined; + try { + projectDir = resolveProjectRecipesDir(getProjectRoot()); + } catch { + projectDir = undefined; + } + + if (cachedRegistry !== undefined && cachedRegistryProjectDir === projectDir) { + return cachedRegistry; + } + + cachedRegistry = loadAllRecipes({ + bundledDir: resolveBundledRecipesDir(), + projectDir, + }); + cachedRegistryProjectDir = projectDir; + return cachedRegistry; +} + +/** + * Reset the module cache — test-only escape hatch for fixture swaps. 
+ */ +export function _resetRecipesCacheForTests(): void { + cachedRegistry = undefined; + cachedRegistryProjectDir = undefined; +} + +/** + * Bundled read-only SQL for `codemap query --recipe `. Backwards-compat + * shim — derives from the registry; new callers should use the loader's + * {@link LoadedRecipe} shape via `listQueryRecipeCatalog()` (richer fields: + * `body`, `source`, `shadows`). */ export const QUERY_RECIPES: Record< string, { sql: string; description: string; actions?: RecipeAction[] } -> = { - "fan-out": { - description: "Top 10 files by dependency fan-out (edge count)", - sql: `SELECT from_path, COUNT(*) AS deps -FROM dependencies -GROUP BY from_path -ORDER BY deps DESC, from_path ASC -LIMIT 10`, - actions: [ - { - type: "review-coupling", - description: - "High fan-out usually means orchestrator role; consider extracting helpers or splitting responsibilities.", - }, - ], - }, - "fan-out-sample": { - description: - "Top 10 by fan-out, plus up to five sample dependency targets per file", - sql: `SELECT d.from_path, - COUNT(*) AS deps, - (SELECT GROUP_CONCAT(to_path, ' | ') - FROM (SELECT to_path FROM dependencies d2 WHERE d2.from_path = d.from_path ORDER BY to_path ASC LIMIT 5)) - AS sample_targets -FROM dependencies d -GROUP BY d.from_path -ORDER BY deps DESC, d.from_path ASC -LIMIT 10`, - }, - /** - * Same ranking as `fan-out-sample`, but sample targets as a JSON array (SQLite JSON1 - * `json_group_array`). Prefer `fan-out-sample` if JSON1 is unavailable. 
- */ - "fan-out-sample-json": { - description: - "Like fan-out-sample, but sample_targets is a JSON array (requires JSON1)", - sql: `SELECT d.from_path, - COUNT(*) AS deps, - (SELECT json_group_array(to_path) - FROM (SELECT to_path FROM dependencies d2 WHERE d2.from_path = d.from_path ORDER BY to_path ASC LIMIT 5)) - AS sample_targets -FROM dependencies d -GROUP BY d.from_path -ORDER BY deps DESC, d.from_path ASC -LIMIT 10`, - }, - /** - * Files most imported/depended-on (complement to fan-out). - */ - "fan-in": { - description: "Top 15 files by fan-in (how many other files depend on them)", - sql: `SELECT to_path, COUNT(*) AS fan_in -FROM dependencies -GROUP BY to_path -ORDER BY fan_in DESC, to_path ASC -LIMIT 15`, - actions: [ - { - type: "review-stability", - description: - "High fan-in: changes here ripple through many consumers. Protect with tests before refactoring.", - }, - ], - }, - "index-summary": { - description: - "Single row: row counts for main tables (quick health snapshot)", - sql: `SELECT - (SELECT COUNT(*) FROM files) AS files, - (SELECT COUNT(*) FROM symbols) AS symbols, - (SELECT COUNT(*) FROM imports) AS imports, - (SELECT COUNT(*) FROM components) AS components, - (SELECT COUNT(*) FROM dependencies) AS dependencies`, - }, - "files-largest": { - description: "Top 20 files by line count (size/complexity hotspots)", - sql: `SELECT path, line_count, size, language -FROM files -ORDER BY line_count DESC, path ASC -LIMIT 20`, - actions: [ - { - type: "split-file", - description: - "Files this large are typical refactor candidates. Look for cohesive sub-modules to extract.", - }, - ], - }, - /** - * Hook count uses comma tally + 1 on the stored JSON array (Codemap emits flat - * `["useFoo","useBar"]` shapes). Avoids SQLite JSON1 (`json_array_length`) so - * the recipe runs on any SQLite build the CLI already supports. 
- */ - "components-by-hooks": { - description: - "React components with the most hooks (comma count on stored JSON array)", - sql: `SELECT name, file_path, - CASE - WHEN hooks_used IS NULL OR trim(hooks_used) = '' OR trim(hooks_used) = '[]' THEN 0 - ELSE (length(hooks_used) - length(replace(hooks_used, ',', ''))) + 1 - END AS hook_count -FROM components -ORDER BY hook_count DESC, file_path ASC, name ASC -LIMIT 20`, - }, - "markers-by-kind": { - description: "Marker counts by kind (TODO, FIXME, …)", - sql: `SELECT kind, COUNT(*) AS count -FROM markers -GROUP BY kind -ORDER BY count DESC, kind ASC`, +> = new Proxy( + {}, + { + get(_target, prop) { + if (typeof prop !== "string") return undefined; + const recipe = getRegistry().find((r) => r.id === prop); + if (recipe === undefined) return undefined; + return { + sql: recipe.sql, + description: recipe.description ?? recipe.id, + ...(recipe.actions !== undefined ? { actions: recipe.actions } : {}), + }; + }, + ownKeys() { + return getRegistry().map((r) => r.id); + }, + getOwnPropertyDescriptor(_target, prop) { + if (typeof prop !== "string") return undefined; + const recipe = getRegistry().find((r) => r.id === prop); + if (recipe === undefined) return undefined; + return { + enumerable: true, + configurable: true, + value: { + sql: recipe.sql, + description: recipe.description ?? recipe.id, + ...(recipe.actions !== undefined ? { actions: recipe.actions } : {}), + }, + }; + }, }, - /** - * Symbols documented with `@deprecated` in their leading JSDoc. Useful for - * agents to flag callers of soon-to-be-removed APIs before suggesting changes. 
- */ - "deprecated-symbols": { - description: - "Symbols whose JSDoc contains @deprecated (caller-warning candidates)", - sql: `SELECT name, kind, file_path, line_start, signature, doc_comment -FROM symbols -WHERE doc_comment LIKE '%@deprecated%' -ORDER BY file_path ASC, line_start ASC -LIMIT 50`, - actions: [ - { - type: "flag-caller", - description: - "Warn before suggesting changes that depend on this symbol; check callers via the calls table.", - }, - ], - }, - /** - * Symbols carrying JSDoc visibility tags (`@internal`, `@private`, `@alpha`, - * `@beta`). Useful for agents to know what is *not* part of the public API - * before suggesting imports or extending re-exports. - */ - "visibility-tags": { - description: - "Symbols carrying a JSDoc visibility tag (public / private / internal / alpha / beta)", - sql: `SELECT name, kind, visibility, file_path, line_start, signature, doc_comment -FROM symbols -WHERE visibility IS NOT NULL -ORDER BY file_path ASC, line_start ASC -LIMIT 100`, - actions: [ - { - type: "flag-non-public", - description: - "Treat as not part of the public API unless visibility = 'public': don't import from package consumers; check the visibility tag before extending re-exports.", - }, - ], - }, - /** - * All indexed file paths with their content hash. Powers the \`codemap validate\` - * CLI: callers diff this list against on-disk content to detect stale entries - * without paying to re-read every file. - */ - "files-hashes": { - description: - "All indexed files with content_hash (input for staleness checks)", - sql: `SELECT path, content_hash, language, line_count -FROM files -ORDER BY path ASC`, - }, - /** - * "Barrel" candidates — files that re-export a lot. High export count can - * indicate either an intentional public API surface or accidental fan-out; - * agents can use it to decide whether a new export should land here or stay local. 
- */ - "barrel-files": { - description: - "Top 20 files by export count (barrel / public-API candidates)", - sql: `SELECT file_path, COUNT(*) AS exports -FROM exports -GROUP BY file_path -ORDER BY exports DESC, file_path ASC -LIMIT 20`, - actions: [ - { - type: "split-barrel", - description: - "Confirm this is an intentional public-API surface; if it's accidental fan-out, consider splitting into smaller barrels.", - }, - ], - }, -}; +); /** * Sorted recipe ids (same set as {@link QUERY_RECIPES}). */ export function listQueryRecipeIds(): string[] { - return Object.keys(QUERY_RECIPES).sort(); + return getRegistry().map((r) => r.id); } /** - * Full catalog for **`codemap query --recipes-json`** — derived from {@link QUERY_RECIPES} only. + * Full catalog for **`codemap query --recipes-json`** and the + * `codemap://recipes` MCP resource. Per Tracer 4, includes `body`, + * `source`, and `shadows` fields on each entry. */ export function listQueryRecipeCatalog(): QueryRecipeCatalogEntry[] { - return listQueryRecipeIds().map((id) => { - const meta = QUERY_RECIPES[id]!; - const entry: QueryRecipeCatalogEntry = { - id, - description: meta.description, - sql: meta.sql, - }; - if (meta.actions !== undefined) entry.actions = meta.actions; - return entry; - }); + return getRegistry().map((r) => buildCatalogEntry(r)); +} + +/** + * Single-entry lookup for the `codemap://recipes/{id}` MCP resource and any + * future `--recipe-json ` CLI shape. Returns `undefined` for unknown + * ids; otherwise the same {@link QueryRecipeCatalogEntry} shape as the + * full-catalog listing. + */ +export function getQueryRecipeCatalogEntry( + id: string, +): QueryRecipeCatalogEntry | undefined { + const recipe = getRegistry().find((r) => r.id === id); + return recipe === undefined ? undefined : buildCatalogEntry(recipe); +} + +function buildCatalogEntry(r: LoadedRecipe): QueryRecipeCatalogEntry { + const entry: QueryRecipeCatalogEntry = { + id: r.id, + description: r.description ?? 
r.id, + sql: r.sql, + source: r.source, + }; + if (r.body !== undefined) entry.body = r.body; + if (r.actions !== undefined) entry.actions = r.actions; + if (r.shadows) entry.shadows = true; + return entry; } /** * Returns the SQL string for a recipe id, or `undefined` if unknown. */ export function getQueryRecipeSql(id: string): string | undefined { - return QUERY_RECIPES[id]?.sql; + return getRegistry().find((r) => r.id === id)?.sql; } /** @@ -259,5 +204,5 @@ export function getQueryRecipeSql(id: string): string | undefined { * ad-hoc SQL never gets actions. */ export function getQueryRecipeActions(id: string): RecipeAction[] | undefined { - return QUERY_RECIPES[id]?.actions; + return getRegistry().find((r) => r.id === id)?.actions; } diff --git a/templates/agents/rules/codemap.md b/templates/agents/rules/codemap.md index 0d0d4c85..9d5733e2 100644 --- a/templates/agents/rules/codemap.md +++ b/templates/agents/rules/codemap.md @@ -36,6 +36,21 @@ Install **[@stainless-code/codemap](https://www.npmjs.com/package/@stainless-cod **Recipe `actions`:** with **`--json`**, recipes that define an `actions` template append it to every row (kebab-case verb + description — e.g. `fan-out` → `review-coupling`). Under `--baseline`, actions attach to the **`added`** rows only. Inspect via **`--recipes-json`**. Ad-hoc SQL never carries actions. +**Project-local recipes:** drop `.sql` (and optional `.md` for description + actions) into **`/.codemap/recipes/`** — auto-discovered, runs via `codemap query --recipe ` like bundled. Project recipes win on id collision; check `codemap query --recipes-json` for **`shadows: true`** entries to know when a project recipe overrides the documented bundled version. 
`.md` supports YAML frontmatter for the per-row action template — block-list shape only (the loader's hand-rolled parser doesn't accept inline-flow `[{...}]`): + +```markdown +--- +actions: + - type: review-coupling + auto_fixable: false + description: "High fan-out usually means orchestrator role." +--- + +(Markdown body — first non-empty line becomes the catalog description.) +``` + +Validation: SQL is rejected at load time if it starts with DML/DDL (DELETE/DROP/UPDATE/etc.); the runtime `PRAGMA query_only=1` is the parser-proof backstop. + **Baselines** (`query_baselines` table inside `.codemap.db`, no parallel JSON files): `--save-baseline[=]` snapshots a result set; `--baseline[=]` diffs the current result against it (added / removed rows; identity = `JSON.stringify(row)`). Name defaults to the `--recipe` id; ad-hoc SQL needs an explicit `=`. Survives `--full` and SCHEMA bumps. **Audit (`codemap audit`)**: structural-drift command; emits `{head, deltas: {files, dependencies, deprecated}}` (each delta carries its own `base` metadata). Reuses B.6 baselines as the snapshot source. Two CLI shapes — `--baseline ` auto-resolves `-files` / `-dependencies` / `-deprecated`; `---baseline ` is the explicit per-delta override. v1 ships no `verdict` / threshold config — consumers compose `--json` + `jq` for CI exit codes. Auto-runs an incremental index before the diff (use `--no-index` to skip for frozen-DB CI). diff --git a/templates/agents/skills/codemap/SKILL.md b/templates/agents/skills/codemap/SKILL.md index 515e72d6..60de5873 100644 --- a/templates/agents/skills/codemap/SKILL.md +++ b/templates/agents/skills/codemap/SKILL.md @@ -45,6 +45,7 @@ Replace placeholders (`'...'`) with your module path, file glob, or symbol name. - **`--baseline[=]`** — diff the current result against the saved baseline. Output `{baseline:{...}, current_row_count, added: [...], removed: [...]}` (with `--json`) or a two-section terminal dump. 
Identity = per-row multiset equality (canonical `JSON.stringify` keyed frequency map; duplicates preserved). Pair with `--summary` for `{baseline:{...}, current_row_count, added: N, removed: N}`. **Mutually exclusive with `--group-by`.** - **`--baselines`** lists saved baselines (no `rows_json` payload); **`--drop-baseline `** deletes one. Both reject every other flag — they're list-only / drop-only operations. - **Per-row recipe `actions`** — recipes that define an **`actions: [{type, auto_fixable?, description?}]`** template append it to every row in **`--json`** output (recipe-only; ad-hoc SQL never carries actions). Under `--baseline`, actions attach to the **`added`** rows only (the rows the agent should act on). Inspect via **`--recipes-json`**. +- **Project-local recipes** — drop **`.sql`** (and optional **`.md`** for description body + actions) into **`/.codemap/recipes/`** to make team-internal SQL a first-class CLI verb. `--recipes-json` and the `codemap://recipes` MCP resource list project recipes alongside bundled ones with **`source: "bundled" | "project"`** discriminating them. Project recipes win on id collision; entries that override a bundled id carry **`shadows: true`** so agents reading the catalog at session start know when a recipe behaves differently from the documented bundled version. `.md` supports YAML frontmatter for the per-row action template — **block-list shape only** (loader's hand-rolled parser; no inline-flow `[{...}]`): `---\nactions:\n - type: my-verb\n auto_fixable: false\n description: "..."\n---`. Validation: SQL is rejected at load time if it starts with DML/DDL (DELETE/DROP/UPDATE/etc.); the runtime `PRAGMA query_only=1` is the parser-proof backstop. `.codemap.db` is gitignored; **`.codemap/recipes/` is NOT** — recipes are git-tracked source code authored for human review. **Audit (`codemap audit`)** — separate top-level command for structural-drift verdicts. 
Composes B.6 baselines into a per-delta `{head, deltas}` envelope; v1 ships `files` / `dependencies` / `deprecated`. Two snapshot-source shapes: @@ -69,8 +70,8 @@ Each emitted delta carries its own `base` metadata so mixed-baseline audits are **Resources (lazy-cached on first `read_resource`; constant for server-process lifetime):** -- **`codemap://recipes`** — full catalog JSON (same as `--recipes-json`). -- **`codemap://recipes/{id}`** — single recipe `{id, description, sql, actions?}`. Replaces `--print-sql `. +- **`codemap://recipes`** — full catalog JSON (same as `--recipes-json`). Each entry carries `source: "bundled" | "project"` and `shadows: true` on project entries that override a bundled recipe id. Read this at session start so you know when a `--recipe foo` call will run a project override instead of the documented bundled version. +- **`codemap://recipes/{id}`** — single recipe `{id, description, body?, sql, actions?, source, shadows?}`. Replaces `--print-sql `. - **`codemap://schema`** — DDL of every table in `.codemap.db` (queried live from `sqlite_schema`). - **`codemap://skill`** — full text of this skill file. Agents that don't preload the skill at session start can fetch it here. diff --git a/templates/recipes/barrel-files.md b/templates/recipes/barrel-files.md new file mode 100644 index 00000000..96378140 --- /dev/null +++ b/templates/recipes/barrel-files.md @@ -0,0 +1,9 @@ +--- +actions: + - type: split-barrel + description: "Confirm this is an intentional public-API surface; if it's accidental fan-out, consider splitting into smaller barrels." +--- + +Top 20 files by export count (barrel / public-API candidates) + +High export count can indicate either an intentional public API surface or accidental fan-out. Agents can use this to decide whether a new export should land here or stay local. If it's accidental fan-out, consider splitting into smaller barrels. 
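The block-list frontmatter shape used by `barrel-files.md` above can be read with a very small line-oriented parser. The following is a minimal sketch of that idea, not the loader's actual code: `ActionSketch`, `ParsedRecipeSketch`, and `parseRecipeMarkdownSketch` are hypothetical names, and the sketch assumes one `key: value` pair per line with optional double quotes.

```typescript
// Hypothetical shapes for illustration; the real loader's types are not shown in this diff.
interface ActionSketch {
  type: string;
  auto_fixable?: boolean;
  description?: string;
}

interface ParsedRecipeSketch {
  actions: ActionSketch[];
  description: string;
}

// Parses block-list frontmatter only; inline-flow `actions: [{...}]` is rejected.
function parseRecipeMarkdownSketch(markdown: string): ParsedRecipeSketch {
  const lines = markdown.split(/\r?\n/);
  const actions: ActionSketch[] = [];
  let bodyStart = 0;

  if ((lines[0] ?? "").trim() === "---") {
    const close = lines.findIndex((line, i) => i > 0 && line.trim() === "---");
    if (close === -1) throw new Error("unterminated frontmatter");

    let current: ActionSketch | undefined;
    for (const raw of lines.slice(1, close)) {
      const line = raw.trim();
      if (line === "" || line === "actions:") continue;
      if (line.startsWith("actions:")) {
        throw new Error("inline-flow actions are not supported in this sketch");
      }
      const isItem = line.startsWith("- ");
      const pair = isItem ? line.slice(2) : line;
      const colon = pair.indexOf(":"); // first colon splits key from value
      if (colon === -1) throw new Error(`unsupported frontmatter line: ${raw}`);
      const key = pair.slice(0, colon).trim();
      const value = pair.slice(colon + 1).trim().replace(/^"(.*)"$/, "$1");

      if (isItem) {
        current = { type: "" }; // a "- " prefix starts a new action entry
        actions.push(current);
      }
      if (current === undefined) throw new Error(`key outside a list item: ${raw}`);
      if (key === "type") current.type = value;
      else if (key === "auto_fixable") current.auto_fixable = value === "true";
      else if (key === "description") current.description = value;
    }
    bodyStart = close + 1;
  }

  // The first non-empty body line becomes the catalog description.
  const firstBodyLine = lines.slice(bodyStart).find((l) => l.trim() !== "");
  return { actions, description: (firstBodyLine ?? "").trim() };
}

const sampleRecipeDoc = [
  "---",
  "actions:",
  "  - type: review-coupling",
  "    auto_fixable: false",
  '    description: "High fan-out usually means orchestrator role."',
  "---",
  "",
  "Top 10 files by dependency fan-out (edge count)",
].join("\n");

const parsedSample = parseRecipeMarkdownSketch(sampleRecipeDoc);
```

Splitting on the first colon keeps descriptions such as `"High fan-in: changes here ripple..."` intact, which a naive `split(":")` would truncate.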
diff --git a/templates/recipes/barrel-files.sql b/templates/recipes/barrel-files.sql
new file mode 100644
index 00000000..599f4f61
--- /dev/null
+++ b/templates/recipes/barrel-files.sql
@@ -0,0 +1,5 @@
+SELECT file_path, COUNT(*) AS exports
+FROM exports
+GROUP BY file_path
+ORDER BY exports DESC, file_path ASC
+LIMIT 20
diff --git a/templates/recipes/components-by-hooks.md b/templates/recipes/components-by-hooks.md
new file mode 100644
index 00000000..642d8cfa
--- /dev/null
+++ b/templates/recipes/components-by-hooks.md
@@ -0,0 +1,3 @@
+React components with the most hooks (comma count on stored JSON array)
+
+Hook count uses comma tally + 1 on the stored JSON array (Codemap emits flat `["useFoo","useBar"]` shapes). Avoids SQLite JSON1 (`json_array_length`) so the recipe runs on any SQLite build the CLI already supports.
diff --git a/templates/recipes/components-by-hooks.sql b/templates/recipes/components-by-hooks.sql
new file mode 100644
index 00000000..6f4a7890
--- /dev/null
+++ b/templates/recipes/components-by-hooks.sql
@@ -0,0 +1,8 @@
+SELECT name, file_path,
+  CASE
+    WHEN hooks_used IS NULL OR trim(hooks_used) = '' OR trim(hooks_used) = '[]' THEN 0
+    ELSE (length(hooks_used) - length(replace(hooks_used, ',', ''))) + 1
+  END AS hook_count
+FROM components
+ORDER BY hook_count DESC, file_path ASC, name ASC
+LIMIT 20
diff --git a/templates/recipes/deprecated-symbols.md b/templates/recipes/deprecated-symbols.md
new file mode 100644
index 00000000..b47f599f
--- /dev/null
+++ b/templates/recipes/deprecated-symbols.md
@@ -0,0 +1,9 @@
+---
+actions:
+  - type: flag-caller
+    description: "Warn before suggesting changes that depend on this symbol; check callers via the calls table."
+---
+
+Symbols whose JSDoc contains @deprecated (caller-warning candidates)
+
+Useful for agents to flag callers of soon-to-be-removed APIs before suggesting changes. Pair with `WHERE callee_name = ''` against the `calls` table to find the actual call sites.
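The comma-tally arithmetic in the `components-by-hooks` recipe can be checked outside SQLite. The sketch below mirrors the `CASE` expression in TypeScript; `hookCountSketch` is a hypothetical helper written for illustration and is not part of Codemap.

```typescript
// Mirrors the recipe's CASE expression: empty or missing arrays count as 0,
// otherwise the count is (number of commas) + 1.
function hookCountSketch(hooksUsed: string | null): number {
  if (hooksUsed === null) return 0;
  const trimmed = hooksUsed.trim();
  if (trimmed === "" || trimmed === "[]") return 0;
  // length(hooks_used) - length(replace(hooks_used, ',', '')) in the SQL
  const commas = hooksUsed.length - hooksUsed.replace(/,/g, "").length;
  return commas + 1;
}

const twoHooks = hookCountSketch('["useFoo","useBar"]'); // one comma, so 2
const noHooks = hookCountSketch("[]"); // 0
```

The tally is only exact for the flat string arrays the recipe description mentions; an element that itself contained a comma would be overcounted, which is the trade-off for avoiding JSON1.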
diff --git a/templates/recipes/deprecated-symbols.sql b/templates/recipes/deprecated-symbols.sql
new file mode 100644
index 00000000..79c76bd5
--- /dev/null
+++ b/templates/recipes/deprecated-symbols.sql
@@ -0,0 +1,5 @@
+SELECT name, kind, file_path, line_start, signature, doc_comment
+FROM symbols
+WHERE doc_comment LIKE '%@deprecated%'
+ORDER BY file_path ASC, line_start ASC
+LIMIT 50
diff --git a/templates/recipes/fan-in.md b/templates/recipes/fan-in.md
new file mode 100644
index 00000000..e72ffddd
--- /dev/null
+++ b/templates/recipes/fan-in.md
@@ -0,0 +1,9 @@
+---
+actions:
+  - type: review-stability
+    description: "High fan-in: changes here ripple through many consumers. Protect with tests before refactoring."
+---
+
+Top 15 files by fan-in (how many other files depend on them)
+
+Files at the top are the most depended-on in the codebase (the `dependencies` table aggregates static imports, dynamic imports, and resolved module-graph edges) — changes here ripple through many consumers. Protect with tests before refactoring; treat as the project's de-facto stable API even if not formally exported.
diff --git a/templates/recipes/fan-in.sql b/templates/recipes/fan-in.sql
new file mode 100644
index 00000000..65086dcf
--- /dev/null
+++ b/templates/recipes/fan-in.sql
@@ -0,0 +1,5 @@
+SELECT to_path, COUNT(*) AS fan_in
+FROM dependencies
+GROUP BY to_path
+ORDER BY fan_in DESC, to_path ASC
+LIMIT 15
diff --git a/templates/recipes/fan-out-sample-json.md b/templates/recipes/fan-out-sample-json.md
new file mode 100644
index 00000000..948a8ef6
--- /dev/null
+++ b/templates/recipes/fan-out-sample-json.md
@@ -0,0 +1,3 @@
+Like fan-out-sample, but sample_targets is a JSON array (requires JSON1)
+
+Same ranking as `fan-out-sample`, but uses SQLite's JSON1 `json_group_array`. Prefer `fan-out-sample` if your SQLite build doesn't include JSON1.
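Every bundled `.sql` file in this diff is a single read-only `SELECT`, which is the shape the documented load-time validation expects ("rejected at load time if it starts with DML/DDL"). A prefix check of that kind might look like the sketch below. The function name and the keyword list are assumptions: the docs name only DELETE, DROP, and UPDATE followed by "etc.", so the other entries are illustrative guesses rather than the loader's real list.

```typescript
// Assumed keyword list: only DELETE, DROP, and UPDATE are named in the docs;
// the remaining entries are illustrative guesses.
const REJECTED_LEADING_KEYWORDS_SKETCH: string[] = [
  "DELETE",
  "DROP",
  "UPDATE",
  "INSERT",
  "ALTER",
  "CREATE",
  "REPLACE",
];

// Returns true when the first keyword of the statement is on the reject list.
function rejectsAtLoadSketch(sql: string): boolean {
  const firstWord = sql.trim().split(/\s+/)[0] ?? "";
  return REJECTED_LEADING_KEYWORDS_SKETCH.indexOf(firstWord.toUpperCase()) !== -1;
}

const selectRejected = rejectsAtLoadSketch("SELECT kind, COUNT(*) AS count FROM markers GROUP BY kind");
const dropRejected = rejectsAtLoadSketch("  drop table files");
const cteDeleteRejected = rejectsAtLoadSketch("WITH stale AS (SELECT 1) DELETE FROM files");
```

The last call shows why a prefix check is only a first line of defense: a statement that opens with a CTE slips past it, which is consistent with the docs describing the runtime `PRAGMA query_only=1` as the parser-proof backstop.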
diff --git a/templates/recipes/fan-out-sample-json.sql b/templates/recipes/fan-out-sample-json.sql
new file mode 100644
index 00000000..d97b4992
--- /dev/null
+++ b/templates/recipes/fan-out-sample-json.sql
@@ -0,0 +1,9 @@
+SELECT d.from_path,
+  COUNT(*) AS deps,
+  (SELECT json_group_array(to_path)
+   FROM (SELECT to_path FROM dependencies d2 WHERE d2.from_path = d.from_path ORDER BY to_path ASC LIMIT 5))
+  AS sample_targets
+FROM dependencies d
+GROUP BY d.from_path
+ORDER BY deps DESC, d.from_path ASC
+LIMIT 10
diff --git a/templates/recipes/fan-out-sample.md b/templates/recipes/fan-out-sample.md
new file mode 100644
index 00000000..e96055fb
--- /dev/null
+++ b/templates/recipes/fan-out-sample.md
@@ -0,0 +1 @@
+Top 10 by fan-out, plus up to five sample dependency targets per file
diff --git a/templates/recipes/fan-out-sample.sql b/templates/recipes/fan-out-sample.sql
new file mode 100644
index 00000000..913efa3c
--- /dev/null
+++ b/templates/recipes/fan-out-sample.sql
@@ -0,0 +1,9 @@
+SELECT d.from_path,
+  COUNT(*) AS deps,
+  (SELECT GROUP_CONCAT(to_path, ' | ')
+   FROM (SELECT to_path FROM dependencies d2 WHERE d2.from_path = d.from_path ORDER BY to_path ASC LIMIT 5))
+  AS sample_targets
+FROM dependencies d
+GROUP BY d.from_path
+ORDER BY deps DESC, d.from_path ASC
+LIMIT 10
diff --git a/templates/recipes/fan-out.md b/templates/recipes/fan-out.md
new file mode 100644
index 00000000..400c0bff
--- /dev/null
+++ b/templates/recipes/fan-out.md
@@ -0,0 +1,9 @@
+---
+actions:
+  - type: review-coupling
+    description: "High fan-out usually means orchestrator role; consider extracting helpers or splitting responsibilities."
+---
+
+Top 10 files by dependency fan-out (edge count)
+
+Files at the top of this list act as orchestrators — they depend on many other files (the `dependencies` table aggregates static imports, dynamic imports, and resolved module-graph edges). High fan-out usually means coordination logic that's a candidate for refactoring (extracting helpers, splitting responsibilities). Pair with `fan-in` to see hubs that are both depended-on AND depend-on-many.
diff --git a/templates/recipes/fan-out.sql b/templates/recipes/fan-out.sql
new file mode 100644
index 00000000..24c5257f
--- /dev/null
+++ b/templates/recipes/fan-out.sql
@@ -0,0 +1,5 @@
+SELECT from_path, COUNT(*) AS deps
+FROM dependencies
+GROUP BY from_path
+ORDER BY deps DESC, from_path ASC
+LIMIT 10
diff --git a/templates/recipes/files-hashes.md b/templates/recipes/files-hashes.md
new file mode 100644
index 00000000..9632d055
--- /dev/null
+++ b/templates/recipes/files-hashes.md
@@ -0,0 +1,3 @@
+All indexed files with content_hash (input for staleness checks)
+
+Powers the `codemap validate` CLI: callers diff this list against on-disk content to detect stale entries without paying to re-read every file.
diff --git a/templates/recipes/files-hashes.sql b/templates/recipes/files-hashes.sql
new file mode 100644
index 00000000..bdffbdff
--- /dev/null
+++ b/templates/recipes/files-hashes.sql
@@ -0,0 +1,3 @@
+SELECT path, content_hash, language, line_count
+FROM files
+ORDER BY path ASC
diff --git a/templates/recipes/files-largest.md b/templates/recipes/files-largest.md
new file mode 100644
index 00000000..79fa86e2
--- /dev/null
+++ b/templates/recipes/files-largest.md
@@ -0,0 +1,9 @@
+---
+actions:
+  - type: split-file
+    description: "Files this large are typical refactor candidates. Look for cohesive sub-modules to extract."
+---
+
+Top 20 files by line count (size/complexity hotspots)
+
+Files this large are typical refactor candidates. Look for cohesive sub-modules to extract — each split should reduce coupling, not just shuffle lines.
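The `files-hashes` recipe above is described as the input for staleness checks, where a caller diffs the indexed hashes against what is on disk. A consumer-side comparison could be sketched as follows. `FileHashRowSketch`, `StalenessSketch`, and `diffHashesSketch` are hypothetical names, and the sketch assumes the caller already has current hashes computed with whatever algorithm Codemap uses, which this diff does not state.

```typescript
// Subset of the columns returned by the files-hashes recipe.
interface FileHashRowSketch {
  path: string;
  content_hash: string;
}

interface StalenessSketch {
  changed: string[]; // indexed hash differs from the current hash
  missingOnDisk: string[]; // indexed path has no current hash
}

// Compares indexed rows against caller-supplied current hashes (path -> hash).
function diffHashesSketch(
  indexed: FileHashRowSketch[],
  onDisk: Map<string, string>,
): StalenessSketch {
  const changed: string[] = [];
  const missingOnDisk: string[] = [];
  for (const row of indexed) {
    const current = onDisk.get(row.path);
    if (current === undefined) missingOnDisk.push(row.path);
    else if (current !== row.content_hash) changed.push(row.path);
  }
  return { changed, missingOnDisk };
}

const stalenessSample = diffHashesSketch(
  [
    { path: "src/a.ts", content_hash: "h1" },
    { path: "src/b.ts", content_hash: "h2" },
    { path: "src/c.ts", content_hash: "h3" },
  ],
  new Map([
    ["src/a.ts", "h1"],
    ["src/b.ts", "h2-modified"],
  ]),
);
```

Because the recipe orders rows by `path ASC`, both result lists come out in path order without an extra sort.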
diff --git a/templates/recipes/files-largest.sql b/templates/recipes/files-largest.sql
new file mode 100644
index 00000000..c0906fef
--- /dev/null
+++ b/templates/recipes/files-largest.sql
@@ -0,0 +1,4 @@
+SELECT path, line_count, size, language
+FROM files
+ORDER BY line_count DESC, path ASC
+LIMIT 20
diff --git a/templates/recipes/index-summary.md b/templates/recipes/index-summary.md
new file mode 100644
index 00000000..60fd634b
--- /dev/null
+++ b/templates/recipes/index-summary.md
@@ -0,0 +1 @@
+Single row: row counts for main tables (quick health snapshot)
diff --git a/templates/recipes/index-summary.sql b/templates/recipes/index-summary.sql
new file mode 100644
index 00000000..638a9d76
--- /dev/null
+++ b/templates/recipes/index-summary.sql
@@ -0,0 +1,6 @@
+SELECT
+  (SELECT COUNT(*) FROM files) AS files,
+  (SELECT COUNT(*) FROM symbols) AS symbols,
+  (SELECT COUNT(*) FROM imports) AS imports,
+  (SELECT COUNT(*) FROM components) AS components,
+  (SELECT COUNT(*) FROM dependencies) AS dependencies
diff --git a/templates/recipes/markers-by-kind.md b/templates/recipes/markers-by-kind.md
new file mode 100644
index 00000000..43168771
--- /dev/null
+++ b/templates/recipes/markers-by-kind.md
@@ -0,0 +1 @@
+Marker counts by kind (TODO, FIXME, …)
diff --git a/templates/recipes/markers-by-kind.sql b/templates/recipes/markers-by-kind.sql
new file mode 100644
index 00000000..a8bd5a85
--- /dev/null
+++ b/templates/recipes/markers-by-kind.sql
@@ -0,0 +1,4 @@
+SELECT kind, COUNT(*) AS count
+FROM markers
+GROUP BY kind
+ORDER BY count DESC, kind ASC
diff --git a/templates/recipes/visibility-tags.md b/templates/recipes/visibility-tags.md
new file mode 100644
index 00000000..ca8d6ec8
--- /dev/null
+++ b/templates/recipes/visibility-tags.md
@@ -0,0 +1,9 @@
+---
+actions:
+  - type: flag-non-public
+    description: "Treat as not part of the public API unless visibility = 'public': don't import from package consumers; check the visibility tag before extending re-exports."
+---
+
+Symbols carrying a JSDoc visibility tag (public / private / internal / alpha / beta)
+
+Useful for agents to know what is _not_ part of the public API before suggesting imports or extending re-exports. The `visibility` column is structured (parsed at index time, not regex on `doc_comment`).
diff --git a/templates/recipes/visibility-tags.sql b/templates/recipes/visibility-tags.sql
new file mode 100644
index 00000000..a29cef82
--- /dev/null
+++ b/templates/recipes/visibility-tags.sql
@@ -0,0 +1,5 @@
+SELECT name, kind, visibility, file_path, line_start, signature, doc_comment
+FROM symbols
+WHERE visibility IS NOT NULL
+ORDER BY file_path ASC, line_start ASC
+LIMIT 100
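With the bundled recipes now living as `.sql` / `.md` pairs, the documented collision rule (project recipes win on id collision and carry `shadows: true`) can be pictured as a keyed merge. The following is a sketch under stated assumptions, not the registry's actual implementation: `RecipeSketch` and `mergeRecipesSketch` are hypothetical, and sorting the result by id is an assumption based on the "Sorted recipe ids" doc comment.

```typescript
type RecipeSourceSketch = "bundled" | "project";

// Hypothetical minimal recipe shape; the real LoadedRecipe type is not shown in full.
interface RecipeSketch {
  id: string;
  sql: string;
  source: RecipeSourceSketch;
  shadows?: boolean;
}

// Project entries overwrite bundled entries with the same id and are flagged as shadowing.
function mergeRecipesSketch(
  bundled: { id: string; sql: string }[],
  project: { id: string; sql: string }[],
): RecipeSketch[] {
  const byId = new Map<string, RecipeSketch>();
  for (const r of bundled) {
    byId.set(r.id, { id: r.id, sql: r.sql, source: "bundled" });
  }
  for (const r of project) {
    const entry: RecipeSketch = { id: r.id, sql: r.sql, source: "project" };
    if (byId.has(r.id)) entry.shadows = true; // overrides a bundled id
    byId.set(r.id, entry);
  }
  // Assumption: the catalog is returned in id order.
  return Array.from(byId.values()).sort((a, b) =>
    a.id < b.id ? -1 : a.id > b.id ? 1 : 0,
  );
}

const mergedSample = mergeRecipesSketch(
  [
    { id: "fan-out", sql: "SELECT 1" },
    { id: "fan-in", sql: "SELECT 2" },
  ],
  [
    { id: "fan-out", sql: "SELECT 3" },
    { id: "team-owners", sql: "SELECT 4" },
  ],
);
```

In this sample, `fan-out` resolves to the project SQL with `shadows: true`, while `team-owners` is a plain project recipe with no `shadows` field, matching how the catalog entry only sets the flag when it is true.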