stainless-code · SutuSebastian · Jun 10, 2026 · Jun 10, 2026 · Jun 10, 2026 · Jun 10, 2026
diff --git a/.agents/skills/harden-pr/SKILL.md b/.agents/skills/harden-pr/SKILL.md
@@ -53,16 +53,16 @@ Default to **lite** when invoked immediately after a slice commit. Default to **
 
 Reviewers optimize for this bar on in-scope files. **Full** mode applies it to the entire `origin/main...HEAD` diff; **lite** to the slice diff.
 
-| Area            | Pristine =                                                                                                         |
-| --------------- | ------------------------------------------------------------------------------------------------------------------ |
-| **Correctness** | No known bugs or unhandled edge cases in changed paths; behavior matches intent anchor                             |
-| **Tests**       | Changed behavior covered; affected tests pass                                                                      |
-| **Checks**      | Format, lint, typecheck clean on touched files ([`verify-after-each-step`](../../rules/verify-after-each-step.md)) |
-| **Docs**        | User-visible changes reflected in docs, changesets, help text — no drift                                           |
-| **Surfaces**    | No maintainer leaks into consumer surfaces ([`consumer-surfaces`](../../rules/consumer-surfaces.md))               |
-| **Structure**   | No boundary violations or barrel bypasses in the diff                                                              |
-| **Hygiene**     | No dead code, TODO slop, or sloppy naming in touched files; errors actionable                                      |
-| **Ship shape**  | A reviewer could merge without "fix before ship" notes (except deferred out-of-scope nits)                         |
+| Area            | Pristine =                                                                                                                                                                                                        |
+| --------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| **Correctness** | No known bugs or unhandled edge cases in changed paths; behavior matches intent anchor                                                                                                                            |
+| **Tests**       | Changed behavior covered; affected tests pass                                                                                                                                                                     |
+| **Checks**      | Format, lint, typecheck clean on touched files ([`verify-after-each-step`](../../rules/verify-after-each-step.md))                                                                                                |
+| **Docs**        | User-visible changes reflected in docs, changesets, help text — no drift; **shipped `docs/plans/<topic>.md` deleted + lifted** in the same PR ([`docs-governance`](../docs-governance/SKILL.md) § Closing a plan) |
+| **Surfaces**    | No maintainer leaks into consumer surfaces ([`consumer-surfaces`](../../rules/consumer-surfaces.md))                                                                                                              |
+| **Structure**   | No boundary violations or barrel bypasses in the diff                                                                                                                                                             |
+| **Hygiene**     | No dead code, TODO slop, or sloppy naming in touched files; errors actionable                                                                                                                                     |
+| **Ship shape**  | A reviewer could merge without "fix before ship" notes (except deferred out-of-scope nits)                                                                                                                        |
 
 If a finding moves the bar toward pristine and stays in-bounds → **fix it**, including nits in touched files.
 
@@ -80,7 +80,9 @@ Reviewers treat the anchor as contract. Findings that would violate it → **rep
 
 **Fix:** bugs, missing tests, docs/changeset drift, lint/type/format, error-handling gaps, edge cases, **behavior-preserving refactors in touched files**, in-scope nits (naming, comment hygiene, cheap lint fixes).
 
-**Report only:** redesign, new capabilities, semantic API changes, nits outside the diff, refactors unrelated to a flagged issue.
+**Report only:** redesign, semantic API changes, nits outside the diff, refactors unrelated to a flagged issue.
+
+**Do not defer complements:** agent-surface parity (rule/skill/MCP), glossary/architecture/golden-queries contracts, script/golden tests for acceptance criteria, and cross-links named in the plan ship in the **same PR** — not "optional v2" or post-merge unless the plan **Out of scope** section explicitly excludes them.
 
 ## Reviewer roster
 
@@ -89,7 +91,7 @@ Spawn applicable reviewers **in parallel** via subagents in **one batch per pass
 ### Core (always)
 
 1. **Correctness** — gaps vs production bar; bugs, edge cases, missing tests in changed paths
-2. **Ship-readiness** — gaps vs production bar; docs, changesets, consumer-surface leaks, error messages; run [`verify-after-each-step`](../../rules/verify-after-each-step.md) checks on touched files
+2. **Ship-readiness** — gaps vs production bar; docs, changesets, consumer-surface leaks, error messages; **grep inbound refs → delete shipped plan file → lift to `golden-queries.md` / `architecture.md` / `roadmap.md`**; run [`verify-after-each-step`](../../rules/verify-after-each-step.md) checks on touched files
 3. **Structure (lite)** — gaps vs production bar; boundary smells on the diff (imports across declared layers, barrel bypasses); query codemap per [`codemap`](../codemap/SKILL.md)
 
 ### Extended (adaptive — spawn when diff triggers match)
@@ -122,7 +124,7 @@ loop:
   pass += 1
   goto loop
 capped:
-  emit deferred-nits list
+  emit deferred-nits list (each nit must cite plan Out of scope or cross-PR blocker — not "optional")
 done:
   if uncommitted fixes → git commit -m "harden: …"
   emit final report (include babysit one-liner if full mode)

diff --git a/.changeset/high-crap-score.md b/.changeset/high-crap-score.md
@@ -0,0 +1,7 @@
+---
+"@stainless-code/codemap": patch
+---
+
+Add `high-crap-score` recipe: CRAP ranking with measured coverage when ingested, or graph-estimated 85/40/0% tiers from test reachability otherwise.
+
+Extend `unimported-exports` with `unresolved_import_blind_spot` reason and `evidence_json` (unresolved import hop) so dead-export / high-CRAP triage does not over-trust the graph past alias blind spots.
diff --git a/docs/architecture.md b/docs/architecture.md
@@ -192,6 +192,8 @@ Three **mutually exclusive** CLI entry shapes; all converge on `applyDiffPayload
 
 **Evidence columns (high-judgment recipes):** Some bundled recipes add optional **`reason`** and **`evidence_json`** TEXT columns on each result row — factual detection path for agents, not pass/fail verdicts. Contract: [golden-queries.md § Evidence columns](./golden-queries.md#evidence-columns-high-judgment-recipes).
 
+**Coverage columns (CRAP recipes):** `high-crap-score` adds **`coverage_source`** and **`effective_coverage_pct`** — measured vs graph-estimated undertest signal. Contract: [golden-queries.md § Coverage columns](./golden-queries.md#coverage-columns-crap--enrichment-recipes).
+
 **Recipes wiring:** **`src/application/recipes-loader.ts`** (pure transport-agnostic loader) + **`src/application/query-recipes.ts`** (cache + public API — `getQueryRecipeSql` / `getQueryRecipeActions` / `getQueryRecipeParams` / `listQueryRecipeIds` / `listQueryRecipeCatalog` / `getQueryRecipeCatalogEntry`, shared by CLI + MCP). Recipes live as file pairs: **`<id>.sql`** + optional **`<id>.md`**. The loader reads `templates/recipes/` (bundled, ships in npm package next to `templates/agents/`) and `<state-dir>/recipes/` (project-local — default `.codemap/recipes/`; honors `--state-dir` / `CODEMAP_STATE_DIR`; root-only resolution per the registry plan, no walk-up). Project recipes win on id collision; entries that override a bundled id carry **`shadows: true`** in the catalog so agents reading `codemap://recipes` at session start see when a recipe behaves differently from the documented bundled version. Per-row **`actions`** templates and recipe **`params`** declarations live in YAML frontmatter on each `<id>.md` — uniform shape across bundled + project. Param types are `string | number | boolean`; CLI passes values via repeatable `--params key=value[,key=value]`, MCP / HTTP pass nested `params: {key: value}` to `query_recipe`. Validation runs before SQL binding; missing / unknown / malformed params return the same `{error}` envelope as query failures. Hand-rolled YAML parser is scoped to block-list `actions:` and `params:` only (no `js-yaml` dep). Load-time validation rejects empty SQL and DML / DDL keywords (`INSERT` / `UPDATE` / `DELETE` / `DROP` / `CREATE` / `ALTER` / `ATTACH` / `DETACH` / `REPLACE` / `TRUNCATE` / `VACUUM` / `PRAGMA`) with recipe-aware error messages — defence in depth alongside the runtime `PRAGMA query_only=1` backstop in `query-engine.ts` (PR #35). `<state-dir>/index.db` is gitignored; `<state-dir>/recipes/` is NOT (verified via `git check-ignore`) — recipes are git-tracked source code authored for human review.
 
 **Tool / resource handlers (transport-agnostic):** **`src/application/tool-handlers.ts`** + **`src/application/resource-handlers.ts`** — pure functions that take the args object an MCP tool / resource URI accepts and return a discriminated **`ToolResult`** (`{ok: true, format: 'json'|'sarif'|'annotations'|'mermaid'|'diff'|'diff-json'|'codeclimate'|'badge', payload}` — badge arm also carries `badgeStyle`; `{ok: false, error}`) or a **`ResourcePayload`** (`{mimeType, text}`). MCP and HTTP both wrap the same handlers — MCP translates to `{content: [{type: "text", text}]}`, HTTP translates to `(status, body)` with the right `Content-Type`. Engine layer untouched; transport changes don't ripple into the SQL.

diff --git a/docs/glossary.md b/docs/glossary.md
@@ -151,6 +151,10 @@ Opt-in FTS5 virtual table over file content (`tokenize='porter unicode61'`). Alw
 
 Output mode rendering `{from, to, label?, kind?}` rows as a Mermaid `flowchart LR` diagram. Sibling of `--format sarif` / `--format annotations` in `application/output-formatters.ts`. **Bounded-input contract** (50-edge ceiling; `MERMAID_MAX_EDGES`) — unbounded inputs reject with a scope-suggestion error naming the recipe + count + scoping knobs (`LIMIT` / `--via` / `WHERE`). Auto-truncation explicitly out of scope (would be a verdict masquerading as output mode, violating the predicate-as-API moat). Recipes / ad-hoc SQL must alias columns to the `{from, to}` shape (e.g. `SELECT from_path AS "from", to_path AS "to" FROM dependencies LIMIT 50`).
 
+### CRAP score / `high-crap-score` / `coverage_source` / `effective_coverage_pct`
+
+Change Risk Anti-Patterns score per published formula: `CC² × (1 - effective_coverage/100)³ + CC` where `CC = symbols.complexity`. Bundled recipe **`high-crap-score`** ranks symbols at or above `min_crap` (default 30). **`effective_coverage_pct`** uses ingested **`coverage.coverage_pct`** when a row exists (**`coverage_source: measured`**), else graph-estimated tiers from test reachability (**`coverage_source: estimated`**: 85% direct test reference, 40% file reachable from tests via value-only **`dependencies`**, 0% otherwise). Heuristic only — not execution coverage; prefer **`codemap ingest-coverage`** before CI gates. Complements **`high-complexity-untested`** when coverage is not ingested.
+
 ### `coverage` (table)
 
 Statement coverage ingested from Istanbul JSON, LCOV, or V8 runtime (`NODE_V8_COVERAGE=...` directory via `--runtime`) via `codemap ingest-coverage <path>`. Natural-key PK `(file_path, name, line_start)` — intentionally **not** a FK to `symbols.id` because `symbols` re-creates with fresh AUTOINCREMENT ids on every `--full` reindex; the natural-key approach lets coverage rows survive that churn (`coverage` is also intentionally absent from `dropAll()`, joins the `query_baselines` precedent). Columns: `coverage_pct REAL` (`NULL` when `total_statements = 0` — "untested" and "no testable code" are different signals), `hit_statements`, `total_statements`. Orphan rows (file deleted from project) are cleaned by an explicit `DELETE FROM coverage WHERE file_path NOT IN (SELECT path FROM files)` at the end of every ingest. Three meta keys (`coverage_last_ingested_at` / `_path` / `_format`) record freshness — single ingest at a time, so format is meta-level not per-row.

diff --git a/docs/golden-queries.md b/docs/golden-queries.md
@@ -68,7 +68,11 @@ Scenarios live in **`fixtures/golden/scenarios.json`** (Tier A) or optional **`s
 
 ### Evidence columns (high-judgment recipes)
 
-Some bundled recipes add optional **`reason`** (TEXT) and **`evidence_json`** (TEXT, JSON array) columns on each row — factual detection path for agents, not engine verdicts. See [plans/evidence-chains-on-recipe-rows.md](./plans/evidence-chains-on-recipe-rows.md). Goldens assert these columns when the recipe ships evidence (`boundary-violations`, `deprecated-symbols`, `unimported-exports`).
+Some bundled recipes add optional **`reason`** (TEXT) and **`evidence_json`** (TEXT, JSON array) columns on each row — factual detection path for agents, not engine verdicts (Moat A — not pass/fail verdicts). Bounded subqueries cap evidence at three hops; list caps append `{"truncated":true}`. `unimported-exports` reasons: `no_direct_import`, `reexport_chain_possible`, `unresolved_import_blind_spot`. Goldens assert these columns when the recipe ships evidence (`boundary-violations`, `deprecated-symbols`, `unimported-exports`).
+
+### Coverage columns (CRAP / enrichment recipes)
+
+`high-crap-score` adds **`coverage_source`** (`measured` \| `estimated`) and **`effective_coverage_pct`** on each row — measured when `coverage` has a matching symbol row after `ingest-coverage`; otherwise graph-estimated tiers from test reachability. Goldens assert `coverage_source` when the recipe ships coverage semantics (`high-crap-score`); measured override is covered by `scripts/high-crap-score-measured.test.mjs`.
 
 ---