Merged
28 changes: 15 additions & 13 deletions .agents/skills/harden-pr/SKILL.md
@@ -53,16 +53,16 @@ Default to **lite** when invoked immediately after a slice commit. Default to **

Reviewers optimize for this bar on in-scope files. **Full** mode applies it to the entire `origin/main...HEAD` diff; **lite** to the slice diff.

-| Area | Pristine = |
-| --------------- | ------------------------------------------------------------------------------------------------------------------ |
-| **Correctness** | No known bugs or unhandled edge cases in changed paths; behavior matches intent anchor |
-| **Tests** | Changed behavior covered; affected tests pass |
-| **Checks** | Format, lint, typecheck clean on touched files ([`verify-after-each-step`](../../rules/verify-after-each-step.md)) |
-| **Docs** | User-visible changes reflected in docs, changesets, help text — no drift |
-| **Surfaces** | No maintainer leaks into consumer surfaces ([`consumer-surfaces`](../../rules/consumer-surfaces.md)) |
-| **Structure** | No boundary violations or barrel bypasses in the diff |
-| **Hygiene** | No dead code, TODO slop, or sloppy naming in touched files; errors actionable |
-| **Ship shape** | A reviewer could merge without "fix before ship" notes (except deferred out-of-scope nits) |
+| Area | Pristine = |
+| --------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| **Correctness** | No known bugs or unhandled edge cases in changed paths; behavior matches intent anchor |
+| **Tests** | Changed behavior covered; affected tests pass |
+| **Checks** | Format, lint, typecheck clean on touched files ([`verify-after-each-step`](../../rules/verify-after-each-step.md)) |
+| **Docs** | User-visible changes reflected in docs, changesets, help text — no drift; **shipped `docs/plans/<topic>.md` deleted + lifted** in the same PR ([`docs-governance`](../docs-governance/SKILL.md) § Closing a plan) |
+| **Surfaces** | No maintainer leaks into consumer surfaces ([`consumer-surfaces`](../../rules/consumer-surfaces.md)) |
+| **Structure** | No boundary violations or barrel bypasses in the diff |
+| **Hygiene** | No dead code, TODO slop, or sloppy naming in touched files; errors actionable |
+| **Ship shape** | A reviewer could merge without "fix before ship" notes (except deferred out-of-scope nits) |

If a finding moves the bar toward pristine and stays in-bounds → **fix it**, including nits in touched files.

@@ -80,7 +80,9 @@ Reviewers treat the anchor as contract. Findings that would violate it → **rep

**Fix:** bugs, missing tests, docs/changeset drift, lint/type/format, error-handling gaps, edge cases, **behavior-preserving refactors in touched files**, in-scope nits (naming, comment hygiene, cheap lint fixes).

-**Report only:** redesign, new capabilities, semantic API changes, nits outside the diff, refactors unrelated to a flagged issue.
+**Report only:** redesign, semantic API changes, nits outside the diff, refactors unrelated to a flagged issue.
+
+**Do not defer complements:** agent-surface parity (rule/skill/MCP), glossary/architecture/golden-queries contracts, script/golden tests for acceptance criteria, and cross-links named in the plan ship in the **same PR** — not "optional v2" or post-merge unless the plan **Out of scope** section explicitly excludes them.

## Reviewer roster

@@ -89,7 +91,7 @@ Spawn applicable reviewers **in parallel** via subagents in **one batch per pass
### Core (always)

1. **Correctness** — gaps vs production bar; bugs, edge cases, missing tests in changed paths
-2. **Ship-readiness** — gaps vs production bar; docs, changesets, consumer-surface leaks, error messages; run [`verify-after-each-step`](../../rules/verify-after-each-step.md) checks on touched files
+2. **Ship-readiness** — gaps vs production bar; docs, changesets, consumer-surface leaks, error messages; **grep inbound refs → delete shipped plan file → lift to `golden-queries.md` / `architecture.md` / `roadmap.md`**; run [`verify-after-each-step`](../../rules/verify-after-each-step.md) checks on touched files
3. **Structure (lite)** — gaps vs production bar; boundary smells on the diff (imports across declared layers, barrel bypasses); query codemap per [`codemap`](../codemap/SKILL.md)

### Extended (adaptive — spawn when diff triggers match)
@@ -122,7 +124,7 @@ loop:
pass += 1
goto loop
capped:
-emit deferred-nits list
+emit deferred-nits list (each nit must cite plan Out of scope or cross-PR blocker — not "optional")
done:
if uncommitted fixes → git commit -m "harden: …"
emit final report (include babysit one-liner if full mode)
7 changes: 7 additions & 0 deletions .changeset/high-crap-score.md
@@ -0,0 +1,7 @@
+---
+"@stainless-code/codemap": patch
+---
+
+Add `high-crap-score` recipe: CRAP ranking with measured coverage when ingested, or graph-estimated 85/40/0% tiers from test reachability otherwise.
+
+Extend `unimported-exports` with `unresolved_import_blind_spot` reason and `evidence_json` (unresolved import hop) so dead-export / high-CRAP triage does not over-trust the graph past alias blind spots.
2 changes: 2 additions & 0 deletions docs/architecture.md
@@ -192,6 +192,8 @@ Three **mutually exclusive** CLI entry shapes; all converge on `applyDiffPayload

**Evidence columns (high-judgment recipes):** Some bundled recipes add optional **`reason`** and **`evidence_json`** TEXT columns on each result row — factual detection path for agents, not pass/fail verdicts. Contract: [golden-queries.md § Evidence columns](./golden-queries.md#evidence-columns-high-judgment-recipes).

+**Coverage columns (CRAP recipes):** `high-crap-score` adds **`coverage_source`** and **`effective_coverage_pct`** — measured vs graph-estimated undertest signal. Contract: [golden-queries.md § Coverage columns](./golden-queries.md#coverage-columns-crap--enrichment-recipes).

**Recipes wiring:** **`src/application/recipes-loader.ts`** (pure transport-agnostic loader) + **`src/application/query-recipes.ts`** (cache + public API — `getQueryRecipeSql` / `getQueryRecipeActions` / `getQueryRecipeParams` / `listQueryRecipeIds` / `listQueryRecipeCatalog` / `getQueryRecipeCatalogEntry`, shared by CLI + MCP). Recipes live as file pairs: **`<id>.sql`** + optional **`<id>.md`**. The loader reads `templates/recipes/` (bundled, ships in npm package next to `templates/agents/`) and `<state-dir>/recipes/` (project-local — default `.codemap/recipes/`; honors `--state-dir` / `CODEMAP_STATE_DIR`; root-only resolution per the registry plan, no walk-up). Project recipes win on id collision; entries that override a bundled id carry **`shadows: true`** in the catalog so agents reading `codemap://recipes` at session start see when a recipe behaves differently from the documented bundled version. Per-row **`actions`** templates and recipe **`params`** declarations live in YAML frontmatter on each `<id>.md` — uniform shape across bundled + project. Param types are `string | number | boolean`; CLI passes values via repeatable `--params key=value[,key=value]`, MCP / HTTP pass nested `params: {key: value}` to `query_recipe`. Validation runs before SQL binding; missing / unknown / malformed params return the same `{error}` envelope as query failures. Hand-rolled YAML parser is scoped to block-list `actions:` and `params:` only (no `js-yaml` dep). Load-time validation rejects empty SQL and DML / DDL keywords (`INSERT` / `UPDATE` / `DELETE` / `DROP` / `CREATE` / `ALTER` / `ATTACH` / `DETACH` / `REPLACE` / `TRUNCATE` / `VACUUM` / `PRAGMA`) with recipe-aware error messages — defence in depth alongside the runtime `PRAGMA query_only=1` backstop in `query-engine.ts` (PR #35). `<state-dir>/index.db` is gitignored; `<state-dir>/recipes/` is NOT (verified via `git check-ignore`) — recipes are git-tracked source code authored for human review.
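
The load-time SQL check described above can be sketched as follows. This is a minimal illustration, not the loader's actual code: the function name `validateRecipeSql`, the error wording, and the word-boundary regex are assumptions; only the keyword list is taken from the text.

```typescript
// Keyword list copied from the architecture note; all other names are hypothetical.
const FORBIDDEN_KEYWORDS = [
  "INSERT", "UPDATE", "DELETE", "DROP", "CREATE", "ALTER",
  "ATTACH", "DETACH", "REPLACE", "TRUNCATE", "VACUUM", "PRAGMA",
] as const;

// Case-insensitive whole-word match on any forbidden keyword.
const FORBIDDEN_RE = new RegExp(`\\b(${FORBIDDEN_KEYWORDS.join("|")})\\b`, "i");

// Returns a recipe-aware error message, or null when the SQL passes the check.
function validateRecipeSql(recipeId: string, sql: string): string | null {
  if (sql.trim() === "") {
    return `recipe "${recipeId}": SQL is empty`;
  }
  const hit = FORBIDDEN_RE.exec(sql)?.[1];
  if (hit !== undefined) {
    return `recipe "${recipeId}": ${hit.toUpperCase()} is not allowed in a read-only recipe`;
  }
  return null;
}
```

A scan like this is deliberately coarse: it would also flag a keyword inside a string literal or a call to SQLite's `replace()` function. How the real loader treats those cases is not stated here.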

**Tool / resource handlers (transport-agnostic):** **`src/application/tool-handlers.ts`** + **`src/application/resource-handlers.ts`** — pure functions that take the args object an MCP tool / resource URI accepts and return a discriminated **`ToolResult`** (`{ok: true, format: 'json'|'sarif'|'annotations'|'mermaid'|'diff'|'diff-json'|'codeclimate'|'badge', payload}` — badge arm also carries `badgeStyle`; `{ok: false, error}`) or a **`ResourcePayload`** (`{mimeType, text}`). MCP and HTTP both wrap the same handlers — MCP translates to `{content: [{type: "text", text}]}`, HTTP translates to `(status, body)` with the right `Content-Type`. Engine layer untouched; transport changes don't ripple into the SQL.
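
As a rough sketch of the shape described above, the result union and its two transport wrappers might look like this. The format names and the MCP content shape come from the text; the `unknown` payload type, the `string` type for `badgeStyle`, the HTTP status codes, and the content-type mapping are assumptions made for illustration.

```typescript
type PlainFormat =
  | "json" | "sarif" | "annotations" | "mermaid"
  | "diff" | "diff-json" | "codeclimate";

// Discriminated on `ok`; the badge arm carries an extra `badgeStyle` field.
type ToolResult =
  | { ok: true; format: PlainFormat; payload: unknown }
  | { ok: true; format: "badge"; payload: unknown; badgeStyle: string }
  | { ok: false; error: string };

function payloadText(payload: unknown): string {
  return typeof payload === "string" ? payload : JSON.stringify(payload);
}

// MCP wrapper: everything becomes a single text content item.
function toMcp(result: ToolResult): { content: { type: "text"; text: string }[] } {
  const text = result.ok
    ? payloadText(result.payload)
    : JSON.stringify({ error: result.error });
  return { content: [{ type: "text", text }] };
}

// HTTP wrapper: status codes and content types here are illustrative guesses.
function toHttp(result: ToolResult): { status: number; contentType: string; body: string } {
  if (!result.ok) {
    return { status: 400, contentType: "application/json", body: JSON.stringify({ error: result.error }) };
  }
  const contentType = result.format === "json" ? "application/json" : "text/plain";
  return { status: 200, contentType, body: payloadText(result.payload) };
}
```

Because both wrappers consume the same `ToolResult`, adding a transport only means writing another translation function; the handler that produced the result does not change.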
4 changes: 4 additions & 0 deletions docs/glossary.md
@@ -151,6 +151,10 @@ Opt-in FTS5 virtual table over file content (`tokenize='porter unicode61'`). Alw

Output mode rendering `{from, to, label?, kind?}` rows as a Mermaid `flowchart LR` diagram. Sibling of `--format sarif` / `--format annotations` in `application/output-formatters.ts`. **Bounded-input contract** (50-edge ceiling; `MERMAID_MAX_EDGES`) — unbounded inputs reject with a scope-suggestion error naming the recipe + count + scoping knobs (`LIMIT` / `--via` / `WHERE`). Auto-truncation explicitly out of scope (would be a verdict masquerading as output mode, violating the predicate-as-API moat). Recipes / ad-hoc SQL must alias columns to the `{from, to}` shape (e.g. `SELECT from_path AS "from", to_path AS "to" FROM dependencies LIMIT 50`).
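
The bounded-input contract can be illustrated with a small renderer. This is a sketch under stated assumptions, not the code in `application/output-formatters.ts`: the function name `renderMermaid`, the synthetic node ids (`n0`, `n1`, ...), the error wording, and the choice to ignore `kind` are all hypothetical. Only the row shape, the 50-edge ceiling, and the reject-rather-than-truncate behavior come from the text.

```typescript
const MERMAID_MAX_EDGES = 50;

interface EdgeRow {
  from: string;
  to: string;
  label?: string;
  kind?: string; // not rendered in this sketch
}

function renderMermaid(rows: EdgeRow[], recipeId: string): string {
  // Reject oversized inputs instead of silently truncating them.
  if (rows.length > MERMAID_MAX_EDGES) {
    throw new Error(
      `recipe "${recipeId}" returned ${rows.length} edges (max ${MERMAID_MAX_EDGES}); ` +
        "narrow the result with LIMIT, --via, or WHERE",
    );
  }

  // File paths are not valid Mermaid ids, so map each name to a short synthetic id.
  const ids = new Map<string, string>();
  const nodeId = (name: string): string => {
    let id = ids.get(name);
    if (id === undefined) {
      id = `n${ids.size}`;
      ids.set(name, id);
    }
    return id;
  };
  // Mermaid accepts #quot; as an entity for a double quote inside quoted text.
  const quote = (s: string): string => `"${s.replace(/"/g, "#quot;")}"`;

  const lines = ["flowchart LR"];
  for (const row of rows) {
    const arrow = row.label ? `-->|${quote(row.label)}|` : "-->";
    lines.push(`  ${nodeId(row.from)}[${quote(row.from)}] ${arrow} ${nodeId(row.to)}[${quote(row.to)}]`);
  }
  return lines.join("\n");
}
```

Throwing on the 51st edge keeps the scoping decision with the caller, which matches the stated reason for leaving auto-truncation out of scope.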

+### CRAP score / `high-crap-score` / `coverage_source` / `effective_coverage_pct`
+
+Change Risk Anti-Patterns score per published formula: `CC² × (1 - effective_coverage/100)³ + CC` where `CC = symbols.complexity`. Bundled recipe **`high-crap-score`** ranks symbols at or above `min_crap` (default 30). **`effective_coverage_pct`** uses ingested **`coverage.coverage_pct`** when a row exists (**`coverage_source: measured`**), else graph-estimated tiers from test reachability (**`coverage_source: estimated`**: 85% direct test reference, 40% file reachable from tests via value-only **`dependencies`**, 0% otherwise). Heuristic only — not execution coverage; prefer **`codemap ingest-coverage`** before CI gates. Complements **`high-complexity-untested`** when coverage is not ingested.
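
To make the formula concrete, here is a small worked sketch. The recipe itself is SQL; this TypeScript function is only an illustration, and the name `crapScore` is hypothetical.

```typescript
// CRAP = CC^2 * (1 - coverage/100)^3 + CC
function crapScore(complexity: number, effectiveCoveragePct: number): number {
  const uncovered = 1 - effectiveCoveragePct / 100;
  return complexity ** 2 * uncovered ** 3 + complexity;
}

// A symbol with CC = 10 at each coverage level mentioned in the entry:
const atZero = crapScore(10, 0); // 110: no coverage, full quadratic penalty
const atForty = crapScore(10, 40); // about 31.6: "file reachable from tests" tier
const atEightyFive = crapScore(10, 85); // about 10.34: "direct test reference" tier
const atFull = crapScore(10, 100); // 10: fully covered, score collapses to CC

// With the default min_crap of 30, the first two would be ranked, the last two would not.
const DEFAULT_MIN_CRAP = 30;
const ranked = [atZero, atForty, atEightyFive, atFull].filter((s) => s >= DEFAULT_MIN_CRAP);
```

The cubic term is why the estimated tiers matter so much: moving the same symbol from the 0% tier to the 40% tier cuts its score by more than two thirds.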

### `coverage` (table)

Statement coverage ingested from Istanbul JSON, LCOV, or V8 runtime (`NODE_V8_COVERAGE=...` directory via `--runtime`) via `codemap ingest-coverage <path>`. Natural-key PK `(file_path, name, line_start)` — intentionally **not** a FK to `symbols.id` because `symbols` re-creates with fresh AUTOINCREMENT ids on every `--full` reindex; the natural-key approach lets coverage rows survive that churn (`coverage` is also intentionally absent from `dropAll()`, joins the `query_baselines` precedent). Columns: `coverage_pct REAL` (`NULL` when `total_statements = 0` — "untested" and "no testable code" are different signals), `hit_statements`, `total_statements`. Orphan rows (file deleted from project) are cleaned by an explicit `DELETE FROM coverage WHERE file_path NOT IN (SELECT path FROM files)` at the end of every ingest. Three meta keys (`coverage_last_ingested_at` / `_path` / `_format`) record freshness — single ingest at a time, so format is meta-level not per-row.
6 changes: 5 additions & 1 deletion docs/golden-queries.md
@@ -68,7 +68,11 @@ Scenarios live in **`fixtures/golden/scenarios.json`** (Tier A) or optional **`s

### Evidence columns (high-judgment recipes)

-Some bundled recipes add optional **`reason`** (TEXT) and **`evidence_json`** (TEXT, JSON array) columns on each row — factual detection path for agents, not engine verdicts. See [plans/evidence-chains-on-recipe-rows.md](./plans/evidence-chains-on-recipe-rows.md). Goldens assert these columns when the recipe ships evidence (`boundary-violations`, `deprecated-symbols`, `unimported-exports`).
+Some bundled recipes add optional **`reason`** (TEXT) and **`evidence_json`** (TEXT, JSON array) columns on each row — factual detection path for agents, not engine verdicts (Moat A — not pass/fail verdicts). Bounded subqueries cap evidence at three hops; list caps append `{"truncated":true}`. `unimported-exports` reasons: `no_direct_import`, `reexport_chain_possible`, `unresolved_import_blind_spot`. Goldens assert these columns when the recipe ships evidence (`boundary-violations`, `deprecated-symbols`, `unimported-exports`).
+
+### Coverage columns (CRAP / enrichment recipes)
+
+`high-crap-score` adds **`coverage_source`** (`measured` \| `estimated`) and **`effective_coverage_pct`** on each row — measured when `coverage` has a matching symbol row after `ingest-coverage`; otherwise graph-estimated tiers from test reachability. Goldens assert `coverage_source` when the recipe ships coverage semantics (`high-crap-score`); measured override is covered by `scripts/high-crap-score-measured.test.mjs`.
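
The measured-versus-estimated choice can be sketched as a plain function. The recipe implements this in SQL, so the names `CoverageInputs` and `effectiveCoverage` are hypothetical; the 85/40/0 tiers are the ones stated in the changeset and glossary entries of this PR. How a `NULL` `coverage_pct` is treated is not stated here, so the sketch models only "row present with a value" versus "no row".

```typescript
type CoverageSource = "measured" | "estimated";

interface CoverageInputs {
  measuredPct?: number; // coverage.coverage_pct when a matching row exists
  directTestReference: boolean; // symbol is referenced directly from a test
  fileReachableFromTests: boolean; // file reachable from tests via value-only dependencies
}

interface CoverageColumns {
  coverage_source: CoverageSource;
  effective_coverage_pct: number;
}

function effectiveCoverage(input: CoverageInputs): CoverageColumns {
  // Measured coverage always wins, including a measured 0%.
  if (input.measuredPct !== undefined) {
    return { coverage_source: "measured", effective_coverage_pct: input.measuredPct };
  }
  // Otherwise fall back to the graph-estimated tiers.
  if (input.directTestReference) {
    return { coverage_source: "estimated", effective_coverage_pct: 85 };
  }
  if (input.fileReachableFromTests) {
    return { coverage_source: "estimated", effective_coverage_pct: 40 };
  }
  return { coverage_source: "estimated", effective_coverage_pct: 0 };
}
```

Keeping `coverage_source` on every row lets a consumer tell a real 0% apart from the estimated 0% tier, which is the distinction the goldens assert.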

---
