Commit b5679a6
authored
* feat(coverage): add `coverage` table + SCHEMA_VERSION 5→6 (Tracer 1)
First tracer of the static coverage ingestion plan
(docs/plans/coverage-ingestion.md). Pure schema change, no engine yet.
What lands:
- New `coverage` table with natural-key PK (file_path, name, line_start)
per D6. Deliberately NOT a FK to symbols.id: dropAll() drops symbols
on every --full reindex and the recreated rows get fresh AUTOINCREMENT
ids; CASCADE would wipe coverage on every full rebuild. Natural key
sidesteps the entire FK/CASCADE hazard. Orphan cleanup (file deleted
from project) lives at the end of every ingest in tracer 2a — exercised
by the new test using a raw DELETE so the contract is locked in
before the engine code lands.
- `idx_coverage_file_name` mirrors the typical join shape (the killer
recipe joins symbols ↔ coverage on file_path/name/line_start) and the
(file_path, name) prefix also covers the GROUP BY file_path scan used
by the bundled files-by-coverage recipe (D2 + D13).
- `coverage` table is intentionally absent from `dropAll()` (joins the
query_baselines precedent), so user rows survive --full and
SCHEMA_VERSION-mismatch rebuilds.
- SCHEMA_VERSION bumped 5→6. Existing installs auto-rebuild on next
`codemap` run via the schema-mismatch path in createSchema(); the
new table is empty until first `codemap ingest-coverage` invocation
(tracer 3), which is the correct initial state per D12.
- New db.test.ts case exercises the full contract: round-trip of a
partial-coverage row set including the total_statements=0 → NULL edge
(D5), survival across dropAll() + createTables(), and the orphan-
cleanup DELETE that ships in tracer 2a's engine.
* feat(coverage): shared upsert core + Istanbul parser (Tracer 2a)
Pure engine + format-agnostic write path. No CLI / FS side effects beyond
db.run on the caller-supplied SQLite handle.
`upsertCoverageRows({db, projectRoot, rows, format, sourcePath})` —
shared core consumed by both Istanbul (this commit) and LCOV (next):
- Normalises absolute paths via toProjectRelative (D8); files outside
the project root land in skipped.unmatched_files.
- Loads symbols per-file once and projects each statement onto the
innermost enclosing symbol via JS-side (line_end - line_start) ASC
tie-break (D7) — avoids the per-statement SQL round-trip the plan
flagged as the hot path.
- Aggregates per (file_path, name, line_start) bucket; total = 0 →
coverage_pct NULL (D5 edge — "untested" ≠ "no testable code").
- Single transaction: per-file DELETE + bulk INSERT for idempotent
re-ingest, then orphan-cleanup DELETE (D6 — no FK / CASCADE means
deleted-file rows accumulate without this sweep), then writes the
three coverage_last_ingested_* meta keys.
`ingestIstanbul({db, projectRoot, payload, sourcePath})` — Istanbul
parser front-end:
- Subset-typed `IstanbulPayload` reads only statementMap + s; ignores
fnMap / branchMap / inputSourceMap so the format can grow without
churning the signature.
- Inner `path` field takes precedence over the absolute-path key
(handles webpack-style symlinked paths).
- Tolerates malformed file entries (missing statementMap or s) — skips
silently rather than throwing, so one corrupt file doesn't poison
the whole ingest.
10 unit tests cover both pieces:
- Shared core: innermost-wins aggregation, statement-outside-symbol
observability, zero-statement-symbol → no row, UPSERT idempotence,
orphan cleanup after file deletion, project-root normalisation,
unmatched-file skipping, meta-key writes.
- Istanbul parser: real-shape payload end-to-end, malformed-entry
tolerance.
* feat(coverage): LCOV parser front-end (Tracer 2b)
Closes the second format axis from the plan — LCOV (`lcov.info`) ingester
that shares the entire write path with the Istanbul ingester from Tracer 2a.
`ingestLcov({db, projectRoot, payload, sourcePath})` is a pure regex
tokenizer over the LCOV record format. Recognised lines:
- `SF:<path>` — start of a file record (sets the "current file")
- `DA:<line>,<exec_count>[,<checksum>]` — one statement per record
- `end_of_record` — closes the current file record
- Comments (`# …`), blank lines, and CRLF endings tolerated
Ignored lines (everything else: `TN:`, `FN:`, `FNDA:`, `FNF:`, `FNH:`,
`BRDA:`, `BRF:`, `BRH:`, `LF:`, `LH:`) — D5 scope is statement coverage
only; the unused records are skipped without churning the parser.
Throws on `DA:` outside an `SF:` block (malformed LCOV — the file would
have nowhere to attach to). Missing `end_of_record` is tolerated by
construction (next `SF:` resets the current file).
Five unit tests cover both the LCOV parser in isolation and the
cross-format equivalence contract:
- `ingestLcov`: well-formed LCOV with multiple SF records →
identical-shape coverage rows; CRLF + comments + blank-line tolerance;
DA-outside-SF throws; optional `DA:` checksum field parsed.
- Cross-format: identical Istanbul + LCOV inputs produce byte-identical
rows in the `coverage` table — the contract that lets Tracer 4 ship
one set of golden snapshots that asserts both formats land on the
same answer.
15 tests total in coverage-engine.test.ts (10 existing + 5 LCOV).
Zero new write-side code — the LCOV parser is pure parse-and-normalise
into the same `CoverageRow[]` the Istanbul parser produces, then both
hand off to `upsertCoverageRows`.
* feat(coverage): codemap ingest-coverage CLI verb (Tracer 3)
Wires the engine (Tracer 2a/2b) into the CLI surface end-to-end. Smoke-
tested against a real Istanbul payload pointed at src/db.ts:
ingest writes 2 symbols (openDb 100%, closeDb 0%); query --json reads
them back with the expected coverage_pct shape.
cli/cmd-ingest-coverage.ts:
- Single positional arg + --json. No --source flag (per plan: format
is auto-detectable from extension; flag noise dropped).
- resolveArtifact() probes the path: .json → istanbul, .info → lcov,
directory → looks for both filenames, errors if both or neither
present (no precedence guessing — explicit is better than implicit).
- File reads use the canonical Node-vs-Bun split: Bun.file().json() /
.text() on Bun (native parser perf for multi-MB Istanbul payloads),
readFile + JSON.parse on Node. Mirrors config.ts. See packaging.md.
- Plain-text terminal output by default; --json emits the IngestResult
envelope. Errors emit {"error":"…"} on stdout under --json, plain
message on stderr otherwise. Sets process.exitCode (no process.exit)
so piped stdout isn't truncated.
cli/main.ts: dispatch new verb between snippet and query (lazy import
matches every other cmd).
cli/bootstrap.ts: validateIndexModeArgs() recognises ingest-coverage as
a known subcommand; printCliUsage() lists it in a new "Coverage ingest"
section.
7 parser tests cover all paths: requires the subcommand position,
help on --help/-h, single-path default, --json flag, missing-path
error, unknown-option error, multiple-paths error.
Engine + CLI smoke verified locally:
$ bun src/index.ts ingest-coverage /tmp/cov.json --json
{"ingested":{"symbols":2,"files":1},"skipped":...,"format":"istanbul"}
* feat(coverage): bundled recipes + fixture coverage data + goldens (Tracer 4)
Closes the agent-surface contract from D13: every common coverage
question gets a `--recipe` verb. Three v1 recipes ship under
templates/recipes/, auto-discovered by the existing recipes-loader
(`bun src/index.ts query --recipes-json` lists 15 recipes total — was 12).
Bundled recipes:
- `untested-and-dead.{sql,md}` — the killer recipe. Exported functions
with no callers AND zero coverage. Recipe MD documents the v1
name-collision lossiness (D11) and three concrete narrowing patterns
agents can apply: scope by file_path prefix, exclude default exports
(Next.js entry points), restrict to public visibility.
- `files-by-coverage.{sql,md}` — files ranked ascending by statement
coverage. Replaces the deferred `file_coverage` rollup table (D2):
GROUP BY on the symbol-level table is index-bounded by
`idx_coverage_file_name`. NULL coverage_pct (zero-statement files)
sorted last to avoid drowning out actual zero-coverage files.
- `worst-covered-exports.{sql,md}` — top-20 worst-covered exported
functions (LIMIT in the SQL, not configurable in v1 — config-driven
LIMIT defers until a real consumer asks).
Each recipe ships a frontmatter `actions` block (per PR #26) so agents
get a per-row follow-up hint in `--json` output.
Fixture data:
- fixtures/minimal/coverage/coverage-final.json (Istanbul) and
fixtures/minimal/coverage/lcov.info (LCOV) — equivalent partial
coverage shape covering 10 fixture symbols across 6 files. Both
formats use project-relative paths so they work without sed.
Golden runner extension (scripts/query-golden.ts + schema.ts):
- New `setup` array in scenarios.json runs one-time setup steps after
`cm.index()` and before scenarios. Today: `{kind: "ingest-coverage",
path: "..."}` — engine auto-detects format by extension.
- Schema is backward-compatible: parser accepts either the legacy
flat array OR the new `{setup?, scenarios}` object via z.union.
- Setup dispatch reuses the engine functions directly (ingestIstanbul,
ingestLcov) — same write path the CLI verb uses.
5 new golden snapshots prove the surface end-to-end on the fixture
corpus:
- coverage-rows-after-ingest.json: raw `coverage` table contents
- untested-and-dead.json: 6 dead+untested symbols (incl. legacyClient
+ epochMs, both @deprecated)
- files-by-coverage.json: 6 files ranked from 0% (api/client.ts) to
100% (ProductCard, usePermissions)
- worst-covered-exports.json: top exported functions with coverage_pct
- (LCOV cross-format equivalence proven by engine unit test, not
duplicated in goldens — single setup step per scenarios.json keeps
the runner simple)
fixtures/minimal/README.md: new "What's exercised" row + Use section
with `bun src/index.ts ingest-coverage` worked examples.
All 24 golden scenarios pass. Engine unit tests still 15 pass.
* docs(coverage): architecture + glossary + agent rule/skill lockstep + changeset (Tracer 5)
Closes the plan execution per docs/README.md Rule 10 (agent rule + skill
lockstep) and Rule 9 (new domain nouns → glossary same-PR).
docs/architecture.md:
- application/ list: add coverage-engine.ts (upsertCoverageRows core +
ingestIstanbul / ingestLcov parsers).
- New § coverage table under Schema with the natural-key PK rationale
cross-referencing query_baselines as the dropAll() exclusion precedent.
docs/glossary.md:
- New `coverage` table entry: PK shape, NULL semantics, orphan cleanup.
- New `codemap ingest-coverage` / Istanbul JSON / LCOV / static coverage
ingestion entry: format auto-detection, innermost-wins projection
rationale (D7), three bundled recipes, no-half-way principle.
.agents/rules/codemap.md + templates/agents/rules/codemap.md (lockstep):
- Index intro mentions coverage as part of indexed structure.
- New CLI table row for `ingest-coverage`.
- Four new trigger-pattern rows: "Is symbol X tested?" → coverage table;
"What's structurally dead AND untested?" → --recipe untested-and-dead;
"Rank files by test coverage" → --recipe files-by-coverage;
"Worst-covered exported functions" → --recipe worst-covered-exports.
- Two new quick-reference query rows: symbol coverage + bundled recipe.
- Drift between .agents/ and templates/agents/ stays CLI-prefix-only
(`bun src/index.ts` vs `codemap`).
.agents/skills/codemap/SKILL.md + templates/agents/skills/codemap/SKILL.md
(lockstep): new `coverage` table block with full column schema + the
three meta keys.
docs/research/fallow.md:
- C.11 row marked Shipped (link to plan PR + this commit).
- Open question "column on symbols vs separate table?" resolved to
"separate table" with D1 + D6 cross-reference.
docs/roadmap.md: drop the "Static coverage ingestion" backlog row
(per Rule 2 — when a backlog item ships, it leaves the roadmap).
docs/plans/coverage-ingestion.md: deleted per Rule 3 (plan files have
the lifetime of the work; absorbed into architecture / glossary / agent
rule on ship).
.changeset/coverage-ingestion.md: minor changeset (per .agents/lessons.md
"changesets bump policy" — new tables + SCHEMA_VERSION bump = minor).
All 24 golden scenarios pass (5 new — coverage-rows-after-ingest,
untested-and-dead, files-by-coverage, worst-covered-exports — plus the
existing 19). 737 tests across 43 files, 0 fail.
* fix(db): drop semicolon inside `--` line comment in coverage index DDL
CI Build (Node, better-sqlite3) failed with "RangeError: The supplied SQL
string contains no statements" on `node dist/index.mjs --full`. Root
cause is the .agents/lessons.md lesson on naive `;` splitting: the new
coverage index comment contained `symbols.{file_path,name,line_start};`,
and runSql() splits multi-statement strings on `;` for better-sqlite3
(one statement per prepare). The trailing semicolon inside the `--`
comment created an empty fragment that better-sqlite3 rejects.
Reworded the comment to use parentheses + a period — same intent, no
semicolon. Verified locally via `bun run build && node dist/index.mjs
--full` against /tmp project: full index now completes on Node.
* fix(coverage): apply CodeRabbit review on PR #57 — 3 valid threads
All three CodeRabbit comments fact-check as ✅ Correct. Apply each:
1. **architecture.md:25 — broken anchor** `#static-coverage-ingestion`
doesn't exist (the section heading is `### coverage — Statement
coverage…`, not "Static coverage ingestion"). Fix: drop the
parenthetical anchor link, replace with "schema in § Schema → coverage"
pointing at the existing § Schema anchor that does exist.
2. **fixtures/minimal/README.md:24 — overstated deprecation focus.**
Said "exercise the join axis against `@deprecated` symbols" but only
1 of 6 golden rows is `@deprecated` (legacyClient); the others
(`FormatPrice`, `run`, `_epochSeconds`, `nanoseconds`, `_hiResEpoch`)
carry `@internal`/`@alpha`/`@private`/no tag. Reword to "across
exported functions of every visibility tag" — accurate and
tag-neutral.
3. **untested-and-dead.md:21 — SQL precedence bug in narrowing pattern.**
`AND s.visibility IS NULL OR s.visibility = 'public'` parses as
`(... AND s.visibility IS NULL) OR (s.visibility = 'public')` per
SQL precedence (AND binds tighter than OR). Every row with
`visibility = 'public'` would bypass every other WHERE predicate —
the agent following this pattern would get a much larger result
set than intended. Fix: wrap in parentheses + explicit comment
that the parens are load-bearing so future edits don't drop them.
files-hashes.json refreshed for the README touch.
* fix(docs): bump architecture.md schema version mention 5 → 6
CodeRabbit outside-diff comment on PR #57 (architecture.md:182) — caught
that I bumped SCHEMA_VERSION 5 → 6 in db.ts (Tracer 1) but left the
human-readable callout in architecture.md unchanged. Now in sync.
Per `docs/README.md` Rule 6, "schema version" is explicitly listed as a
decision value (not inventory) so the hardcoded number is fine — it
just needs to track the constant.
* docs(lessons): always construct gh body args via temp file, never heredoc
PR #57's body shipped with literal backslash-backtick artifacts everywhere
because the heredoc-into-`gh pr create --body` path shell-escaped every
backtick. Add it to .agents/lessons.md alongside the existing template-
literal backtick lesson — same root cause family (backticks + nested
quoting), different surface (gh CLI vs TS template strings).
Pattern: Write body → temp file → `gh pr <verb> --body-file <path>` →
delete. One extra tool call, zero rendering surprises.
1 parent e100d2c commit b5679a6
36 files changed
Lines changed: 2225 additions & 324 deletions
File tree
- .agents
- rules
- skills/codemap
- .changeset
- docs
- plans
- research
- fixtures
- golden
- minimal
- minimal
- coverage
- scripts
- query-golden
- src
- application
- cli
- templates
- agents
- rules
- skills/codemap
- recipes
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
12 | 12 | | |
13 | 13 | | |
14 | 14 | | |
| 15 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
6 | 6 | | |
7 | 7 | | |
8 | 8 | | |
9 | | - | |
| 9 | + | |
10 | 10 | | |
11 | 11 | | |
12 | 12 | | |
| |||
32 | 32 | | |
33 | 33 | | |
34 | 34 | | |
| 35 | + | |
35 | 36 | | |
36 | 37 | | |
37 | 38 | | |
| |||
109 | 110 | | |
110 | 111 | | |
111 | 112 | | |
| 113 | + | |
| 114 | + | |
| 115 | + | |
| 116 | + | |
112 | 117 | | |
113 | 118 | | |
114 | 119 | | |
| |||
156 | 161 | | |
157 | 162 | | |
158 | 163 | | |
| 164 | + | |
| 165 | + | |
159 | 166 | | |
160 | 167 | | |
161 | 168 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
267 | 267 | | |
268 | 268 | | |
269 | 269 | | |
| 270 | + | |
| 271 | + | |
| 272 | + | |
| 273 | + | |
| 274 | + | |
| 275 | + | |
| 276 | + | |
| 277 | + | |
| 278 | + | |
| 279 | + | |
| 280 | + | |
| 281 | + | |
| 282 | + | |
| 283 | + | |
| 284 | + | |
270 | 285 | | |
271 | 286 | | |
272 | 287 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
| 1 | + | |
| 2 | + | |
| 3 | + | |
| 4 | + | |
| 5 | + | |
| 6 | + | |
| 7 | + | |
| 8 | + | |
| 9 | + | |
| 10 | + | |
| 11 | + | |
| 12 | + | |
| 13 | + | |
| 14 | + | |
| 15 | + | |
| 16 | + | |
| 17 | + | |
| 18 | + | |
| 19 | + | |
| 20 | + | |
| 21 | + | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
| 26 | + | |
| 27 | + | |
| 28 | + | |
| 29 | + | |
| 30 | + | |
| 31 | + | |
| 32 | + | |
| 33 | + | |
| 34 | + | |
| 35 | + | |
| 36 | + | |
0 commit comments