Commit 3fc823a

feat(recipes): high-crap-score with graph-estimated coverage (plan 2)
Spike locks 85/40/0% reachability tiers on fixtures/minimal; ships high-crap-score recipe (measured override when coverage ingested), golden + script tests, and high-complexity-untested cross-link.
1 parent a11242e commit 3fc823a

11 files changed

Lines changed: 323 additions & 7 deletions

.changeset/high-crap-score.md

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,5 @@
+---
+"@stainless-code/codemap": patch
+---
+
+Add `high-crap-score` recipe: CRAP ranking with measured coverage when ingested, or graph-estimated 85/40/0% tiers from test reachability otherwise.

docs/plans/agent-enrichment-wave.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -91,4 +91,4 @@ Each PR: `harden-pr full` → merge. Do not batch plans 1–4 into one PR.

 ## Current slice

-**Active:** Plan 1 shipped in [**PR #174**](https://github.com/stainless-code/codemap/pull/174) (awaiting merge) — next: Plan 2 spike **2.0** (`graph-estimated-crap.md`).
+**Active:** Plan 2 **in flight** on `feat/high-crap-score` — slices **2.0–2.3** (`graph-estimated-crap.md`); PR **#C** when complete.

docs/plans/graph-estimated-crap.md

Lines changed: 18 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -32,9 +32,21 @@ recipe high-crap-score (SQL only)
 → CRAP formula → rows + coverage_source column
 ```

-### Tracer bullet (slice 1)
+### Spike results (slice 2.0, `fixtures/minimal`)

-Recipe SQL + `.md` on fixture index without coverage ingest (tiers only). Golden row asserting `coverage_source: estimated`. Second golden with `ingest-coverage` → `measured` overrides.
+`scripts/spike-crap-reachability.sql` + `scripts/spike-crap-reachability.test.mjs` lock tier counts on function-shaped symbols:
+
+| Tier | Count | Example                                                                  |
+| ---- | ----- | ------------------------------------------------------------------------ |
+| 85%  | 1     | `labyrinth` — direct `bindings` ref from `smoke.test.ts`                 |
+| 40%  | 4     | `deeplyNested`, `relay`, … — `complexity-fixture.ts` reachable from test |
+| 0%   | 39    | `createClient`, `get`, … — not dependency-reachable from tests           |
+
+Reachability walk: `test_suites` + `*.test.*` / `*.spec.*` globs → recursive `dependencies` fan-out (value edges only).
+
+### Tracer bullet (slice 2.1)
+
+Recipe SQL + `.md` on fixture index without coverage ingest (tiers only). Golden row asserting `coverage_source: estimated`. `scripts/high-crap-score-measured.test.mjs` asserts `ingest-coverage` → `measured` overrides.

 ### Out of scope (v1)

@@ -105,10 +117,10 @@ bun test scripts/query-golden-coverage-matrix.test.mjs # after golden scenario

 ## Acceptance

-- [ ] Without coverage ingest: symbols in files imported by tests get tier 40/85; isolated files get 0%
-- [ ] With coverage ingest: `coverage_source = measured` and CRAP uses real `coverage_pct`
-- [ ] `codemap query --recipe high-crap-score --json` works; SARIF compatible via `--format sarif`
-- [ ] No new pass/fail primitive
+- [x] Without coverage ingest: symbols in files imported by tests get tier 40/85; isolated files get 0%
+- [x] With coverage ingest: `coverage_source = measured` and CRAP uses real `coverage_pct`
+- [x] `codemap query --recipe high-crap-score --json` works; SARIF compatible via `--format sarif`
+- [x] No new pass/fail primitive

 ---

Lines changed: 13 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,13 @@
+[
+  {
+    "name": "labyrinth",
+    "kind": "function",
+    "file_path": "src/lib/complexity-fixture.ts",
+    "line_start": 22,
+    "line_end": 83,
+    "complexity": 19,
+    "effective_coverage_pct": 85,
+    "coverage_source": "estimated",
+    "crap_score": 20.22
+  }
+]

fixtures/golden/scenarios.json

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -593,6 +593,12 @@
       "prompt": "High cyclomatic complexity + low coverage",
       "recipe": "high-complexity-untested"
     },
+    {
+      "id": "high-crap-score",
+      "prompt": "High CRAP score with graph-estimated coverage tiers (min_crap=15)",
+      "recipe": "high-crap-score",
+      "params": { "min_crap": 15 }
+    },
     {
       "id": "text-in-deprecated-functions",
       "prompt": "FTS TODO/FIXME/HACK in @deprecated functions with low coverage (requires fts5)",
Lines changed: 30 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,30 @@
+import { describe, expect, it } from "bun:test";
+import { join } from "node:path";
+
+/**
+ * Plan 2 slice 2.2 — measured coverage overrides graph tiers when ingest ran.
+ * Run via `bun run test:scripts` (golden runner already ingests coverage in setup).
+ */
+import { $ } from "bun";
+
+const REPO_ROOT = join(import.meta.dir, "..");
+
+describe("high-crap-score measured override", () => {
+  it("uses coverage_source measured when coverage row exists (now @ 100%)", async () => {
+    await $`bun src/index.ts ingest-coverage coverage/coverage-final.json --root fixtures/minimal`
+      .cwd(REPO_ROOT)
+      .quiet();
+    const result =
+      await $`bun src/index.ts query --recipe high-crap-score --json --params min_crap=1 --root fixtures/minimal`
+        .cwd(REPO_ROOT)
+        .quiet();
+    expect(result.exitCode).toBe(0);
+    const rows = JSON.parse(result.stdout.toString());
+    const nowRow = rows.find(
+      (r) => r.name === "now" && r.file_path === "src/utils/date.ts",
+    );
+    expect(nowRow).toBeDefined();
+    expect(nowRow.coverage_source).toBe("measured");
+    expect(nowRow.effective_coverage_pct).toBe(100);
+  });
+});
Lines changed: 71 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,71 @@
+-- Plan 2 slice 2.0 spike: graph-estimated coverage tiers on fixtures/minimal.
+-- Run: codemap query --json "$(cat scripts/spike-crap-reachability.sql)" --root fixtures/minimal
+-- Expected function/method tier counts: 85% → labyrinth (direct test ref); 40% → complexity-fixture peers (reachable); 0% → rest.
+WITH RECURSIVE
+  test_files(path) AS (
+    SELECT DISTINCT f.path
+    FROM files f
+    WHERE EXISTS (
+        SELECT 1
+        FROM test_suites ts
+        WHERE ts.file_path = f.path
+      )
+      OR f.path GLOB '*.test.ts'
+      OR f.path GLOB '*.test.tsx'
+      OR f.path GLOB '*.spec.ts'
+      OR f.path GLOB '*.spec.tsx'
+      OR f.path GLOB '*.test.js'
+      OR f.path GLOB '*.spec.js'
+      OR f.path GLOB '*.test.jsx'
+      OR f.path GLOB '*.spec.jsx'
+  ),
+  reachable_files(file_path, depth, visited) AS (
+    SELECT path, 0, char(30) || path || char(30)
+    FROM test_files
+    UNION ALL
+    SELECT
+      d.to_path,
+      rf.depth + 1,
+      rf.visited || d.to_path || char(30)
+    FROM dependencies d
+    JOIN reachable_files rf ON d.from_path = rf.file_path
+    WHERE rf.depth < 50
+      AND instr(rf.visited, char(30) || d.to_path || char(30)) = 0
+  ),
+  symbol_tiers AS (
+    SELECT
+      s.name,
+      s.file_path,
+      s.complexity,
+      CASE
+        WHEN EXISTS (
+          SELECT 1
+          FROM "references" r
+          JOIN bindings b ON b.reference_id = r.id
+          JOIN test_files tf ON tf.path = r.file_path
+          WHERE b.resolved_symbol_id = s.id
+        )
+        OR EXISTS (
+          SELECT 1
+          FROM calls c2
+          JOIN test_files tf ON tf.path = c2.file_path
+          WHERE c2.callee_symbol_id = s.id
+            AND (c2.provenance IS NULL OR c2.provenance = 'ast')
+        )
+        THEN 85
+        WHEN EXISTS (
+          SELECT 1
+          FROM reachable_files rf
+          WHERE rf.file_path = s.file_path
+        )
+        THEN 40
+        ELSE 0
+      END AS estimated_pct
+    FROM symbols s
+    WHERE s.complexity IS NOT NULL
+      AND s.kind IN ('function', 'method')
+  )
+SELECT estimated_pct, COUNT(*) AS symbol_count
+FROM symbol_tiers
+GROUP BY estimated_pct
+ORDER BY estimated_pct DESC
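
Reviewer note: the tier logic in the spike query can be read as a file-level graph walk followed by a three-way classification. The sketch below restates that idea in plain JavaScript under stated assumptions: in-memory maps stand in for the `dependencies` table and the direct-reference checks, and every function name, field name, and path other than `labyrinth`, `relay`, `createClient`, and `complexity-fixture.ts` is hypothetical. It is not the recipe's implementation.

```javascript
// Sketch only: plain arrays/maps stand in for the index tables, and every
// function and field name here is hypothetical.

// Breadth-first walk over file-level dependency edges, starting at test files.
// A shared `seen` set guards against cycles (the SQL tracks a per-path
// `visited` string instead); `maxDepth` mirrors the `rf.depth < 50` cap.
function reachableFiles(testFiles, dependencies, maxDepth = 50) {
  const seen = new Set(testFiles);
  let frontier = [...testFiles];
  for (let depth = 0; depth < maxDepth && frontier.length > 0; depth++) {
    const next = [];
    for (const from of frontier) {
      for (const to of dependencies.get(from) ?? []) {
        if (!seen.has(to)) {
          seen.add(to);
          next.push(to);
        }
      }
    }
    frontier = next;
  }
  return seen;
}

// First matching rule wins, as in the SQL CASE expression.
function estimateTier(symbol, reachable, directlyReferencedIds) {
  if (directlyReferencedIds.has(symbol.id)) return 85;
  if (reachable.has(symbol.filePath)) return 40;
  return 0;
}

// Toy graph: paths and ids are illustrative, not read from fixtures/minimal.
const dependencies = new Map([
  ["tests/smoke.test.ts", ["src/lib/complexity-fixture.ts"]],
  // The edge back to the test file forms a cycle on purpose.
  ["src/lib/complexity-fixture.ts", ["src/lib/helper.ts", "tests/smoke.test.ts"]],
]);
const reachable = reachableFiles(["tests/smoke.test.ts"], dependencies);
const directlyReferencedIds = new Set([1]);

const symbols = [
  { id: 1, name: "labyrinth", filePath: "src/lib/complexity-fixture.ts" },
  { id: 2, name: "relay", filePath: "src/lib/complexity-fixture.ts" },
  { id: 3, name: "createClient", filePath: "src/client.ts" },
];
for (const s of symbols) {
  console.log(s.name, estimateTier(s, reachable, directlyReferencedIds));
}
```

On this toy graph the three symbols land in the 85, 40, and 0 tiers respectively. A shared visited set and the SQL's per-path guard should yield the same set of reachable files as long as the depth cap is not hit.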
Lines changed: 32 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,32 @@
+import { describe, expect, it } from "bun:test";
+/**
+ * Plan 2 slice 2.0 — locks reachability tier counts on fixtures/minimal.
+ * Run via `bun run test:scripts`.
+ */
+import { readFileSync } from "node:fs";
+import { join } from "node:path";
+
+import { $ } from "bun";
+
+const REPO_ROOT = join(import.meta.dir, "..");
+const SPIKE_SQL = readFileSync(
+  join(REPO_ROOT, "scripts/spike-crap-reachability.sql"),
+  "utf-8",
+);
+
+describe("spike-crap-reachability (fixtures/minimal)", () => {
+  it("assigns 85/40/0% tiers to 1/4/39 function-shaped symbols", async () => {
+    const result =
+      await $`bun src/index.ts query --json ${SPIKE_SQL} --root fixtures/minimal`
+        .cwd(REPO_ROOT)
+        .quiet();
+    expect(result.exitCode).toBe(0);
+    const rows = JSON.parse(result.stdout.toString());
+    const byTier = Object.fromEntries(
+      rows.map((r) => [r.estimated_pct, r.symbol_count]),
+    );
+    expect(byTier[85]).toBe(1);
+    expect(byTier[40]).toBe(4);
+    expect(byTier[0]).toBe(39);
+  });
+});

templates/recipes/high-complexity-untested.md

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -22,6 +22,10 @@ McCabe formula: `1 + (decision points)`. Branching nodes counted by Codemap's pa

 Each row also includes **SonarSource cognitive complexity** for the same symbol (nesting-heavy control flow scores higher than flat branch chains). The recipe **filter** still uses cyclomatic `>= 10`; use `high-cognitive-complexity` when cognitive score alone is the gate.

+## Without `ingest-coverage`
+
+`COALESCE(coverage_pct, 0)` treats missing coverage as **0%**, so every high-complexity symbol appears undertested. Prefer **`high-crap-score`** when coverage is not ingested — it uses graph-estimated tiers (85/40/0%) from test reachability instead of assuming zero coverage.
+
 ## Why the joint signal

 - High complexity alone surfaces too many false positives — a heavily-branched config-loader or visitor pattern is fine if it's well-tested.
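
Reviewer note: the difference between the `COALESCE(coverage_pct, 0)` fallback and the measured-then-estimated precedence described in this hunk can be illustrated with two small functions. This is a minimal sketch; both helper names are hypothetical, and `measuredPct` being `null` is an assumed stand-in for "no coverage row was ingested".

```javascript
// Hypothetical helpers; `measuredPct` is null when no coverage row was ingested.

// Mirrors COALESCE(coverage_pct, 0): missing coverage becomes 0%.
function coalesceToZero(measuredPct) {
  return measuredPct ?? 0;
}

// Measured coverage wins; otherwise fall back to the graph-estimated tier.
// A measured value of 0 is still "measured", hence the explicit null check.
function effectiveCoverage(measuredPct, estimatedTier) {
  return measuredPct != null
    ? { pct: measuredPct, source: "measured" }
    : { pct: estimatedTier, source: "estimated" };
}

console.log(coalesceToZero(null));
console.log(effectiveCoverage(null, 40));
console.log(effectiveCoverage(100, 40));
```

With no ingested coverage, the first helper reports 0% for every symbol, while the second keeps the tier and labels it as an estimate.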
Lines changed: 31 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,31 @@
+---
+params:
+  - name: min_crap
+    type: number
+    required: false
+    default: 30
+    description: Minimum CRAP score threshold (industry default 30)
+actions:
+  - type: review-crap-score
+    auto_fixable: false
+    description: "High CRAP (complex + undertested) — add tests or simplify before refactor. Check coverage_source: measured rows used ingested coverage; estimated rows use graph tiers only."
+---
+
+# high-crap-score
+
+Ranks symbols by **CRAP score**: `CC² × (1 - effective_coverage/100)³ + CC`, where `CC = symbols.complexity`.
+
+**Coverage precedence:** ingested `coverage` rows win (`coverage_source: measured`). Otherwise graph-estimated tiers (`coverage_source: estimated`):
+
+| Tier    | When                                                                                          |
+| ------- | --------------------------------------------------------------------------------------------- |
+| **85%** | Symbol directly referenced from a test file (`bindings`-resolved `references` or AST `calls`) |
+| **40%** | Symbol's `file_path` is dependency-reachable from any test file                               |
+| **0%**  | Otherwise                                                                                     |
+
+Estimates are **heuristics**, not execution coverage — prefer `codemap ingest-coverage` before CI gates. Composes with `high-complexity-untested` (cyclomatic + measured-only today).
+
+```bash
+codemap query --recipe high-crap-score --json
+codemap query --recipe high-crap-score --params min_crap=15 --json
+```
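
Reviewer note: the CRAP formula stated in this recipe can be checked by hand against the golden row in this commit (`complexity: 19`, `effective_coverage_pct: 85`, `crap_score: 20.22`). The sketch below is a worked calculation, not the recipe SQL; the function names are hypothetical, and both the two-decimal rounding and the `>=` comparison against `min_crap` are assumptions.

```javascript
// CRAP = CC^2 * (1 - coverage/100)^3 + CC, as stated in the recipe text.
// Function names are hypothetical; rounding to two decimals is an assumption.
function crapScore(complexity, coveragePct) {
  const uncovered = 1 - coveragePct / 100;
  return complexity ** 2 * uncovered ** 3 + complexity;
}

const round2 = (x) => Math.round(x * 100) / 100;

// The same CC = 19 symbol under each estimated tier.
console.log(round2(crapScore(19, 85))); // 20.22 (matches the golden row)
console.log(round2(crapScore(19, 40))); // 96.98
console.log(round2(crapScore(19, 0))); // 380

// Assumed threshold semantics: keep rows whose score is at least min_crap.
const meetsThreshold = (score, minCrap) => score >= minCrap;
console.log(meetsThreshold(crapScore(19, 85), 30)); // false
console.log(meetsThreshold(crapScore(19, 85), 15)); // true
```

Under these assumptions, `labyrinth` at the 85% tier would fall below the default `min_crap` of 30, which is consistent with the golden scenario passing `min_crap=15`.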
