Commit bb8bae9

feat(audit): attribution on --base added rows (#177)
* feat(audit): add findingKey helper for delta attribution (Plan 4 slice 4.1)

  Project rows to requiredColumns and JSON.stringify for stable keys aligned with computeDelta / diffRows identity; unit tests cover all v1 deltas.

* feat(audit): attribution on --base added rows (Plan 4)

  Tag ref-sourced audit deltas with introduced/inherited finding keys; share summary collapse across CLI/MCP/HTTP; retire plan + lift docs.

* harden: audit attribution docs parity and anchor fixes

  Add architecture § heading, MCP instructions row, roadmap cross-refs; correct wave doc contract paths.

* harden: subagent review fixes for audit attribution PR

  Tests for inherited/deps attribution and baseline no-attribution; consumer-surface parity (CLI help, rule, skill, MCP, README, changeset); delete completed agent-enrichment-wave plan per docs-governance.

* docs(skill): sharpen harden-pr Task spawning and finding schema

  Require parallel Task subagents per pass; add prompt template, JSON finding shape, merge/dedupe rules, and explicit anti-inline-review rule.
1 parent 16687a0 commit bb8bae9

17 files changed

Lines changed: 606 additions & 262 deletions

.agents/skills/harden-pr/SKILL.md

Lines changed: 116 additions & 11 deletions
@@ -2,7 +2,7 @@
 name: harden-pr
 description: >-
   Bring a branch to pristine, maximum production readiness without changing PR intent —
-  spawn parallel reviewer subagents, fix in-bounds findings, loop autonomously until
+  spawn parallel Task subagents (never inline review), fix in-bounds findings, loop autonomously until
   clean or pass cap, then report once. Use after a tracer-bullet commit (lite), before PR
   is done (full), or on "harden", "harden-pr", "pristine", "review until clean",
   "production-ready pass". Invoking this skill authorizes one harden commit at cycle end.
@@ -33,12 +33,12 @@ Sister skills: [`audit-pr-architecture`](../audit-pr-architecture/SKILL.md) (ext

 Otherwise: resolve anchor → run all passes → fix → verify → next pass → finish → report.

-| Phase | Behavior |
-| --------------- | ------------------------------------------------------------------------------------------------------------------------------------- |
-| **During loop** | Autonomous. Spawn reviewers in parallel, merge findings, fix in-bounds, re-run checks, advance pass counter. |
-| **After loop** | Single concise report: mode, passes run, production-bar status (met / gaps), fixes made, checks status, deferred nits (if any). |
-| **Commit** | If there are uncommitted fixes: one `harden: …` commit **without asking** — skill invocation authorizes it. If no fixes: skip commit. |
-| **Babysit** | Full mode only. One line at end of report: "For GitHub/CI, run `/babysit`." Do not ask. |
+| Phase | Behavior |
+| --------------- | ----------------------------------------------------------------------------------------------------------------------------------------------- |
+| **During loop** | Autonomous. `Task`-batch reviewers in parallel (§ Spawning subagents), merge JSON findings, fix in-bounds, re-run checks, advance pass counter. |
+| **After loop** | Single concise report: mode, passes run, production-bar status (met / gaps), fixes made, checks status, deferred nits (if any). |
+| **Commit** | If there are uncommitted fixes: one `harden: …` commit **without asking** — skill invocation authorizes it. If no fixes: skip commit. |
+| **Babysit** | Full mode only. One line at end of report: "For GitHub/CI, run `/babysit`." Do not ask. |

 ## Modes

@@ -84,11 +84,116 @@ Reviewers treat the anchor as contract. Findings that would violate it → **rep

 **Do not defer complements:** agent-surface parity (rule/skill/MCP), glossary/architecture/golden-queries contracts, script/golden tests for acceptance criteria, and cross-links named in the plan ship in the **same PR** — not "optional v2" or post-merge unless the plan **Out of scope** section explicitly excludes them.

+## Spawning subagents (non-negotiable)
+
+The parent agent **MUST NOT** perform reviewer duties inline. Every pass **starts** with a parallel `Task` batch; grep/read/diff by the parent is setup only, **not** a substitute for reviewers.
+
+| Rule | Requirement |
+| --------------- | -------------------------------------------------------------------------------------------------------------------- |
+| **Tool** | Cursor **`Task`** tool only (`subagent_type`: `generalPurpose` for most reviewers; `explore` for Structure lite) |
+| **Batching** | One parent message → **all** applicable reviewers for that pass **in parallel** (single turn, multiple `Task` calls) |
+| **Readonly** | `readonly: true` on every reviewer `Task` — reviewers report; parent fixes |
+| **Pass credit** | A pass counts only after parent merges subagent JSON and acts on actionable findings |
+
+**Anti-pattern (invalid harden):** parent reads the diff, runs tests, fixes nits, and reports — without spawning the roster below.
+
+### Finding schema (every reviewer returns this)
+
+Each reviewer returns **only** a JSON array (no prose wrapper). Parent parses arrays from all reviewers, then merges.
+
+```json
+{
+  "finding": "One-sentence claim about a gap vs production bar",
+  "severity": "blocker | major | minor | nit | info",
+  "file": "repo-relative/path or \"multiple\"",
+  "fixable_in_bounds": true
+}
+```
+
+**Severity → action**
+
+| Severity | Parent action |
+| -------------------------- | ------------------------------------------------------------------- |
+| `blocker` / `major` | Fix in pass 1; must fix or defer with plan Out of scope before done |
+| `minor` / `nit` | Fix when in touched files; pass 2+ if pass 1 was crowded |
+| `info` | Log only unless zero-cost fix in diff |
+| `fixable_in_bounds: false` | Final report deferred list — do not apply |
+
+**Merge + dedupe (parent, after each batch)**
+
+1. Concatenate all reviewer arrays.
+2. Drop `info` unless it blocks ship shape.
+3. Dedupe: same `file` + same root cause → keep highest severity, merge `finding` text.
+4. Sort actionable: `blocker` > `major` > `minor` > `nit`.
+5. If merged list is empty → pass succeeds; skip fix phase.
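
The merge steps above could be sketched in TypeScript roughly as follows. This is an illustration under stated assumptions, not the skill's implementation: `mergeFindings` and its `rootCause` parameter are hypothetical names, "same root cause" is approximated by a caller-supplied key (defaulting to identical finding text), every `info` finding is dropped without the ship-shape exception, and a merged finding is treated as fixable only if all of its duplicates were.

```typescript
type Severity = "blocker" | "major" | "minor" | "nit" | "info";

interface Finding {
  finding: string;
  severity: Severity;
  file: string;
  fixable_in_bounds: boolean;
}

// Lower rank means more severe; used for both dedupe and sorting.
const RANK: Record<Severity, number> = { blocker: 0, major: 1, minor: 2, nit: 3, info: 4 };

// Hypothetical helper. `rootCause` stands in for the parent's judgment of
// "same root cause"; by default two findings match only if their text is identical.
function mergeFindings(
  batches: Finding[][],
  rootCause: (f: Finding) => string = (f) => f.finding,
): Finding[] {
  const merged = new Map<string, Finding>();
  for (const f of batches.flat()) {              // step 1: concatenate
    if (f.severity === "info") continue;         // step 2: drop info (simplified)
    const key = `${f.file}\u0000${rootCause(f)}`;
    const prev = merged.get(key);
    if (!prev) {
      merged.set(key, { ...f });
      continue;
    }
    // step 3: keep the highest severity and merge distinct finding text
    merged.set(key, {
      ...prev,
      finding: prev.finding === f.finding ? prev.finding : `${prev.finding}; ${f.finding}`,
      severity: RANK[f.severity] < RANK[prev.severity] ? f.severity : prev.severity,
      // Assumption: fixable only if every duplicate said so.
      fixable_in_bounds: prev.fixable_in_bounds && f.fixable_in_bounds,
    });
  }
  // step 4: most severe first; an empty result means the pass is clean (step 5)
  return [...merged.values()].sort((a, b) => RANK[a.severity] - RANK[b.severity]);
}
```

An empty return value corresponds to step 5: the parent skips the fix phase for that pass.
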
+
+**Example merged queue (pass 1)**
+
+```json
+[
+  {
+    "finding": "CLI --help documents summary counts but not per-row attribution on --base JSON rows.",
+    "severity": "major",
+    "file": "src/cli/cmd-audit.ts",
+    "fixable_in_bounds": true
+  },
+  {
+    "finding": "Skill shard leaks requiredColumns when describing attribution.",
+    "severity": "major",
+    "file": "templates/agent-content/skill/10-recipes-context.md",
+    "fixable_in_bounds": true
+  },
+  {
+    "finding": "No e2e test for attribution: inherited on deprecated delta.",
+    "severity": "nit",
+    "file": "src/application/audit-worktree.test.ts",
+    "fixable_in_bounds": true
+  }
+]
+```
+
+### Reviewer prompt template (copy per `Task`)
+
+Fill `{ROLE}`, `{REPO}`, `{INTENT_ANCHOR}`, `{SCOPE}`, `{EXTRA}`; set `subagent_type` and `readonly: true`.
+
+```text
+You are the **{ROLE}** reviewer for `/harden-pr` on `{REPO}`.
+
+**Intent anchor (contract — do not suggest changes that violate):**
+{INTENT_ANCHOR}
+
+**Scope:** {SCOPE}
+(lite: slice diff files; full: `git diff --name-status origin/main...HEAD`)
+
+**Production bar:** See harden-pr skill § Production bar — optimize for {ROLE} rows.
+
+**Task:** {EXTRA}
+
+**Return ONLY** a JSON array of findings:
+[{ "finding": "...", "severity": "blocker|major|minor|nit|info", "file": "...", "fixable_in_bounds": true|false }]
+If clean: []
+
+Readonly — do not edit files.
+```
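
Filling the placeholders in the template above can be sketched with a small TypeScript helper. `fillTemplate` is a hypothetical name introduced here for illustration; the skill itself only says to fill the placeholders and does not prescribe any tooling.

```typescript
// Hypothetical helper: substitutes {NAME} placeholders in one pass and
// fails loudly when a placeholder has no value, so no reviewer prompt
// is sent with a literal "{SCOPE}" left in it.
function fillTemplate(template: string, values: Record<string, string>): string {
  return template.replace(/\{([A-Z_]+)\}/g, (_match: string, name: string) => {
    const value = values[name];
    if (value === undefined) {
      throw new Error(`Missing template value: ${name}`);
    }
    return value;
  });
}

// Example with made-up values for the first line of the template.
const firstLine = fillTemplate(
  "You are the **{ROLE}** reviewer for `/harden-pr` on `{REPO}`.",
  { ROLE: "Correctness", REPO: "example/repo" },
);
```

The pattern only matches braces that directly wrap an upper-case name, so the JSON example inside the template (`[{ "finding": ... }]`) passes through unchanged.
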
+
+**`{EXTRA}` by role**
+
+| Role | `{EXTRA}` |
+| ------------------ | ------------------------------------------------------------------------------------------------------------------ |
+| Correctness | Read changed source + tests; run affected `bun test <files>`; bugs, edge cases, missing coverage |
+| Ship-readiness | Grep inbound refs to deleted plan files; verify plan retired + lifted; changeset consumer-clean; cross-ref anchors |
+| Structure (lite) | Read `docs/architecture.md` § Layering; check diff imports for boundary violations; optional codemap queries |
+| Consumer surface | Read `consumer-surfaces` rule; parity across CLI help, MCP description, agent-content, README, changeset |
+| Structure (full) | Run `audit-pr-architecture` skill read-only; report only fixable-in-bounds items |
+| Schema / migration | `SCHEMA_VERSION`, migration paths, column contract drift |
+| Security | auth, secrets, env, user-input paths in diff |
+| Performance | hot paths, benchmarks, worker pools in diff |
+
 ## Reviewer roster

-Spawn applicable reviewers **in parallel** via subagents in **one batch per pass**. Each returns `{ finding, severity, file, fixable_in_bounds }`.
+Spawn applicable reviewers **in parallel** via **`Task`** in **one batch per pass**. Each subagent returns the finding schema above.

-### Core (always)
+### Core (always — every pass)

 1. **Correctness** — gaps vs production bar; bugs, edge cases, missing tests in changed paths
 2. **Ship-readiness** — gaps vs production bar; docs, changesets, consumer-surface leaks, error messages; **grep inbound refs → delete shipped plan file → lift to `golden-queries.md` / `architecture.md` / `roadmap.md`**; run [`verify-after-each-step`](../../rules/verify-after-each-step.md) checks on touched files
@@ -114,8 +219,8 @@ Execute **without pausing for user input** until exit condition:
 resolve intent anchor
 pass = 1
 loop:
-  spawn reviewers (parallel, one batch)
-  merge + dedupe findings
+  Task-batch all applicable reviewers (parallel, readonly)
+  parent: merge + dedupe JSON findings (§ Finding schema)
   if none actionable → goto done
   fix in-bounds (pass 1: all; passes 2+: blockers first, then in-scope nits)
   run project checks on touched files
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+---
+"@stainless-code/codemap": patch
+---
+
+On `audit --base <ref>` (CLI / MCP / HTTP), each `added` row carries `attribution: introduced | inherited` (branch-new vs pre-existing at merge base). `--summary` adds `added_introduced` / `added_inherited` per delta.
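
To make the `introduced | inherited` split concrete, here is a minimal TypeScript sketch. It assumes, based only on the commit message, that a row's identity is a JSON string of its projection onto a delta's required columns, and that an `added` row counts as `inherited` when the same key already exists at the merge base. The signature of `findingKey`, the helpers `attributeAdded` and `summarize`, and the `mergeBaseKeys` parameter are all illustrative and may not match codemap's actual code.

```typescript
type Row = Record<string, unknown>;
type Attribution = "introduced" | "inherited";

// Assumed shape: project the row onto the required columns, then stringify.
// Using the column list's order keeps the key stable regardless of the
// property order or extra properties on the row itself.
function findingKey(row: Row, requiredColumns: readonly string[]): string {
  return JSON.stringify(requiredColumns.map((col) => row[col] ?? null));
}

// Hypothetical: tag each added row by whether its key already existed at the merge base.
function attributeAdded(
  added: Row[],
  mergeBaseKeys: ReadonlySet<string>,
  requiredColumns: readonly string[],
): Array<Row & { attribution: Attribution }> {
  return added.map((row) => {
    const attribution: Attribution = mergeBaseKeys.has(findingKey(row, requiredColumns))
      ? "inherited"
      : "introduced";
    return { ...row, attribution };
  });
}

// Hypothetical: collapse tagged rows into the two --summary counters.
function summarize(rows: Array<{ attribution: Attribution }>) {
  let added_introduced = 0;
  let added_inherited = 0;
  for (const row of rows) {
    if (row.attribution === "introduced") added_introduced++;
    else added_inherited++;
  }
  return { added_introduced, added_inherited };
}
```

Under these assumptions, a CI gate could fail only on `added_introduced > 0` and leave pre-existing findings to be handled separately.
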

README.md

Lines changed: 1 addition & 1 deletion
@@ -122,7 +122,7 @@ codemap audit --json --summary --baseline base # counts-only
 codemap audit --files-baseline base-files # explicit per-delta — runs only the slots provided
 codemap audit --baseline base --files-baseline hotfix-files # mixed — auto-resolve deps + deprecated; override files
 codemap audit --baseline base --no-index # skip the auto-incremental-index prelude (frozen-DB CI)
-codemap audit --base origin/main --json # ad-hoc — archive+reindex against any committish; no --save-baseline needed
+codemap audit --base origin/main --json # ad-hoc — added[].attribution; --summary adds added_introduced/inherited
 codemap audit --base origin/main --format sarif # emit SARIF 2.1.0 directly (Code Scanning); also: --ci alias
 codemap audit --base origin/main --ci # CI shortcut: --format sarif + non-zero exit on additions
 codemap audit --base v1.0.0 --files-baseline pre-release-files # mix --base with per-delta override
