Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
127 changes: 116 additions & 11 deletions .agents/skills/harden-pr/SKILL.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@
name: harden-pr
description: >-
Bring a branch to pristine, maximum production readiness without changing PR intent —
spawn parallel reviewer subagents, fix in-bounds findings, loop autonomously until
spawn parallel Task subagents (never inline review), fix in-bounds findings, loop autonomously until
clean or pass cap, then report once. Use after a tracer-bullet commit (lite), before PR
is done (full), or on "harden", "harden-pr", "pristine", "review until clean",
"production-ready pass". Invoking this skill authorizes one harden commit at cycle end.
Expand Down Expand Up @@ -33,12 +33,12 @@ Sister skills: [`audit-pr-architecture`](../audit-pr-architecture/SKILL.md) (ext

Otherwise: resolve anchor → run all passes → fix → verify → next pass → finish → report.

| Phase | Behavior |
| --------------- | ------------------------------------------------------------------------------------------------------------------------------------- |
| **During loop** | Autonomous. Spawn reviewers in parallel, merge findings, fix in-bounds, re-run checks, advance pass counter. |
| **After loop** | Single concise report: mode, passes run, production-bar status (met / gaps), fixes made, checks status, deferred nits (if any). |
| **Commit** | If there are uncommitted fixes: one `harden: …` commit **without asking** — skill invocation authorizes it. If no fixes: skip commit. |
| **Babysit** | Full mode only. One line at end of report: "For GitHub/CI, run `/babysit`." Do not ask. |
| Phase | Behavior |
| --------------- | ----------------------------------------------------------------------------------------------------------------------------------------------- |
| **During loop** | Autonomous. `Task`-batch reviewers in parallel (§ Spawning subagents), merge JSON findings, fix in-bounds, re-run checks, advance pass counter. |
| **After loop** | Single concise report: mode, passes run, production-bar status (met / gaps), fixes made, checks status, deferred nits (if any). |
| **Commit** | If there are uncommitted fixes: one `harden: …` commit **without asking** — skill invocation authorizes it. If no fixes: skip commit. |
| **Babysit** | Full mode only. One line at end of report: "For GitHub/CI, run `/babysit`." Do not ask. |

## Modes

Expand Down Expand Up @@ -84,11 +84,116 @@ Reviewers treat the anchor as contract. Findings that would violate it → **rep

**Do not defer complements:** agent-surface parity (rule/skill/MCP), glossary/architecture/golden-queries contracts, script/golden tests for acceptance criteria, and cross-links named in the plan ship in the **same PR** — not "optional v2" or post-merge unless the plan **Out of scope** section explicitly excludes them.

## Spawning subagents (non-negotiable)

The parent agent **MUST NOT** perform reviewer duties inline. Every pass **starts** with a parallel `Task` batch; grep/read/diff by the parent is setup only, **not** a substitute for reviewers.

| Rule | Requirement |
| --------------- | -------------------------------------------------------------------------------------------------------------------- |
| **Tool** | Cursor **`Task`** tool only (`subagent_type`: `generalPurpose` for most reviewers; `explore` for Structure lite) |
| **Batching** | One parent message → **all** applicable reviewers for that pass **in parallel** (single turn, multiple `Task` calls) |
| **Readonly** | `readonly: true` on every reviewer `Task` — reviewers report; parent fixes |
| **Pass credit** | A pass counts only after parent merges subagent JSON and acts on actionable findings |

**Anti-pattern (invalid harden):** parent reads the diff, runs tests, fixes nits, and reports — without spawning the roster below.

### Finding schema (every reviewer returns this)

Each reviewer returns **only** a JSON array (no prose wrapper). Parent parses arrays from all reviewers, then merges.

```json
{
"finding": "One-sentence claim about a gap vs production bar",
"severity": "blocker | major | minor | nit | info",
"file": "repo-relative/path or \"multiple\"",
"fixable_in_bounds": true
}
```

**Severity → action**

| Severity | Parent action |
| -------------------------- | ------------------------------------------------------------------- |
| `blocker` / `major` | Fix in pass 1; must fix or defer with plan Out of scope before done |
| `minor` / `nit` | Fix when in touched files; pass 2+ if pass 1 was crowded |
| `info` | Log only unless zero-cost fix in diff |
| `fixable_in_bounds: false` | Final report deferred list — do not apply |

**Merge + dedupe (parent, after each batch)**

1. Concatenate all reviewer arrays.
2. Drop `info` unless it blocks ship shape.
3. Dedupe: same `file` + same root cause → keep highest severity, merge `finding` text.
4. Sort actionable: `blocker` → `major` → `minor` → `nit`.
5. If merged list is empty → pass succeeds; skip fix phase.

**Example merged queue (pass 1)**

```json
[
{
"finding": "CLI --help documents summary counts but not per-row attribution on --base JSON rows.",
"severity": "major",
"file": "src/cli/cmd-audit.ts",
"fixable_in_bounds": true
},
{
"finding": "Skill shard leaks requiredColumns when describing attribution.",
"severity": "major",
"file": "templates/agent-content/skill/10-recipes-context.md",
"fixable_in_bounds": true
},
{
"finding": "No e2e test for attribution: inherited on deprecated delta.",
"severity": "nit",
"file": "src/application/audit-worktree.test.ts",
"fixable_in_bounds": true
}
]
```

### Reviewer prompt template (copy per `Task`)

Fill `{ROLE}`, `{REPO}`, `{INTENT_ANCHOR}`, `{SCOPE}`, `{EXTRA}`; set `subagent_type` and `readonly: true`.

```text
You are the **{ROLE}** reviewer for `/harden-pr` on `{REPO}`.

**Intent anchor (contract — do not suggest changes that violate):**
{INTENT_ANCHOR}

**Scope:** {SCOPE}
(lite: slice diff files; full: `git diff --name-status origin/main...HEAD`)

**Production bar:** See harden-pr skill § Production bar — optimize for {ROLE} rows.

**Task:** {EXTRA}

**Return ONLY** a JSON array of findings:
[{ "finding": "...", "severity": "blocker|major|minor|nit|info", "file": "...", "fixable_in_bounds": true|false }]
If clean: []

Readonly — do not edit files.
```

**`{EXTRA}` by role**

| Role | `{EXTRA}` |
| ------------------ | ------------------------------------------------------------------------------------------------------------------ |
| Correctness | Read changed source + tests; run affected `bun test <files>`; bugs, edge cases, missing coverage |
| Ship-readiness | Grep inbound refs to deleted plan files; verify plan retired + lifted; changeset consumer-clean; cross-ref anchors |
| Structure (lite) | Read `docs/architecture.md` § Layering; check diff imports for boundary violations; optional codemap queries |
| Consumer surface | Read `consumer-surfaces` rule; parity across CLI help, MCP description, agent-content, README, changeset |
| Structure (full) | Run `audit-pr-architecture` skill read-only; report only fixable-in-bounds items |
| Schema / migration | `SCHEMA_VERSION`, migration paths, column contract drift |
| Security | auth, secrets, env, user-input paths in diff |
| Performance | hot paths, benchmarks, worker pools in diff |

## Reviewer roster

Spawn applicable reviewers **in parallel** via subagents in **one batch per pass**. Each returns `{ finding, severity, file, fixable_in_bounds }`.
Spawn applicable reviewers **in parallel** via **`Task`** in **one batch per pass**. Each subagent returns the finding schema above.

### Core (always)
### Core (always — every pass)

1. **Correctness** — gaps vs production bar; bugs, edge cases, missing tests in changed paths
2. **Ship-readiness** — gaps vs production bar; docs, changesets, consumer-surface leaks, error messages; **grep inbound refs → delete shipped plan file → lift to `golden-queries.md` / `architecture.md` / `roadmap.md`**; run [`verify-after-each-step`](../../rules/verify-after-each-step.md) checks on touched files
Expand All @@ -114,8 +219,8 @@ Execute **without pausing for user input** until exit condition:
resolve intent anchor
pass = 1
loop:
spawn reviewers (parallel, one batch)
merge + dedupe findings
Task-batch all applicable reviewers (parallel, readonly)
parent: merge + dedupe JSON findings (§ Finding schema)
if none actionable → goto done
fix in-bounds (pass 1: all; passes 2+: blockers first, then in-scope nits)
run project checks on touched files
Expand Down
5 changes: 5 additions & 0 deletions .changeset/audit-delta-attribution.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
---
"@stainless-code/codemap": patch
---

On `audit --base <ref>` (CLI / MCP / HTTP), each `added` row carries `attribution: introduced | inherited` (branch-new vs pre-existing at merge base). `--summary` adds `added_introduced` / `added_inherited` per delta.
2 changes: 1 addition & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -122,7 +122,7 @@ codemap audit --json --summary --baseline base # counts-only
codemap audit --files-baseline base-files # explicit per-delta — runs only the slots provided
codemap audit --baseline base --files-baseline hotfix-files # mixed — auto-resolve deps + deprecated; override files
codemap audit --baseline base --no-index # skip the auto-incremental-index prelude (frozen-DB CI)
codemap audit --base origin/main --json # ad-hoc — archive+reindex against any committish; no --save-baseline needed
codemap audit --base origin/main --json # ad-hoc — added[].attribution; --summary adds added_introduced/inherited
codemap audit --base origin/main --format sarif # emit SARIF 2.1.0 directly (Code Scanning); also: --ci alias
codemap audit --base origin/main --ci # CI shortcut: --format sarif + non-zero exit on additions
codemap audit --base v1.0.0 --files-baseline pre-release-files # mix --base with per-delta override
Expand Down
Loading
Loading