Merged
10 changes: 9 additions & 1 deletion CLAUDE.md
@@ -44,7 +44,15 @@ structure.

- Use kebab-case for all names
- Use `${CLAUDE_PLUGIN_ROOT}` for portable paths in hooks/MCP configs
- When editing plugin files (other than README.md or CLAUDE.md), bump the version in that plugin's `.claude-plugin/plugin.json`
- When editing plugin files (other than README.md or CLAUDE.md), bump the version in
that plugin's `.claude-plugin/plugin.json` following semver:
- **patch**: bug fixes, typo corrections, minor wording changes
- **minor**: new skills, commands, hooks, agents, or backward-compatible behavior changes
- **major**: breaking changes (renamed/removed skills, changed hook behavior, restructured plugin)
- Only bump once per PR branch. Before bumping, check `git diff main -- <plugin>/.claude-plugin/plugin.json`
to see if the version was already bumped. Skip if it was, unless the accumulated
changes now warrant a higher semver level (e.g., patch already bumped but a new
skill was added — upgrade to minor)
- Use plugin-dev skills: `/plugin-dev:create-plugin`, `/plugin-dev:skill-reviewer`, `/plugin-dev:plugin-validator`
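
A minimal sketch of the bump rule above, in Python. The change-kind labels (`fix`, `feature`, `breaking`) and all function names are hypothetical and chosen for illustration; only the patch/minor/major levels come from the text. It also assumes one reading of "only bump once per PR branch": the branch version is always recomputed from the version on `main`, so a later, larger change replaces the earlier bump rather than stacking on it.

```python
LEVELS = ["patch", "minor", "major"]  # ordered from lowest to highest

# Hypothetical mapping from a kind of change to the semver level it requires.
LEVEL_FOR_CHANGE = {"fix": "patch", "feature": "minor", "breaking": "major"}


def required_level(changes):
    """Return the highest semver level demanded by the accumulated changes."""
    return max((LEVEL_FOR_CHANGE[c] for c in changes), key=LEVELS.index)


def bump(version, level):
    """Apply a single semver bump to a 'MAJOR.MINOR.PATCH' string."""
    major, minor, patch = (int(part) for part in version.split("."))
    if level == "major":
        return f"{major + 1}.0.0"
    if level == "minor":
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"


def next_version(base_version, changes):
    """Version the branch should carry, computed from the version on main.

    Starting from the base keeps the bump to one per branch: adding a new
    skill after a patch bump yields a minor bump of the base version.
    """
    return bump(base_version, required_level(changes))
```

Under these assumptions, `next_version("1.3.0", ["fix"])` gives `"1.3.1"`, and once a new skill lands on the same branch, `next_version("1.3.0", ["fix", "feature"])` gives `"1.4.0"`.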

## Documentation
2 changes: 1 addition & 1 deletion pr-review-toolkit/.claude-plugin/plugin.json
@@ -1,6 +1,6 @@
{
"name": "pr-review-toolkit",
"version": "1.3.0",
"version": "1.4.0",
"description": "Comprehensive PR review using parallel workflow agents",
"author": {
"name": "cblecker",
8 changes: 4 additions & 4 deletions pr-review-toolkit/README.md
@@ -16,10 +16,10 @@ upstream agents + command architecture.
| Agent | When it runs | What it does |
|-------|-------------|--------------|
| code-reviewer | Always | Reviews code for bugs, style, and guideline adherence (runs on Opus) |
| silent-failure-hunter | Code files changed | Identifies silent failures and inadequate error handling |
| pr-test-analyzer | Code files changed | Analyzes test coverage completeness |
| comment-analyzer | Docs changed or >= 3 files | Checks comment accuracy and maintainability |
| type-design-analyzer | Typed-language files changed | Evaluates type design and invariant quality |
| silent-failure-hunter | Changes touch error handling, try/catch, or fallback logic | Identifies silent failures and inadequate error handling |
| pr-test-analyzer | Functional code that should have corresponding tests | Analyzes test coverage completeness |
| comment-analyzer | Changes add or modify comments, docstrings, or docs | Checks comment accuracy and maintainability |
| type-design-analyzer | Changes introduce or modify type definitions in typed languages | Evaluates type design and invariant quality |

Agent selection is liberal: when in doubt, the agent runs. All agents execute in
parallel within a single workflow, and their lifecycle is managed automatically.
181 changes: 113 additions & 68 deletions pr-review-toolkit/skills/review-pr/SKILL.md
@@ -89,6 +89,8 @@ Wait for the workflow to complete. It returns a JSON object:
"confidence": 85,
"title": "Short title",
"description": "Detailed explanation",
"verificationStatus": "verified | unverified",
"verificationRationale": "What was checked and confirmed",
"status": "new | duplicate | partial_overlap",
"matchedThreadId": "thread-id",
"existingCoverage": "What the existing thread covers",
@@ -121,22 +123,17 @@ Wait for the workflow to complete. It returns a JSON object:
```

`line` may be absent for findings that apply to an entire file or PR.
Each finding has a `status` from the contextualization phase:

- `new` — no existing thread covers this issue
- `duplicate` — an existing thread fully covers the same concern
- `partial_overlap` — an existing thread touches the same area but our
finding adds something; `delta` describes the addition and
`adjustedSeverity`/`adjustedConfidence` rescore the incremental value

`threadVerifications` is non-empty when `hasOwnResolvedThreads` is true
(we left comments in a previous review that have since been resolved).
Each entry assesses whether the author addressed the concern.
False positives are filtered before this output — remaining findings have
`verificationStatus` of `verified` or `unverified` (verifier unavailable).
`threadVerifications` is non-empty only when `hasOwnResolvedThreads` is
true, meaning we left comments in a previous review that have since been
resolved.

## Phase 3: Present findings

The workflow returns classified findings and thread verifications.
Present them to the user via AskUserQuestion using the template below.
Present them to the user in two steps: text output first, then a
selection prompt.

### Score resolution

@@ -145,99 +142,147 @@ For each finding, use the effective severity and confidence:
- If `adjustedSeverity` is present, use it; otherwise use `severity`
- If `adjustedConfidence` is present, use it; otherwise use `confidence`
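
The fallback rule above could be sketched in Python as follows. The function name is hypothetical, and the sketch assumes that "present" means the key exists with a non-null value; the field names are the ones shown in the workflow's JSON output.

```python
def effective_scores(finding):
    """Return (severity, confidence), preferring the adjusted values when set."""
    severity = finding.get("adjustedSeverity")
    if severity is None:
        # No rescoring happened, so fall back to the original severity.
        severity = finding["severity"]
    confidence = finding.get("adjustedConfidence")
    if confidence is None:
        confidence = finding["confidence"]
    return severity, confidence
```

Checking for `None` explicitly, rather than relying on truthiness, keeps an adjusted confidence of `0` from being silently replaced by the original score.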

### Presentation template
### Step 1: Output findings as text

Build the AskUserQuestion body using this structure. Omit any section
that has no items. `[:{line}]` means include `:{line}` only when line
is present; omit the colon and line number for file-level findings.
Output findings as plain text before any selection prompt. This step
is mandatory — do not skip or compress it into AskUserQuestion. Omit
any section that has no items. `[:{line}]` means include `:{line}`
only when line is present.

```
## Review Summary
## PR Review: owner/repo#123

{reviewMeta.existingThreadCount} existing thread(s) on this PR.
{reviewMeta.newCount} new finding(s), {reviewMeta.partialOverlapCount}
partial overlap(s), {reviewMeta.duplicateCount} duplicate(s).
{for each severity in [critical, important, suggestion]}
### {Severity} Issues

---
{for each finding where status = "new" and effective severity = {severity}}

## New Findings
N. `{file}[:{line}]` -- **{title}**
{description}
{if verificationStatus = "verified"}_Verified: {verificationRationale}_{end if}

{for each finding where status = "new", grouped by effective severity}
{end for}

### Critical
### Partial Overlaps

1. **[critical/{effectiveConfidence}]** `{file}[:{line}]` — {title}
{description}
{for each finding where status = "partial_overlap"}

### Important
4. `{file}[:{line}]` -- **{title}**
Extends existing review comment: {existingCoverage}.
New insight: {delta}.
{if verificationStatus = "verified"}_Verified: {verificationRationale}_{end if}

2. **[important/{effectiveConfidence}]** `{file}[:{line}]` — {title}
{description}
{if any findings have status = "duplicate"}
_N findings omitted as duplicates of existing review threads._
{end if}

### Suggestions
### Strengths

3. **[suggestion/{effectiveConfidence}]** `{file}[:{line}]` — {title}
{description}
{for each positiveObservation}

---
- {observation}

## Partial Overlaps
### Previous Review Status

{for each finding where status = "partial_overlap"}
{for each threadVerification, only if threadVerifications is non-empty}

4. **[{effectiveSeverity}/{effectiveConfidence}]** `{file}[:{line}]`
— {title}
Existing comment covers: {existingCoverage}
Our addition: {delta}
{if fixed + adequate}Resolved{else if fixed + inadequate}Fix incomplete{else if fixed + newIssue}Fix introduced new issue: {newIssueDescription}{else if pushed_back + adequate}Author disagrees -- reasoning valid{else if pushed_back + inadequate}Author disagrees -- {assessment}{else if unaddressed}Still unresolved{end if} `{file}:{line}` -- {originalConcern}
{assessment}
```
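
The status labels in the "Previous Review Status" part of the template amount to a small lookup. Here is a hedged Python sketch of that mapping; the function and parameter names (`resolution`, `quality`) are hypothetical, because the field names of a `threadVerifications` entry are not shown in this section. Only the label strings and the state combinations come from the template.

```python
def verification_label(resolution, quality=None, new_issue_description="", assessment=""):
    """Map a resolved thread's outcome to the label used in the findings text."""
    if resolution == "unaddressed":
        return "Still unresolved"
    labels = {
        ("fixed", "adequate"): "Resolved",
        ("fixed", "inadequate"): "Fix incomplete",
        ("fixed", "newIssue"): f"Fix introduced new issue: {new_issue_description}",
        ("pushed_back", "adequate"): "Author disagrees -- reasoning valid",
        ("pushed_back", "inadequate"): f"Author disagrees -- {assessment}",
    }
    try:
        return labels[(resolution, quality)]
    except KeyError:
        # Fail loudly instead of printing a misleading status.
        raise ValueError(f"unknown combination: {resolution!r} + {quality!r}")
```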

---
### Step 2: Recommendation

## Duplicates (will not post unless selected)
After presenting the findings, analyze each one and recommend which to
include in the posted review. For each numbered finding, output a
one-line recommendation:

{for each finding where status = "duplicate"}
```
## Recommendations

5. `{file}[:{line}]` — {existingCoverage}
Independently flagged the same issue.
1. **Include** -- nil pointer panic is a real crash risk in the error path
2. **Skip** -- sync.Pool is a performance optimization, not a correctness
issue; low value as a review comment on this PR
```

---
Consider these factors when making recommendations:

## Previous Review Status
- **Severity and verification status** — verified critical/important
findings are strong includes; overstated suggestions are candidates
to skip
- **Signal-to-noise ratio** — a review with 3 strong findings is more
useful than one with 10 of mixed quality; fewer, higher-impact
comments make a better review
- **PR context** — a suggestion that's valid but tangential to the PR's
purpose is noise; a finding central to what the PR is doing is signal
- **Actionability** — include findings the author can act on; skip
findings that are observations without a clear next step

{for each threadVerification, only if threadVerifications is non-empty}
### Step 3: Selection prompt

- {icon} `{file}:{line}` — {originalConcern}
{assessment}
Number findings sequentially across all actionable sections (new and
partial overlaps) so each has a unique number. After the recommendations,
ask via AskUserQuestion:

Icons:
fixed + adequate: ✅ Resolved
fixed + inadequate: ⚠️ Fix incomplete
fixed + newIssue: 🔴 Fix introduced new issue: {newIssueDescription}
pushed_back + adequate: ✅ Author disagrees — reasoning valid
pushed_back + inadequate:⚠️ Author disagrees — {assessment}
unaddressed: ❌ Still unresolved
> "Which findings should I include in the review? Enter numbers
> (e.g. 1,3,5), 'all', 'none', or 'recommended' to accept my
> recommendations above."

---
Free-text response, not option buttons.
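
Because the reply is free text, it has to be interpreted before drafting starts. The following Python sketch shows one way to do that; the function name and its parameters are hypothetical, and rejecting out-of-range numbers with an error is an assumption, since the text does not say how invalid replies are handled.

```python
def parse_selection(reply, total, recommended):
    """Turn the user's free-text reply into a sorted list of finding numbers.

    total: how many numbered findings were shown (numbered 1..total).
    recommended: the numbers marked Include in the recommendations.
    """
    text = reply.strip().lower().strip("'\"")
    if text == "all":
        return list(range(1, total + 1))
    if text == "none":
        return []
    if text == "recommended":
        return sorted(set(recommended))
    numbers = set()
    # Accept "1,3,5" as well as "1, 3, 5" or "1 3 5".
    for token in text.replace(",", " ").split():
        if not token.isdigit() or not 1 <= int(token) <= total:
            raise ValueError(f"unrecognized selection: {token!r}")
        numbers.add(int(token))
    if not numbers:
        raise ValueError("empty selection")
    return sorted(numbers)
```

A caller would typically catch the `ValueError` and repeat the selection prompt rather than guess at the user's intent.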

## Positive Observations
## Phase 4: Draft and preview comments

{for each positiveObservation}
After the user selects findings, draft and preview the exact GitHub
comments before posting.

- {observation}
### Step 1: Draft each comment

For each approved finding, generate the exact text that will be posted
as a GitHub review comment. Comments should be:

- Written in first-person, natural voice (as if the user wrote them)
- No boilerplate headers, severity tags, or "AI-generated" markers

### Step 2: Present draft comments for approval

Output all drafted comments grouped by file:

```
## Draft Review Comments

### path/to/file.go

**Line 42:**
> Comment text exactly as it will be posted.

**Line 128:**
> Comment text exactly as it will be posted.

---

Review event: **REQUEST_CHANGES** / **COMMENT**
(REQUEST_CHANGES if any critical findings selected, COMMENT otherwise)
```
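
The review-event rule in the draft template can be written as a short Python function. This is a sketch: the function name is hypothetical, and it assumes "critical" refers to the effective severity (the adjusted value when one is present), which the template does not state explicitly.

```python
def review_event(selected_findings):
    """Pick the review event for the set of findings the user approved."""
    for finding in selected_findings:
        # Assumption: prefer the adjusted severity when the workflow set one.
        severity = finding.get("adjustedSeverity") or finding["severity"]
        if severity == "critical":
            return "REQUEST_CHANGES"
    return "COMMENT"
```

With this reading, a critical finding that was rescored down to a suggestion no longer forces `REQUEST_CHANGES`, and an empty selection yields `COMMENT`.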

### Final prompt
Then ask via AskUserQuestion:

> "Ready to post these comments? Reply 'post', 'edit N' to modify a
> specific comment, or 'cancel'."

Free-text response, not option buttons.

### Step 3: Handle edits

Number findings sequentially across all sections (new, partial overlaps,
duplicates) so each has a unique number. After the template, ask:
"Which findings should I include in the review? Select by number
(e.g. 1,3,5), or reply 'all new' / 'all new + overlaps' / 'none'."
If the user replies "edit N", show the current text of comment N and
let them provide a replacement. Re-present the updated comment set and
repeat the approval prompt. Loop until the user replies 'post' or
'cancel'.
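
The approve/edit loop described above might look like the following Python sketch. Everything here is illustrative: `approval_loop` and the `ask` callable are hypothetical stand-ins for the AskUserQuestion prompt, and silently re-prompting on an unrecognized reply is an assumption.

```python
def approval_loop(comments, ask):
    """Repeat the approval prompt until the user replies 'post' or 'cancel'.

    comments: list of draft comment strings, numbered from 1 when shown.
    ask: callable that shows a prompt and returns the user's free-text reply.
    Returns (decision, comments), where decision is 'post' or 'cancel'.
    """
    comments = list(comments)  # copy, so the caller's drafts stay untouched
    while True:
        listing = "\n".join(f"{i}. {text}" for i, text in enumerate(comments, 1))
        reply = ask(f"{listing}\nReady to post these comments?").strip().lower()
        if reply in ("post", "cancel"):
            return reply, comments
        parts = reply.split()
        if len(parts) == 2 and parts[0] == "edit" and parts[1].isdigit():
            index = int(parts[1]) - 1
            if 0 <= index < len(comments):
                # Show the current text and take the replacement verbatim.
                comments[index] = ask(
                    f"Current text of comment {index + 1}:\n{comments[index]}\nReplacement:"
                )
        # Any other reply falls through and the prompt is shown again.
```

Passing `ask` in as a parameter keeps the loop testable with scripted replies, for example `approval_loop(drafts, input)` in a terminal session.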

## Phase 4: Post review (after user approval)
## Phase 5: Post review

1. Create a pending review: `pull_request_review_write` with method `create`
2. For each approved finding with a file and line number:
- Call `add_comment_to_pending_review` with the file, line, and a
natural-language comment written as if by the user
- Call `add_comment_to_pending_review` with the file, line, and the
drafted comment text from Phase 4
- Use `subjectType: "LINE"` and `side: "RIGHT"`
3. Write the review body as a brief summary of the review. Include any approved
findings that lack a file or line number as inline items in the body.