Commit a58d8d1

Refactor review action to multi-phase orchestrator with model routing
1 parent d242216 commit a58d8d1

13 files changed

Lines changed: 821 additions & 338 deletions

.claude/agents/compliance.md

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+---
+name: compliance
+model: claude-sonnet-4-20250514
+description: CLAUDE.md compliance specialist
+---
+
+Audit changed files against relevant CLAUDE.md guidance.
+Return only JSON findings with concrete rule references.

.claude/agents/quality.md

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+---
+name: quality
+model: claude-opus-4-5-20251101
+description: Code quality specialist for correctness and reliability
+---
+
+Find high-signal correctness, reliability, and performance issues.
+Return only JSON findings.

.claude/agents/security.md

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+---
+name: security
+model: claude-opus-4-5-20251101
+description: Security specialist for exploitable vulnerabilities
+---
+
+Find exploitable vulnerabilities in changed code with concrete attack paths.
+Return only JSON findings including exploit preconditions and trust boundary.

.claude/agents/triage.md

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+---
+name: triage
+model: claude-3-5-haiku-20241022
+description: Fast PR triage for skip/continue decisions
+---
+
+Determine whether review can be skipped safely.
+Return only JSON with `skip_review`, `reason`, and `risk_level`.
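
Reviewer sketch: one way a caller might consume the triage agent's JSON, using the three keys named in the prompt. The function name `parse_triage_output`, the accepted `risk_level` values, and the choice to continue the review when the output is malformed are assumptions for illustration, not part of this commit.

```python
import json
from typing import Any, Dict

# Hypothetical set of risk levels; the commit does not enumerate them.
ALLOWED_RISK_LEVELS = {"low", "medium", "high"}


def parse_triage_output(raw: str) -> Dict[str, Any]:
    """Parse triage JSON; fall back to 'continue review' on malformed output."""
    fallback = {
        "skip_review": False,
        "reason": "unparseable triage output",
        "risk_level": "high",
    }
    try:
        data = json.loads(raw)
    except (TypeError, ValueError):
        return fallback
    # Require a real boolean so a string like "yes" cannot skip the review.
    if not isinstance(data, dict) or not isinstance(data.get("skip_review"), bool):
        return fallback
    risk = str(data.get("risk_level", "")).strip().lower()
    if risk not in ALLOWED_RISK_LEVELS:
        return fallback
    return {
        "skip_review": data["skip_review"],
        "reason": str(data.get("reason", "")),
        "risk_level": risk,
    }
```

Failing open (review continues) seems the safer default here, since a skipped review on a risky PR costs more than a redundant one.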

.claude/agents/validator.md

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+---
+name: validator
+model: claude-sonnet-4-20250514
+description: Finding validation and deduplication specialist
+---
+
+Validate candidate findings with strict confidence and impact criteria.
+Return only JSON decisions for keep/drop.

.claude/commands/review.md

Lines changed: 7 additions & 0 deletions
@@ -40,6 +40,12 @@ To do this, follow these steps precisely:
 Agent 4: Opus security agent
 Look for security vulnerabilities in the introduced code. This includes injection, auth bypass, data exposure, unsafe deserialization, or other exploitable issues. Only look for issues that fall within the changed code.
 
+Security evidence requirements for every reported issue:
+- Include a concrete exploit or abuse path.
+- Include attacker preconditions.
+- Identify the impacted trust boundary or sensitive asset.
+- Provide an actionable mitigation.
+
 **CRITICAL: We only want HIGH SIGNAL issues.** Flag issues where:
 - The code will fail to compile or parse (syntax errors, type errors, missing imports, unresolved references)
 - The code will definitely produce wrong results regardless of inputs (clear logic errors)
@@ -52,6 +58,7 @@ To do this, follow these steps precisely:
 - Subjective suggestions or improvements
 - Security issues that depend on speculative inputs or unverified assumptions
 - Denial of Service (DoS) or rate limiting issues without concrete exploitability
+- Findings based only on diff snippets without validating surrounding repository context
 
 If you are not certain an issue is real, do not flag it. False positives erode trust and waste reviewer time.
 
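
Reviewer sketch: the four security evidence requirements added to the prompt could also be enforced mechanically on the returned JSON. The field names below (`exploit_path`, `preconditions`, `trust_boundary`, `mitigation`) and the helper `missing_security_evidence` are hypothetical; the commit only states the requirements in prose.

```python
from typing import Any, Dict, List

# Hypothetical field names, one per evidence requirement in the prompt.
REQUIRED_EVIDENCE_FIELDS = (
    "exploit_path",
    "preconditions",
    "trust_boundary",
    "mitigation",
)


def missing_security_evidence(finding: Dict[str, Any]) -> List[str]:
    """Return the evidence fields that are absent or blank in a finding."""
    missing = []
    for field in REQUIRED_EVIDENCE_FIELDS:
        value = finding.get(field)
        # Only a non-empty string counts as evidence.
        if not isinstance(value, str) or not value.strip():
            missing.append(field)
    return missing
```

A validation phase could drop any security finding for which this list is non-empty.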

README.md

Lines changed: 12 additions & 5 deletions
@@ -60,6 +60,11 @@ This action is not hardened against prompt injection attacks and should only be
 | `upload-results` | Whether to upload results as artifacts | `true` | No |
 | `exclude-directories` | Comma-separated list of directories to exclude from scanning | None | No |
 | `claude-model` | Claude [model name](https://docs.anthropic.com/en/docs/about-claude/models/overview#model-names) to use. Defaults to Opus 4.5. | `claude-opus-4-5-20251101` | No |
+| `model-triage` | Model used for triage phase (skip/continue decision). | `claude-3-5-haiku-20241022` | No |
+| `model-compliance` | Model used for CLAUDE.md compliance phase. | `claude-sonnet-4-20250514` | No |
+| `model-quality` | Model used for code quality phase. | `claude-opus-4-5-20251101` | No |
+| `model-security` | Model used for security phase. | `claude-opus-4-5-20251101` | No |
+| `model-validation` | Model used for finding validation phase. | `claude-sonnet-4-20250514` | No |
 | `claudecode-timeout` | Timeout for ClaudeCode analysis in minutes | `20` | No |
 | `run-every-commit` | Run ClaudeCode on every commit (skips cache check). Warning: May increase false positives on PRs with many commits. | `false` | No |
 | `false-positive-filtering-instructions` | Path to custom false positive filtering instructions text file | None | No |
@@ -68,6 +73,7 @@ This action is not hardened against prompt injection attacks and should only be
 | `dismiss-stale-reviews` | Dismiss previous bot reviews when posting a new review (useful for follow-up commits) | `true` | No |
 | `skip-draft-prs` | Skip code review on draft pull requests | `true` | No |
 | `require-label` | Only run review if this label is present. Leave empty to review all PRs. Add `labeled` to your workflow `pull_request` types to trigger on label addition. | None | No |
+| `max-diff-lines` | Maximum inline diff lines included as prompt anchor; repository tool reads are still required in all cases. | `5000` | No |
 
 ### Action Outputs
 
@@ -94,11 +100,12 @@ claudecode/
 
 ### Workflow
 
-1. **PR Analysis**: When a pull request is opened, Claude analyzes the diff to understand what changed
-2. **Contextual Review**: Claude examines the code changes in context, understanding the purpose and potential impacts
-3. **Finding Generation**: Issues are identified with detailed explanations, severity ratings, and remediation guidance
-4. **False Positive Filtering**: Advanced filtering removes low-impact or false positive prone findings to reduce noise
-5. **PR Comments**: Findings are posted as review comments on the specific lines of code
+1. **Triage Phase**: A fast triage pass determines if review should proceed.
+2. **Context Discovery**: Claude discovers relevant CLAUDE.md files, hotspots, and risky code paths.
+3. **Specialist Review**: Dedicated compliance, quality, and security phases run with configurable models.
+4. **Validation Phase**: Candidate findings are validated and deduplicated for high signal.
+5. **False Positive Filtering**: Additional filtering removes low-impact noise.
+6. **PR Comments**: Findings are posted as review comments on specific lines in the PR.
 
 ## Review Capabilities
 
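
Reviewer sketch: the six-step workflow in the README can be read as a short pipeline with an early exit after triage. The function `run_review` and its callable parameters are hypothetical stand-ins. The `ReviewOrchestrator` added by this commit is not shown in this excerpt and may be structured differently.

```python
from typing import Any, Callable, Dict, List

Finding = Dict[str, Any]
Context = Dict[str, Any]


def run_review(
    triage: Callable[[], Dict[str, Any]],
    discover_context: Callable[[], Context],
    specialists: List[Callable[[Context], List[Finding]]],
    validate: Callable[[List[Finding]], List[Finding]],
    filter_false_positives: Callable[[List[Finding]], List[Finding]],
) -> List[Finding]:
    """Run the phases in order; return findings ready to post as PR comments."""
    # Phase 1: a cheap triage pass can short-circuit the whole review.
    if triage().get("skip_review"):
        return []
    # Phase 2: gather CLAUDE.md files, hotspots, and risky paths once.
    context = discover_context()
    # Phase 3: each specialist (compliance, quality, security) sees the same context.
    candidates: List[Finding] = []
    for specialist in specialists:
        candidates.extend(specialist(context))
    # Phases 4 and 5: validate and deduplicate, then drop low-impact noise.
    return filter_false_positives(validate(candidates))
```

Passing the phases in as callables keeps the ordering testable without invoking any model.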

action.yml

Lines changed: 30 additions & 0 deletions
@@ -33,6 +33,31 @@ inputs:
     required: false
     default: ''
 
+  model-triage:
+    description: 'Model for triage phase'
+    required: false
+    default: 'claude-3-5-haiku-20241022'
+
+  model-compliance:
+    description: 'Model for CLAUDE.md compliance phase'
+    required: false
+    default: 'claude-sonnet-4-20250514'
+
+  model-quality:
+    description: 'Model for code quality phase'
+    required: false
+    default: 'claude-opus-4-5-20251101'
+
+  model-security:
+    description: 'Model for security phase'
+    required: false
+    default: 'claude-opus-4-5-20251101'
+
+  model-validation:
+    description: 'Model for validation phase'
+    required: false
+    default: 'claude-sonnet-4-20250514'
+
   run-every-commit:
     description: 'Run ClaudeCode on every commit (skips cache check). Warning: This may lead to more false positives on PRs with many commits as the AI analyzes the same code multiple times.'
     required: false
@@ -231,6 +256,11 @@ runs:
         CUSTOM_REVIEW_INSTRUCTIONS: ${{ inputs.custom-review-instructions }}
         CUSTOM_SECURITY_SCAN_INSTRUCTIONS: ${{ inputs.custom-security-scan-instructions }}
         CLAUDE_MODEL: ${{ inputs.claude-model }}
+        MODEL_TRIAGE: ${{ inputs.model-triage }}
+        MODEL_COMPLIANCE: ${{ inputs.model-compliance }}
+        MODEL_QUALITY: ${{ inputs.model-quality }}
+        MODEL_SECURITY: ${{ inputs.model-security }}
+        MODEL_VALIDATION: ${{ inputs.model-validation }}
         CLAUDECODE_TIMEOUT: ${{ inputs.claudecode-timeout }}
         MAX_DIFF_LINES: ${{ inputs.max-diff-lines }}
         ACTION_PATH: ${{ github.action_path }}
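
Reviewer sketch: the five `MODEL_*` environment variables exported by the action could be read into a per-phase config roughly as follows. `PhaseModels` and `load_phase_models` are hypothetical names chosen to avoid implying anything about the commit's own `ReviewModelConfig` and `get_review_model_config`, whose bodies are not shown here. The defaults are copied from the `action.yml` inputs.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Defaults copied from the action.yml inputs in this commit.
DEFAULTS = {
    "MODEL_TRIAGE": "claude-3-5-haiku-20241022",
    "MODEL_COMPLIANCE": "claude-sonnet-4-20250514",
    "MODEL_QUALITY": "claude-opus-4-5-20251101",
    "MODEL_SECURITY": "claude-opus-4-5-20251101",
    "MODEL_VALIDATION": "claude-sonnet-4-20250514",
}


@dataclass(frozen=True)
class PhaseModels:
    """Hypothetical container: one model name per review phase."""

    triage: str
    compliance: str
    quality: str
    security: str
    validation: str


def load_phase_models(env: Mapping[str, str] = os.environ) -> PhaseModels:
    """Build the per-phase config, treating unset or blank variables as defaults."""

    def pick(name: str) -> str:
        return (env.get(name) or "").strip() or DEFAULTS[name]

    return PhaseModels(
        triage=pick("MODEL_TRIAGE"),
        compliance=pick("MODEL_COMPLIANCE"),
        quality=pick("MODEL_QUALITY"),
        security=pick("MODEL_SECURITY"),
        validation=pick("MODEL_VALIDATION"),
    )
```

Falling back on blank values matters because GitHub Actions passes an empty string when an input is explicitly set to `''`.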

claudecode/__init__.py

Lines changed: 5 additions & 0 deletions
@@ -12,11 +12,16 @@
 from claudecode.github_action_audit import (
     GitHubActionClient,
     SimpleClaudeRunner,
+    get_review_model_config,
     main
 )
+from claudecode.review_orchestrator import ReviewModelConfig, ReviewOrchestrator
 
 __all__ = [
     "GitHubActionClient",
     "SimpleClaudeRunner",
+    "ReviewModelConfig",
+    "ReviewOrchestrator",
+    "get_review_model_config",
     "main"
 ]

claudecode/findings_merge.py

Lines changed: 61 additions & 0 deletions
@@ -0,0 +1,61 @@
+"""Utilities for merging and deduplicating findings from multiple phases."""
+
+from typing import Any, Dict, List, Tuple
+
+
+def _normalize_text(value: Any) -> str:
+    return str(value or "").strip().lower()
+
+
+def _finding_key(finding: Dict[str, Any]) -> Tuple[str, int, str, str]:
+    file_path = _normalize_text(finding.get("file"))
+    line = finding.get("line")
+    try:
+        line_no = int(line)
+    except (TypeError, ValueError):
+        line_no = 1
+    category = _normalize_text(finding.get("category"))
+    title = _normalize_text(finding.get("title"))
+    return file_path, line_no, category, title
+
+
+def _severity_rank(value: Any) -> int:
+    sev = _normalize_text(value).upper()
+    if sev == "HIGH":
+        return 3
+    if sev == "MEDIUM":
+        return 2
+    if sev == "LOW":
+        return 1
+    return 0
+
+
+def _confidence_value(value: Any) -> float:
+    try:
+        return float(value)
+    except (TypeError, ValueError):
+        return 0.0
+
+
+def merge_findings(findings: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
+    """Merge duplicate findings and keep the strongest candidate."""
+    merged: Dict[Tuple[str, int, str, str], Dict[str, Any]] = {}
+
+    for finding in findings:
+        if not isinstance(finding, dict):
+            continue
+
+        key = _finding_key(finding)
+        existing = merged.get(key)
+
+        if existing is None:
+            merged[key] = finding
+            continue
+
+        incoming_score = (_severity_rank(finding.get("severity")), _confidence_value(finding.get("confidence")))
+        existing_score = (_severity_rank(existing.get("severity")), _confidence_value(existing.get("confidence")))
+
+        if incoming_score > existing_score:
+            merged[key] = finding
+
+    return list(merged.values())
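
Reviewer sketch: a worked example of the merge behavior. So that it runs standalone, it uses a condensed restatement of the key and score logic from `findings_merge.py`; the helper names `_norm`, `_key`, and `_score` and the sample findings are illustrative only.

```python
from typing import Any, Dict, List, Tuple


def _norm(value: Any) -> str:
    return str(value or "").strip().lower()


def _key(f: Dict[str, Any]) -> Tuple[str, int, str, str]:
    # Same identity as the commit: (file, line, category, title), case-insensitive.
    try:
        line_no = int(f.get("line"))
    except (TypeError, ValueError):
        line_no = 1
    return _norm(f.get("file")), line_no, _norm(f.get("category")), _norm(f.get("title"))


def _score(f: Dict[str, Any]) -> Tuple[int, float]:
    # Severity is compared first; confidence only breaks ties.
    rank = {"high": 3, "medium": 2, "low": 1}.get(_norm(f.get("severity")), 0)
    try:
        conf = float(f.get("confidence"))
    except (TypeError, ValueError):
        conf = 0.0
    return rank, conf


def merge_findings(findings: List[Any]) -> List[Dict[str, Any]]:
    merged: Dict[Tuple[str, int, str, str], Dict[str, Any]] = {}
    for f in findings:
        if not isinstance(f, dict):
            continue  # non-dict entries are silently skipped
        key = _key(f)
        if key not in merged or _score(f) > _score(merged[key]):
            merged[key] = f
    return list(merged.values())


findings = [
    {"file": "app/auth.py", "line": 10, "category": "security",
     "title": "SQL injection", "severity": "MEDIUM", "confidence": 0.9},
    # Same finding after normalization, reported with higher severity.
    {"file": "App/Auth.py ", "line": "10", "category": "Security",
     "title": "sql injection", "severity": "HIGH", "confidence": 0.7},
    {"file": "app/auth.py", "line": 42, "category": "quality",
     "title": "Unclosed file", "severity": "LOW", "confidence": 0.8},
    "not a finding",
]
result = merge_findings(findings)
```

Here `result` holds two findings: the HIGH-severity duplicate replaces the MEDIUM one despite its lower confidence, because the score tuple compares severity before confidence. On an exact tie the earlier finding is kept.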
