Commit 7f7433f

SimplyLiz and claude authored
feat: token-optimized review skill with early exit and targeted reads (#182)
Rewrites the /ckb-review and /review slash commands for minimal LLM token usage (~3-8k tokens vs ~15-30k previously):

- Early exit: score>=80 + verdict=pass → one-line approval, no source read
- CLI-first: ckb review --compact instead of MCP tool discovery
- Targeted reads: only files with warn/fail findings, not all hotspots
- No drill-down phase: CLI compact output has enough signal
- Terse output: flat issue list instead of multi-section prose
- Anti-patterns list: explicit "don't do this" for token waste

Updated in: embedded constant (setup.go), .claude/commands/review.md, ADR-001, and review advantages doc.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent c4261c8 commit 7f7433f

4 files changed

Lines changed: 111 additions & 122 deletions

.claude/commands/review.md

Lines changed: 48 additions & 69 deletions
@@ -1,98 +1,77 @@
-Run a comprehensive code review using CKB's deterministic analysis + your semantic review.
+Run a CKB-augmented code review optimized for minimal token usage.

 ## Input
 $ARGUMENTS - Optional: base branch (default: main), or "staged" for staged changes, or a PR number

-## MCP vs CLI
+## Philosophy

-CKB runs as an MCP server in this environment. MCP mode is strongly preferred for interactive review because the SCIP index stays loaded between calls — drill-down tools like `findReferences`, `analyzeImpact`, and `explainSymbol` execute instantly against the in-memory index. CLI mode reloads the index on every invocation.
+CKB already answered the structural questions (secrets? breaking? dead code? test gaps?).
+The LLM's job is ONLY what CKB can't do: semantic reasoning about correctness, design,
+and intent. Every source line you read costs tokens — read only what CKB says is risky.

-## The Three Phases
+## Phase 1: Structural scan (~1k tokens into context)

-### Phase 1: CKB structural scan (5 seconds, 0 tokens)
-
-Call the `reviewPR` MCP tool with compact mode:
-```
-reviewPR(baseBranch: "main", compact: true)
+```bash
+ckb review --base=main --format=json --compact 2>/dev/null
 ```

-This returns ~1k tokens instead of ~30k — just the verdict, non-pass checks, top 10 findings, and action items. Use `compact: false` only if you need the full raw data.
-
-If a PR number was given, get the base branch first:
+If a PR number was given:
 ```bash
 BASE=$(gh pr view $ARGUMENTS --json baseRefName -q .baseRefName)
+ckb review --base=$BASE --format=json --compact 2>/dev/null
 ```
-Then pass it: `reviewPR(baseBranch: BASE, compact: true)`

-> **If CKB is not running as an MCP server** (last resort), use the CLI instead:
-> ```bash
-> ./ckb review --base=main --format=json
-> ```
-> Note: CLI mode reloads the SCIP index on every call, so drill-down steps will be slower.
+From the output, build three lists:
+- **SKIP**: passed checks — don't touch these files or topics
+- **INVESTIGATE**: warned/failed checks — these are your review scope
+- **READ**: hotspot files + files with warn/fail findings — the only files you'll read

-From CKB's output, immediately note:
-- **Passed checks** → skip these categories. Don't waste tokens re-checking secrets, breaking changes, test coverage, etc.
-- **Warned checks** → your review targets
-- **Top hotspot files** → read these first
-- **Test gaps** → functions to evaluate
+**Early exit**: If verdict=pass and score≥80, write a one-line approval and stop. No source reading needed.

-### Phase 2: Drill down on CKB findings (0 tokens via MCP)
+## Phase 2: Targeted source reading (the only token-expensive step)

-Before reading source code, use CKB's MCP tools to investigate specific findings. These calls are instant because the SCIP index is already loaded from Phase 1.
+Do NOT read the full diff. Do NOT read every changed file.

-| CKB finding | Drill-down tool | What to check |
-|---|---|---|
-| Dead code | `findReferences(symbolId: "...")` or `searchSymbols` → `findReferences` | Does it actually have references? CKB's SCIP index can miss cross-package refs |
-| Blast radius | `analyzeImpact(symbolId: "...")` | Are the "callers" real logic or just framework registrations? |
-| Coupling gap | `explainSymbol(name: "...")` on the missing file | What does the co-change partner do? Does it actually need updates? |
-| Bug patterns | Already verified by differential analysis | Just check the specific line CKB flagged |
-| Complexity | `explainFile(path: "...")` | What functions are driving the increase? |
-| Test gaps | `getAffectedTests(baseBranch: "main")` | Which tests exist? Which functions are actually untested? |
-| Hotspots | `getHotspots(limit: 10)` | Full churn history for the flagged files |
+Read ONLY:
+1. Files that appear in INVESTIGATE findings (just the changed hunks via `git diff main...HEAD -- <file>`)
+2. New files (CKB has no history for these) — but only if <500 lines each
+3. Skip generated files, test files for existing tests, and config/CI files

-### Phase 3: Semantic review of high-risk files
+For each file you read, look for exactly:
+- Logic errors (wrong condition, off-by-one, nil deref)
+- Security issues (injection, auth bypass, secrets)
+- Design problems (wrong abstraction, leaky interface)
+- Missing edge cases the tests don't cover

-Now read the actual source — but only for:
-1. Files CKB ranked as top hotspots
-2. Files with warned findings that survived drill-down
-3. New files (CKB can't assess design quality of new code)
+Do NOT look for: style, naming, formatting, documentation, test coverage —
+CKB already checked these structurally.

-For each file, look for things CKB CANNOT detect:
-- Logic bugs (wrong conditions, off-by-one, race conditions)
-- Security issues (injection, auth bypass, data exposure)
-- Design problems (wrong abstraction, unclear naming, leaky interfaces)
-- Edge cases (nil inputs, empty collections, concurrent access)
-- Error handling quality (not just missing — wrong strategy)
-
-### Phase 4: Write the review
-
-Format:
+## Phase 3: Write the review (be terse)

 ```markdown
-## Summary
-One paragraph: what the PR does, overall assessment.
+## [APPROVE|REQUEST CHANGES|DISCUSS] — CKB score: [N]/100

-## Must Fix
-Findings that should block merge. File:line references.
+[One sentence: what the PR does]

-## Should Fix
-Issues worth addressing but not blocking.
+### Issues
+1. **[must-fix|should-fix]** `file:line` — [issue in one sentence]
+2. ...

-## CKB Analysis
-- Verdict: [pass/warn/fail], Score: [0-100]
-- [N] checks passed, [N] warned
-- Key findings: [top 3]
-- False positives identified: [any CKB findings you disproved]
-- Test gaps: [N] untested functions — [your assessment of which matter]
+### CKB passed (no review needed)
+[comma-separated list of passed checks]

-## Recommendation
-Approve / Request changes / Needs discussion
+### CKB flagged (verified above)
+[for each warn/fail finding: confirmed/false-positive + one-line reason]
 ```

-## Tips
+If no issues found: just the header line + CKB passed list. Nothing else.
+
+## Anti-patterns (token waste)

-- If CKB says "secrets: pass" — trust it, don't re-scan 100+ files
-- If CKB says "breaking: pass" — trust it, SCIP-verified API comparison
-- If CKB says "dead-code: FormatSARIF" — DON'T trust blindly, verify with `findReferences` or grep
-- CKB's hotspot scores are based on git churn history — higher score = more volatile file = review more carefully
-- CKB's complexity delta shows WHERE cognitive load increased — read those functions
+- Reading files CKB marked as pass → waste
+- Reading generated files → waste
+- Summarizing what the PR does in detail → waste (git log exists)
+- Explaining why passed checks passed → waste
+- Running MCP drill-down tools when CLI already gave enough signal → waste
+- Reading test files to "verify test quality" → waste unless CKB flagged test-gaps
+- Reading hotspot-only files with no findings → high churn ≠ needs review right now
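
To illustrate the Phase 1 triage in the new review.md (the SKIP / INVESTIGATE / READ lists plus the early exit), here is a minimal Go sketch. The JSON shape is an assumption: the field names `verdict`, `score`, `checks`, `status`, and `files` are hypothetical and may not match what `ckb review --format=json --compact` actually emits. Hotspot handling is left out, so READ here covers only files attached to warn/fail findings.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Check and Report mirror a HYPOTHETICAL shape for the compact JSON output;
// the real field names produced by the CLI may differ.
type Check struct {
	Name   string   `json:"name"`
	Status string   `json:"status"` // assumed values: "pass", "warn", "fail"
	Files  []string `json:"files"`
}

type Report struct {
	Verdict string  `json:"verdict"`
	Score   int     `json:"score"`
	Checks  []Check `json:"checks"`
}

// Plan holds the lists the skill asks the reviewer to build.
type Plan struct {
	Skip        []string // names of passed checks
	Investigate []string // names of warned or failed checks
	Read        []string // files attached to warn/fail checks, deduplicated
	EarlyExit   bool     // true when verdict is pass and score is at least 80
}

// triage partitions the checks and applies the early-exit rule.
func triage(r Report) Plan {
	p := Plan{EarlyExit: r.Verdict == "pass" && r.Score >= 80}
	seen := map[string]bool{}
	for _, c := range r.Checks {
		if c.Status == "pass" {
			p.Skip = append(p.Skip, c.Name)
			continue
		}
		p.Investigate = append(p.Investigate, c.Name)
		for _, f := range c.Files {
			if !seen[f] {
				seen[f] = true
				p.Read = append(p.Read, f)
			}
		}
	}
	return p
}

func main() {
	// Made-up sample input, not real CKB output.
	raw := []byte(`{"verdict":"warn","score":72,"checks":[
		{"name":"secrets","status":"pass"},
		{"name":"complexity","status":"warn","files":["cmd/ckb/setup.go"]}]}`)
	var r Report
	if err := json.Unmarshal(raw, &r); err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", triage(r))
}
```

When `EarlyExit` is true the reviewer would stop after the one-line approval; otherwise only the entries in `Read` are candidates for Phase 2.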

cmd/ckb/setup.go

Lines changed: 49 additions & 52 deletions
@@ -821,86 +821,83 @@ func installClaudeCodeSkills() error {
 }

 // ckbReviewSkill is the embedded /ckb-review slash command for Claude Code.
-const ckbReviewSkill = `Run a comprehensive code review using CKB's deterministic analysis + your semantic review.
+const ckbReviewSkill = `Run a CKB-augmented code review optimized for minimal token usage.

 ## Input
 $ARGUMENTS - Optional: base branch (default: main), or "staged" for staged changes, or a PR number

-## MCP vs CLI
+## Philosophy

-CKB runs as an MCP server. MCP mode is preferred because the SCIP index stays loaded between calls — drill-down tools execute instantly against the in-memory index.
+CKB already answered the structural questions (secrets? breaking? dead code? test gaps?).
+The LLM's job is ONLY what CKB can't do: semantic reasoning about correctness, design,
+and intent. Every source line you read costs tokens — read only what CKB says is risky.

-## The Three Phases
+## Phase 1: Structural scan (~1k tokens into context)

-### Phase 1: CKB structural scan (5 seconds, 0 tokens)
-
-Call the reviewPR MCP tool with compact mode:
-` + "`" + `reviewPR(baseBranch: "main", compact: true)` + "`" + `
-
-This returns ~1k tokens — verdict, non-pass checks, top 10 findings, action items.
+` + "```" + `bash
+ckb review --base=main --format=json --compact 2>/dev/null
+` + "```" + `

-If a PR number was given, get the base branch first:
+If a PR number was given:
 ` + "```" + `bash
 BASE=$(gh pr view $ARGUMENTS --json baseRefName -q .baseRefName)
+ckb review --base=$BASE --format=json --compact 2>/dev/null
 ` + "```" + `
-Then: ` + "`" + `reviewPR(baseBranch: BASE, compact: true)` + "`" + `
-
-> **If CKB is not running as an MCP server**, use CLI: ` + "`" + `ckb review --base=main --format=json` + "`" + `

-From CKB's output:
-- **Passed checks** → skip entirely (secrets clean, no breaking changes, etc.)
-- **Warned checks** → your review targets
-- **Hotspot files** → read these first
-- **Test gaps** → functions to evaluate
+From the output, build three lists:
+- **SKIP**: passed checks — don't touch these files or topics
+- **INVESTIGATE**: warned/failed checks — these are your review scope
+- **READ**: hotspot files + files with warn/fail findings — the only files you'll read

-### Phase 2: Drill down on CKB findings (0 tokens via MCP)
+**Early exit**: If verdict=pass and score>=80, write a one-line approval and stop. No source reading needed.

-Use CKB MCP tools to investigate before reading source:
+## Phase 2: Targeted source reading (the only token-expensive step)

-| Finding | Tool | Check |
-|---|---|---|
-| Dead code | findReferences or searchSymbols → findReferences | Has references SCIP missed? |
-| Blast radius | analyzeImpact | Real callers or framework wiring? |
-| Coupling gap | explainSymbol on the missing file | Does co-change partner need updates? |
-| Complexity | explainFile | Which functions drive the increase? |
-| Test gaps | getAffectedTests | Which tests exist? |
+Do NOT read the full diff. Do NOT read every changed file.

-### Phase 3: Semantic review of high-risk files
+Read ONLY:
+1. Files that appear in INVESTIGATE findings (just the changed hunks via ` + "`" + `git diff main...HEAD -- <file>` + "`" + `)
+2. New files (CKB has no history for these) — but only if <500 lines each
+3. Skip generated files, test files for existing tests, and config/CI files

-Read source only for:
-1. Top hotspot files (CKB ranked by churn)
-2. Files with findings that survived drill-down
-3. New files (CKB can't assess design quality)
+For each file you read, look for exactly:
+- Logic errors (wrong condition, off-by-one, nil deref)
+- Security issues (injection, auth bypass, secrets)
+- Design problems (wrong abstraction, leaky interface)
+- Missing edge cases the tests don't cover

-Look for: logic bugs, security issues, design problems, edge cases, error handling quality.
+Do NOT look for: style, naming, formatting, documentation, test coverage —
+CKB already checked these structurally.

-### Phase 4: Write the review
+## Phase 3: Write the review (be terse)

 ` + "```" + `markdown
-## Summary
-One paragraph: what the PR does, overall assessment.
+## [APPROVE|REQUEST CHANGES|DISCUSS] — CKB score: [N]/100

-## Must Fix
-Findings that block merge. File:line references.
+[One sentence: what the PR does]

-## Should Fix
-Issues worth addressing but not blocking.
+### Issues
+1. **[must-fix|should-fix]** ` + "`" + `file:line` + "`" + ` — [issue in one sentence]
+2. ...

-## CKB Analysis
-- Verdict: [pass/warn/fail], Score: [0-100]
-- Key check results, false positives identified
-- Test gaps: [N] untested functions
+### CKB passed (no review needed)
+[comma-separated list of passed checks]

-## Recommendation
-Approve / Request changes / Needs discussion
+### CKB flagged (verified above)
+[for each warn/fail finding: confirmed/false-positive + one-line reason]
 ` + "```" + `

-## Tips
+If no issues found: just the header line + CKB passed list. Nothing else.
+
+## Anti-patterns (token waste)

-- CKB "pass" checks: trust them (SCIP-verified, pattern-scanned)
-- CKB "dead-code": verify with findReferences before reporting
-- Hotspot scores: higher = more volatile = review more carefully
-- Complexity delta: read the specific functions CKB flagged
+- Reading files CKB marked as pass — waste
+- Reading generated files — waste
+- Summarizing what the PR does in detail — waste (git log exists)
+- Explaining why passed checks passed — waste
+- Running MCP drill-down tools when CLI already gave enough signal — waste
+- Reading test files to "verify test quality" — waste unless CKB flagged test-gaps
+- Reading hotspot-only files with no findings — high churn does not mean needs review right now
 `

 func configureVSCodeGlobal(ckbCommand string, ckbArgs []string) error {

docs/decisions/ADR-001-review-llm-integration.md

Lines changed: 1 addition & 1 deletion
@@ -58,7 +58,7 @@ A `DismissalStore` at `.ckb/review-dismissals.json` lets users dismiss specific
 - LLM integration is additive: narrative synthesis, not decision-making
 - Token efficiency: ~1.5k tokens per `--llm` call vs ~445k for a full LLM review from source
 - Self-enrichment reduces FP rate before the LLM sees findings, preventing FP amplification
-- The `/review` Claude Code skill orchestrates the full workflow: CKB → drill-down → semantic review
+- The `/review` and `/ckb-review` Claude Code skills orchestrate a token-optimized workflow: CKB structural scan → targeted source reading of flagged files only → terse review output
 - Framework symbol filtering (variables, constants, CLI wiring) works across Go, C++, Java, Python via SCIP symbol kinds

 ## Affected Modules

docs/features/review/advantages.md

Lines changed: 13 additions & 0 deletions
@@ -140,6 +140,19 @@ Interactive setup prompts: "Install /ckb-review skill? [Y/n]" (default: yes).

 The skill is embedded in the CKB binary and written to `~/.claude/commands/ckb-review.md`. It auto-updates when `ckb setup` is re-run after an update.

+### Token-Optimized Design (v8.3+)
+
+The skill is designed to minimize LLM token usage:
+
+- **Early exit**: If CKB score ≥ 80 and verdict = pass, a one-line approval is emitted — no source reading
+- **CLI-first**: Uses `ckb review --format=json --compact` instead of MCP tool discovery, which is faster and more reliable
+- **Targeted reads**: Only files with warn/fail findings are read (not all hotspots, not the full diff)
+- **Structural trust**: Passed checks (secrets, breaking, dead-code) are trusted without LLM re-verification
+- **No drill-down phase**: The previous MCP drill-down step (findReferences, analyzeImpact) is removed — CLI compact output provides enough signal to decide what to read
+- **Terse output**: Flat numbered issue list instead of multi-section prose
+
+Typical cost: ~3-8k tokens for a standard PR (down from ~15-30k with the previous skill).
+
 ---

 ## Is This Best Practice?