Skip to content

Commit bf28787

Browse files
SimplyLizclaude
andauthored
fix: tighten review skill early-exit criteria and add blind spots section (#184)
Syncs local skill refinements to repo and embedded constant: - Early exit now requires score>=90 + zero warns + <100 lines + no new files (score>=80 was unsafe due to per-check caps hiding warnings) - Added "CKB's blind spots" section listing what the LLM must catch (logic errors, business logic, race conditions, etc.) - Expanded Phase 2 checklist: race conditions, incomplete refactoring, secrets beyond CKB's 26 patterns - Added anti-patterns: trusting score>=80, skipping new files Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 7f7433f commit bf28787

File tree

2 files changed

+62
-8
lines changed

2 files changed

+62
-8
lines changed

.claude/commands/review.md

Lines changed: 31 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -9,6 +9,23 @@ CKB already answered the structural questions (secrets? breaking? dead code? tes
99
The LLM's job is ONLY what CKB can't do: semantic reasoning about correctness, design,
1010
and intent. Every source line you read costs tokens — read only what CKB says is risky.
1111

12+
### CKB's blind spots (what the LLM must catch)
13+
14+
CKB runs 15 deterministic checks with AST rules, SCIP index, and git history.
15+
It is structurally sound but semantically blind:
16+
17+
- **Logic errors**: wrong conditions (`>` vs `>=`), off-by-one, incorrect algorithm
18+
- **Business logic**: domain-specific mistakes CKB has no context for
19+
- **Design fitness**: wrong abstraction, leaky interface, coupling that metrics miss
20+
- **Input validation**: missing bounds checks, nil guards outside AST patterns
21+
- **Race conditions**: concurrency issues, mutex ordering, shared state
22+
- **Incomplete refactoring**: callers missed across module boundaries
23+
- **Domain edge cases**: error paths, boundary conditions tests don't cover
24+
25+
CKB's scoring uses per-check caps (max -20) and per-rule caps (max -10), so a score
26+
of 85 can still hide multiple capped warnings. HoldTheLine only flags changed lines,
27+
so pre-existing issues interacting with new code won't surface.
28+
1229
## Phase 1: Structural scan (~1k tokens into context)
1330

1431
```bash
@@ -26,7 +43,14 @@ From the output, build three lists:
2643
- **INVESTIGATE**: warned/failed checks — these are your review scope
2744
- **READ**: hotspot files + files with warn/fail findings — the only files you'll read
2845

29-
**Early exit**: If verdict=pass and score≥80, write a one-line approval and stop. No source reading needed.
46+
**Early exit**: Skip LLM ONLY when ALL conditions are met:
47+
1. Score ≥ 90 (not 80 — per-check caps hide warnings at 80)
48+
2. Zero warn/fail checks
49+
3. Small change (< 100 lines of diff)
50+
4. No new files (CKB has no SCIP history for them)
51+
52+
If ANY condition fails, proceed to Phase 2 — CKB's structural pass does NOT mean
53+
the code is semantically correct.
3054

3155
## Phase 2: Targeted source reading (the only token-expensive step)
3256

@@ -38,10 +62,11 @@ Read ONLY:
3862
3. Skip generated files, test files for existing tests, and config/CI files
3963

4064
For each file you read, look for exactly:
41-
- Logic errors (wrong condition, off-by-one, nil deref)
42-
- Security issues (injection, auth bypass, secrets)
43-
- Design problems (wrong abstraction, leaky interface)
65+
- Logic errors (wrong condition, off-by-one, nil deref, race condition)
66+
- Security issues (injection, auth bypass, secrets CKB's 26 patterns missed)
67+
- Design problems (wrong abstraction, leaky interface, coupling metrics don't catch)
4468
- Missing edge cases the tests don't cover
69+
- Incomplete refactoring (callers that should have changed but didn't)
4570

4671
Do NOT look for: style, naming, formatting, documentation, test coverage —
4772
CKB already checked these structurally.
@@ -75,3 +100,5 @@ If no issues found: just the header line + CKB passed list. Nothing else.
75100
- Running MCP drill-down tools when CLI already gave enough signal → waste
76101
- Reading test files to "verify test quality" → waste unless CKB flagged test-gaps
77102
- Reading hotspot-only files with no findings → high churn ≠ needs review right now
103+
- Trusting score >= 80 as "safe to skip" → dangerous (per-check caps hide warnings)
104+
- Skipping new files because CKB didn't flag them → CKB has no SCIP data for new files

cmd/ckb/setup.go

Lines changed: 31 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -832,6 +832,23 @@ CKB already answered the structural questions (secrets? breaking? dead code? tes
832832
The LLM's job is ONLY what CKB can't do: semantic reasoning about correctness, design,
833833
and intent. Every source line you read costs tokens — read only what CKB says is risky.
834834
835+
### CKB's blind spots (what the LLM must catch)
836+
837+
CKB runs 15 deterministic checks with AST rules, SCIP index, and git history.
838+
It is structurally sound but semantically blind:
839+
840+
- **Logic errors**: wrong conditions (` + "`" + `>` + "`" + ` vs ` + "`" + `>=` + "`" + `), off-by-one, incorrect algorithm
841+
- **Business logic**: domain-specific mistakes CKB has no context for
842+
- **Design fitness**: wrong abstraction, leaky interface, coupling that metrics miss
843+
- **Input validation**: missing bounds checks, nil guards outside AST patterns
844+
- **Race conditions**: concurrency issues, mutex ordering, shared state
845+
- **Incomplete refactoring**: callers missed across module boundaries
846+
- **Domain edge cases**: error paths, boundary conditions tests don't cover
847+
848+
CKB's scoring uses per-check caps (max -20) and per-rule caps (max -10), so a score
849+
of 85 can still hide multiple capped warnings. HoldTheLine only flags changed lines,
850+
so pre-existing issues interacting with new code won't surface.
851+
835852
## Phase 1: Structural scan (~1k tokens into context)
836853
837854
` + "```" + `bash
@@ -849,7 +866,14 @@ From the output, build three lists:
849866
- **INVESTIGATE**: warned/failed checks — these are your review scope
850867
- **READ**: hotspot files + files with warn/fail findings — the only files you'll read
851868
852-
**Early exit**: If verdict=pass and score>=80, write a one-line approval and stop. No source reading needed.
869+
**Early exit**: Skip LLM ONLY when ALL conditions are met:
870+
1. Score >= 90 (not 80 — per-check caps hide warnings at 80)
871+
2. Zero warn/fail checks
872+
3. Small change (< 100 lines of diff)
873+
4. No new files (CKB has no SCIP history for them)
874+
875+
If ANY condition fails, proceed to Phase 2 — CKB's structural pass does NOT mean
876+
the code is semantically correct.
853877
854878
## Phase 2: Targeted source reading (the only token-expensive step)
855879
@@ -861,10 +885,11 @@ Read ONLY:
861885
3. Skip generated files, test files for existing tests, and config/CI files
862886
863887
For each file you read, look for exactly:
864-
- Logic errors (wrong condition, off-by-one, nil deref)
865-
- Security issues (injection, auth bypass, secrets)
866-
- Design problems (wrong abstraction, leaky interface)
888+
- Logic errors (wrong condition, off-by-one, nil deref, race condition)
889+
- Security issues (injection, auth bypass, secrets CKB's 26 patterns missed)
890+
- Design problems (wrong abstraction, leaky interface, coupling metrics don't catch)
867891
- Missing edge cases the tests don't cover
892+
- Incomplete refactoring (callers that should have changed but didn't)
868893
869894
Do NOT look for: style, naming, formatting, documentation, test coverage —
870895
CKB already checked these structurally.
@@ -898,6 +923,8 @@ If no issues found: just the header line + CKB passed list. Nothing else.
898923
- Running MCP drill-down tools when CLI already gave enough signal — waste
899924
- Reading test files to "verify test quality" — waste unless CKB flagged test-gaps
900925
- Reading hotspot-only files with no findings — high churn does not mean needs review right now
926+
- Trusting score >= 80 as "safe to skip" — dangerous (per-check caps hide warnings)
927+
- Skipping new files because CKB did not flag them — CKB has no SCIP data for new files
901928
`
902929

903930
func configureVSCodeGlobal(ckbCommand string, ckbArgs []string) error {

0 commit comments

Comments
 (0)