Skip to content

Commit 4dd62ad

Browse files
MrFlounderclaude
andcommitted
feat(court): enhance reviewer and judge prompts to catch silent data corruption patterns
Adds 4 high-risk pattern checks to both reviewer agent prompts and the judge's own investigation phase, derived from the PR #3232 incident where a clean-looking diff caused mass data corruption through called-but-not-changed code. Reviewers now actively check for: 1. Called-but-not-changed code contracts (trace into called functions) 2. Mutations on read paths (writes triggered by GET endpoints) 3. Blast radius from null/default initial state (mass rewrites on first access) 4. Data contract completeness at merge/spread boundaries (missing fields) The judge also proactively investigates these patterns even if neither reviewer flagged them. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent b303372 commit 4dd62ad

1 file changed

Lines changed: 28 additions & 2 deletions

File tree

src/crabcode

Lines changed: 28 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -8107,14 +8107,34 @@ Spawn two reviewer agents to analyze the PR independently:
81078107
```
81088108
Use Task tool:
81098109
subagent_type: "general-purpose"
8110-
prompt: "You are Reviewer A. Review this PR for bugs, security issues, and code quality. Be thorough but avoid false positives. Structure findings as Critical/Warning/Suggestion with file:line references. Here is the context: [include PR diff]"
8110+
prompt: "You are Reviewer A. Review this PR for bugs, security issues, and code quality. Be thorough but avoid false positives. Structure findings as Critical/Warning/Suggestion with file:line references.
8111+
8112+
Beyond standard review, pay special attention to these high-risk patterns:
8113+
8114+
1. **Called-but-not-changed code**: When the diff introduces a new call to existing code (especially data mutation), use the Read tool to trace INTO the called function. Verify its return value and side effects match what the caller assumes. Do not trust functions you haven't read.
8115+
8116+
2. **Mutations on read paths**: Flag any write operation (DB update, cache mutation, queue publish) triggered by a GET/read endpoint. Fire-and-forget patterns that silently mutate data are high risk — the failure mode is silent data corruption with no error signal.
8117+
8118+
3. **Blast radius from initial state**: When new code processes existing records conditionally (e.g. 'refresh if stale'), check the initial state of existing data. If a column starts as null/default for all existing rows, the new path hits EVERY record on first access — that is a mass data rewrite disguised as normal traffic.
8119+
8120+
4. **Data contract completeness at merge boundaries**: When code spreads/merges one data source onto another (e.g. {...existing, ...fresh}), verify the fresh source returns ALL expected fields. Missing fields silently zero out or null existing data.
8121+
8122+
Here is the context: [include PR diff]"
81118123
```
81128124
81138125
**Reviewer B (Codex):**
81148126
```
81158127
Use Task tool:
81168128
subagent_type: "Bash"
8117-
prompt: "Run: codex --print -p 'You are Reviewer B. Review this code for bugs, security issues, and quality problems. Structure as Critical/Warning/Suggestion with file:line. Diff: [include relevant sections]'"
8129+
prompt: "Run: codex --print -p 'You are Reviewer B. Review this code for bugs, security issues, and quality problems. Structure as Critical/Warning/Suggestion with file:line.
8130+
8131+
Beyond standard review, check for these high-risk patterns:
8132+
1. Called-but-not-changed code: If the diff calls existing functions (especially data mutation), verify those functions return/do what the caller assumes.
8133+
2. Mutations on read paths: Flag any DB write or cache mutation triggered by a GET/read endpoint. Fire-and-forget data rewrites are high risk.
8134+
3. Blast radius from initial state: If new code processes records conditionally (refresh if stale), check if null/default initial state means ALL existing records get hit at once.
8135+
4. Data contract completeness: When code spreads/merges data sources ({...existing, ...fresh}), verify the fresh source returns ALL expected fields. Missing fields silently zero out existing data.
8136+
8137+
Diff: [include relevant sections]'"
81188138
```
81198139
81208140
Wait for BOTH reviewers to complete before proceeding.
@@ -8135,6 +8155,12 @@ For EACH finding from either reviewer:
81358155
3. **Check context**: Does surrounding code explain/mitigate it?
81368156
4. **When reviewers disagree**: Investigate deeper, examine edge cases
81378157
8158+
**Judge's Own Investigation** (do this even if neither reviewer flagged it):
8159+
- For any new call path to existing code that mutates data, Read the called function and verify its contract
8160+
- If you see a write/update triggered by a read endpoint, flag it
8161+
- If you see conditional processing of existing records (e.g. "refresh if stale"), check the initial state — does null/default mean all records get hit at once?
8162+
- If you see object spread/merge of data sources, verify field completeness of the source
8163+
81388164
**Critical Rule**: Do NOT include any finding you haven't personally verified in the code.
81398165
81408166
### Phase 4: Deliberate

0 commit comments

Comments
 (0)