feat(court): enhance reviewer and judge prompts to catch silent data corruption patterns
Adds 4 high-risk pattern checks to both reviewer agent prompts and the
judge's own investigation phase, derived from the PR #3232 incident where
a clean-looking diff caused mass data corruption through called-but-not-changed code.
Reviewers now actively check for:
1. Called-but-not-changed code contracts (trace into called functions)
2. Mutations on read paths (writes triggered by GET endpoints)
3. Blast radius from null/default initial state (mass rewrites on first access)
4. Data contract completeness at merge/spread boundaries (missing fields)
The judge also proactively investigates these patterns even if neither
reviewer flagged them.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
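As a rough illustration of pattern 4 from the commit message, the sketch below shows how a merge can silently null out a field. It uses Python dict unpacking as a stand-in for the `{...existing, ...fresh}` object spread named in the prompts. The record shapes and the helper names (`naive_merge`, `missing_fields`, `guarded_merge`) are hypothetical and do not come from the PR.

```python
from typing import Any

# Hypothetical stored record; not taken from the PR.
existing = {"id": 1, "name": "Ada", "email": "ada@example.com"}
# Hypothetical upstream response that does not know the email and returns None.
fresh = {"id": 1, "name": "Ada L.", "email": None}


def naive_merge(existing: dict[str, Any], fresh: dict[str, Any]) -> dict[str, Any]:
    """Every key present in `fresh` wins, even when its value is None."""
    return {**existing, **fresh}


def missing_fields(expected: list[str], fresh: dict[str, Any]) -> list[str]:
    """Expected fields that the fresh source omitted or returned as None."""
    return [key for key in expected if fresh.get(key) is None]


def guarded_merge(existing: dict[str, Any], fresh: dict[str, Any]) -> dict[str, Any]:
    """Only non-None fresh values overwrite existing ones."""
    return {**existing, **{k: v for k, v in fresh.items() if v is not None}}


clobbered = naive_merge(existing, fresh)
preserved = guarded_merge(existing, fresh)
gaps = missing_fields(["id", "name", "email"], fresh)
print(clobbered["email"], preserved["email"], gaps)
```

The guarded merge is one possible mitigation, not necessarily the right one: it also makes it impossible to clear a field on purpose, so a reviewer may prefer to fail loudly when `missing_fields` returns anything.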
src/crabcode
Lines changed: 28 additions & 2 deletions
@@ -8107,14 +8107,34 @@ Spawn two reviewer agents to analyze the PR independently:
 ```
 Use Task tool:
 subagent_type: "general-purpose"
-prompt: "You are Reviewer A. Review this PR for bugs, security issues, and code quality. Be thorough but avoid false positives. Structure findings as Critical/Warning/Suggestion with file:line references. Here is the context: [include PR diff]"
+prompt: "You are Reviewer A. Review this PR for bugs, security issues, and code quality. Be thorough but avoid false positives. Structure findings as Critical/Warning/Suggestion with file:line references.
+
+Beyond standard review, pay special attention to these high-risk patterns:
+
+1. **Called-but-not-changed code**: When the diff introduces a new call to existing code (especially data mutation), use the Read tool to trace INTO the called function. Verify its return value and side effects match what the caller assumes. Do not trust functions you haven't read.
+
+2. **Mutations on read paths**: Flag any write operation (DB update, cache mutation, queue publish) triggered by a GET/read endpoint. Fire-and-forget patterns that silently mutate data are high risk — the failure mode is silent data corruption with no error signal.
+
+3. **Blast radius from initial state**: When new code processes existing records conditionally (e.g. 'refresh if stale'), check the initial state of existing data. If a column starts as null/default for all existing rows, the new path hits EVERY record on first access — that is a mass data rewrite disguised as normal traffic.
+
+4. **Data contract completeness at merge boundaries**: When code spreads/merges one data source onto another (e.g. {...existing, ...fresh}), verify the fresh source returns ALL expected fields. Missing fields silently zero out or null existing data.
+
+Here is the context: [include PR diff]"
 ```
 
 **Reviewer B (Codex):**
 ```
 Use Task tool:
 subagent_type: "Bash"
-prompt: "Run: codex --print -p 'You are Reviewer B. Review this code for bugs, security issues, and quality problems. Structure as Critical/Warning/Suggestion with file:line. Diff: [include relevant sections]'"
+prompt: "Run: codex --print -p 'You are Reviewer B. Review this code for bugs, security issues, and quality problems. Structure as Critical/Warning/Suggestion with file:line.
+
+Beyond standard review, check for these high-risk patterns:
+1. Called-but-not-changed code: If the diff calls existing functions (especially data mutation), verify those functions return/do what the caller assumes.
+2. Mutations on read paths: Flag any DB write or cache mutation triggered by a GET/read endpoint. Fire-and-forget data rewrites are high risk.
+3. Blast radius from initial state: If new code processes records conditionally (refresh if stale), check if null/default initial state means ALL existing records get hit at once.
+4. Data contract completeness: When code spreads/merges data sources ({...existing, ...fresh}), verify the fresh source returns ALL expected fields. Missing fields silently zero out existing data.
+
+Diff: [include relevant sections]'"
 ```
 
 Wait for BOTH reviewers to complete before proceeding.
@@ -8135,6 +8155,12 @@ For EACH finding from either reviewer:
 3. **Check context**: Does surrounding code explain/mitigate it?
+**Judge's Own Investigation** (do this even if neither reviewer flagged it):
+- For any new call path to existing code that mutates data, Read the called function and verify its contract
+- If you see a write/update triggered by a read endpoint, flag it
+- If you see conditional processing of existing records (e.g. "refresh if stale"), check the initial state — does null/default mean all records get hit at once?
+- If you see object spread/merge of data sources, verify field completeness of the source
+
 **Critical Rule**: Do NOT include any finding you haven't personally verified in the code.
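
The "refresh if stale" check in the judge's investigation list can be made concrete with a small simulation. This is a minimal sketch under stated assumptions: the `refreshed_at` column, the one-day threshold, and the function names are all hypothetical and are not taken from the PR or from `src/crabcode`. It compares how many rows a staleness check selects when every existing row starts with a null timestamp versus when rows already carry real timestamps.

```python
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(days=1)  # hypothetical staleness threshold


def is_stale(refreshed_at, now):
    """A None timestamp (the default for rows that predate the column) counts as stale."""
    return refreshed_at is None or now - refreshed_at > STALE_AFTER


def rows_hit_on_first_access(rows, now):
    """IDs of the rows the 'refresh if stale' path would rewrite."""
    return [row["id"] for row in rows if is_stale(row["refreshed_at"], now)]


now = datetime(2024, 1, 2, tzinfo=timezone.utc)  # arbitrary fixed clock for the sketch

# Existing rows right after a hypothetical migration adds a nullable refreshed_at column.
legacy_rows = [{"id": i, "refreshed_at": None} for i in range(5)]
# The same table once rows carry real timestamps, spaced 10 hours apart.
steady_rows = [
    {"id": i, "refreshed_at": now - timedelta(hours=10 * i)} for i in range(5)
]

legacy_hits = rows_hit_on_first_access(legacy_rows, now)
steady_hits = rows_hit_on_first_access(steady_rows, now)
print(len(legacy_hits), len(steady_hits))
```

In the null-initial-state case every row is selected, which is the mass-rewrite scenario the prompt warns about. If the refresh is also triggered from a read endpoint, the rewrite happens under ordinary traffic with no explicit migration step to review.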