Commit d7c3a66

bpamiri and claude authored
ci: add convergence loop with address-review + senior-advisor stages (#2540)
Closes the gap between analytical reviewers (A and B) and actual code change. Before this PR, A's review and B's critique produced analytical context for the human to act on — but the human had to translate that context into code edits manually. The loop was open.

Three architectural additions close it:

## 1. Reviewer A↔B convergence loop

Reviewer A and Reviewer B now go back and forth until they align on a recommendation:

- bot-review-a.yml gains a second trigger (issue_comment) and a new /respond-to-critique prompt for response mode. The response is a review (state=COMMENT), which triggers B's next round.
- review-the-review.md (B's prompt) gains the convergence verdict per round:
  - converged-approve — aligned on no changes needed
  - converged-changes — aligned on changes needed (triggers Stage 8)
  - (no marker) — not aligned, A responds in next round
- Inner loop cap: 10 rounds per SHA (was 3 for B). Bun-published demos show A↔B going to 100 rounds; 10 is conservative and respects "alignment is the goal."

## 2. Stage 8: Address Review (Opus, code-modifying)

New workflow bot-address-review.yml + new prompt /address-review. Fires on wheels-bot:converged-changes:<pr>:<sha> markers. Reads the consensus, applies changes to the PR's existing branch, pushes new commits. The new SHA triggers fresh Reviewer A → loop restarts.

Coding stage: Opus model, broad allowlist with test runner, mirrors propose-fix's setup. Branch-aware scope: fix/bot-* allows code+tests, docs/bot-* allows doc paths only.

Outer-loop cap: 5 implementations per PR. Combined with B's 10-round inner cap, max bot effort is bounded at 5 × 10 = 50 review rounds before human intervention.

## 3. Stage 9: Senior Advisor (Opus, deadlock resolver)

New workflow bot-advisor.yml + new prompt /advise-on-deadlock. Fires on B's :terminal marker (cap reached without convergence). Reads the full A↔B exchange, the disputed code, and canonical references; rules on each disputed point with concrete evidence; issues a tie-breaking verdict (approve or changes) that drops back into the existing convergence flow.

This is the only stage running Opus on a non-coding task. Cost is justified because the advisor's verdict overrides the deadlock — it must be right.

Without the advisor, deadlocked PRs would just sit waiting for human intervention. With it, even disagreement-prone reviews resolve automatically. The advisor sits as a sibling to address-review: both fire from B's output but on different markers (converged-changes vs :terminal).

## What still doesn't change

- WHEELS_BOT_ENABLED=false kill switch — still the active-incident safety net
- Required approving review on develop's ruleset — still the merge gate
- All bot PRs remain --draft; humans mark ready when ready
- Author-identity checks on every auto-fire if: block — still load-bearing

## Model selection

- Reviewer A (initial + respond): Sonnet (analytical)
- Reviewer B (critique + arbitrate): Sonnet (analytical)
- Address Review: Opus (modifies code)
- Senior Advisor: Opus (adjudicates code-related disputes)

Per the user's call: all coding-adjacent stages use Opus.

## Why convergence-then-implement (not implement-on-every-critique)

The earlier sketch fired address-review on every Reviewer B comment, applying findings A flagged but B might dispute. Wasted work at best, regressive at worst. This design waits for genuine alignment (or the advisor's tie-breaking verdict) before any code change.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 8ce5bc9 commit d7c3a66

8 files changed

Lines changed: 980 additions & 71 deletions

.claude/commands/address-review.md

Lines changed: 159 additions & 0 deletions
@@ -0,0 +1,159 @@
# /address-review

Implementer for the convergence loop. After Reviewer A and Reviewer B
have aligned on a "changes needed" verdict for a PR, read the consensus
and apply the changes. Push commits to the PR's existing branch — new
commits trigger a fresh Reviewer A on the new SHA, restarting the
convergence loop until reviewers converge on `approve`.

This is a *coding* stage like propose-fix — Opus, broad allowlist with
the test runner. Mirrors propose-fix's safety patterns.

## Rails

Read `.claude/commands/_shared-rails.md` first. Highlights:

- Use `gh` for GitHub state, full `git` for **the PR's existing branch
  only** (the workflow has checked you out on it).
- Run tests via `bash tools/test-local.sh` for `fix/bot-*` PRs after
  changes.
- For `docs/bot-*` PRs, restrict edits to doc paths only —
  `web/sites/guides/`, `.ai/wheels/`, `CLAUDE.md`, `CHANGELOG.md`.
- Output is **commits to the PR branch** plus **one comment on the
  PR** summarizing what you addressed.
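
The branch-aware restriction in the Rails list (and in the scope check among the steps) can be pictured as a small path filter. This is a minimal Python sketch and not part of the prompt or the workflow: the helper names `is_doc_path` and `out_of_scope` are hypothetical, and treating any branch that is neither `fix/bot-*` nor `docs/bot-*` as fully out of scope is an assumption the source does not state.

```python
from fnmatch import fnmatchcase

# Doc paths copied from the Rails list; entries ending in "/" are directories.
DOC_PATHS = ("web/sites/guides/", ".ai/wheels/", "CLAUDE.md", "CHANGELOG.md")


def is_doc_path(path: str) -> bool:
    """True if path is a listed doc file or sits under a listed doc directory."""
    for allowed in DOC_PATHS:
        if allowed.endswith("/"):
            if path.startswith(allowed):
                return True
        elif path == allowed:
            return True
    return False


def out_of_scope(head_ref: str, changed_files: list[str]) -> list[str]:
    """Return the changed files that the branch's scope does not allow."""
    if fnmatchcase(head_ref, "fix/bot-*"):
        return []  # code, tests, and CHANGELOG are all allowed
    if fnmatchcase(head_ref, "docs/bot-*"):
        return [f for f in changed_files if not is_doc_path(f)]
    # Assumption: other branch names are not covered by the prompt, so
    # this sketch treats every change on them as out of scope.
    return list(changed_files)
```

A non-empty result on a `docs/bot-*` branch corresponds to the case where the prompt says to post `address-held` and exit.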

## Args

- `<pr-number>` — the PR with converged-changes markers to address

## Steps

1. **Idempotency + outer-loop cap.** Read PR comments via
   `gh pr view <pr-number> --json comments,headRefOid,headRefName`.
   - If any comment contains
     `wheels-bot:address-review:<pr>:<sha>:` for the current head
     SHA, exit silently — already addressed at this SHA.
   - Count comments matching `wheels-bot:address-review:<pr>:` for
     ANY SHA on this PR. If count ≥ 5, post:

     ```
     ## Wheels Bot — Address Review (max iterations reached)

     Five address-review rounds have run on this PR without the
     reviewers converging on `approve`. Handing back to humans —
     either the PR's scope is larger than the bot can resolve, or
     the reviewers are deadlocked on a design call.

     <!-- wheels-bot:address-review:<pr>:<sha>:terminal -->
     ```

     and exit.
   - Otherwise: round number = (count of address-review comments) + 1.

2. **Read the consensus.**
   - Reviewer A's initial review (`wheels-bot:review-a:<pr>:<sha>:`)
   - All Reviewer A response reviews (`wheels-bot:review-a-response:`)
   - All Reviewer B comments
     (`wheels-bot:review-b:<pr>:<sha>:<round>`) — chronological order
   - The latest B comment carrying
     `wheels-bot:converged-changes:<pr>:<sha>` is the trigger; its
     body summarizes the alignment.

   The **consensus changes** = the union of:
   - A's findings B did **not** mark as false positives
   - B's missed-issues findings A did **not** successfully refute in
     a response
   - Findings both A and B explicitly agreed on in any round

   Skip (do not act on):
   - A's findings B successfully refuted as false positives
   - Findings A and B disagreed on across rounds (the
     converged-changes verdict means there's enough alignment on the
     above to act; isolated disputes are left for the next loop)

3. **Auto-downgrade safety net.** Before writing anything, if any
   consensus finding would touch:
   - `vendor/wheels/security/**`, auth flows, password / token code
   - `vendor/wheels/middleware/**` auth-related middleware
   - Migration files under
     `vendor/wheels/migrator/**` or `app/migrator/migrations/**`
   - `cli/lucli/services/deploy/**` or anything under `wheels deploy`
   - `vendor/wheels/di/**` or DI container internals

   **Stop**. Post:

   ```
   ## Wheels Bot — Address Review held for human review

   The consensus findings touch a sensitive area (`<area>`) and the
   bot's safety net requires a human in the loop before any code
   change. The PR's reviewer-feedback exchange is preserved above
   for context.

   <!-- wheels-bot:address-held:<pr>:<sha> -->
   ```

   and exit.

4. **Branch-aware scope check.** Read the PR's head ref via
   `gh pr view <pr-number> --json headRefName -q '.headRefName'`:
   - `fix/bot-*` → may modify code, tests, CHANGELOG
   - `docs/bot-*` → doc paths only. If a consensus finding requires
     touching code, post `address-held` and exit (the PR's scope is
     wrong for that finding).

5. **Apply the consensus changes.** For each consensus finding:
   - Read the cited file
   - Make the smallest change that addresses the finding
   - For `fix/bot-*` PRs: after the changes, re-run any affected spec
     via `bash tools/test-local.sh <layer>` to confirm nothing
     regressed. Capture the output.

6. **Stage and commit.** Single conventional commit on the existing
   branch. Don't open a new branch — push back to the same branch
   the PR is on.

   - Type: `fix` (for `fix/bot-*` PRs) or `docs` (for `docs/bot-*` PRs)
   - Subject (≤ 100 chars):
     `address Reviewer A/B consensus findings (round <N>)`
   - Body: bullet list of what was addressed, with file references.

   ```bash
   git add <files>
   git commit -m "<message>"
   ```

   The workflow's "Push branch" step pushes after this prompt
   completes.

7. **Post the address-review comment** on the PR:

   ```
   ## Wheels Bot — Address Review (round <N>)

   Applied consensus findings from Reviewer A and Reviewer B's
   convergence (round <round-of-convergence-loop>):

   <bulleted list — what was addressed, file:line references>

   <if any findings were intentionally skipped because they weren't in
   the consensus, list them with "skipped: <reason>">

   The new commit will trigger a fresh Reviewer A run on the updated
   SHA. Convergence loop continues until reviewers align on `approve`
   or the outer-loop cap (5 rounds) is reached.

   <!-- wheels-bot:address-review:<pr>:<sha-before>:<N> -->
   ```

8. **Self-check before posting.**
   - [ ] Branch-aware scope check passed — no files modified outside
     allowed paths
   - [ ] For `fix/bot-*`: tests re-run, output cited in the comment
   - [ ] Commit message is conventional, subject ≤ 100 chars
   - [ ] PR comment includes the marker with the correct
     `<sha-before>` (the head SHA at the start of this run, not after
     your commit)
   - [ ] Outer-loop count is correctly reflected in the round number

   If any check fails, do not post; investigate and exit non-zero.
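
The idempotency and outer-loop-cap decision in step 1 reduces to two substring checks over the PR's comment bodies. The Python sketch below is illustrative only: `next_action` is a hypothetical helper, and it assumes the comment bodies have already been fetched (for example from the `gh pr view ... --json comments` call named in step 1) and that markers always carry full-length SHAs.

```python
MAX_ROUNDS = 5  # outer-loop cap stated in step 1


def next_action(pr: int, head_sha: str, comment_bodies: list[str]) -> tuple[str, int]:
    """Decide what the address-review stage should do, following step 1.

    Returns (action, n) where action is:
      "skip"     - a marker for the current head SHA exists; exit silently
      "terminal" - the cap is reached; post the hand-back comment and exit
      "run"      - proceed, with n as the round number
    """
    prefix = f"wheels-bot:address-review:{pr}:"
    # Idempotency: any marker already recorded against the current head SHA.
    if any(f"{prefix}{head_sha}:" in body for body in comment_bodies):
        return ("skip", 0)
    # Cap: count markers for ANY SHA on this PR.
    count = sum(1 for body in comment_bodies if prefix in body)
    if count >= MAX_ROUNDS:
        return ("terminal", count)
    return ("run", count + 1)
```

The trailing colon after the PR number in `prefix` keeps PR 12 from matching markers written for PR 123. Because the marker in step 7 records `<sha-before>`, a run that pushed a commit does not block the next run on the new head SHA.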
Lines changed: 161 additions & 0 deletions
@@ -0,0 +1,161 @@
# /advise-on-deadlock

Senior advisor for the wheels-bot review chain. Fires only when
Reviewer A and Reviewer B fail to converge on a recommendation after
the inner loop's 10-round cap (B emits a `:terminal` marker). Read
the full exchange, identify the specific disputed points, and issue a
tie-breaking verdict using deeper reasoning than the analytical
reviewers brought.

This is the only stage in the pipeline that runs **Opus on a
non-coding task**. The reasoning depth is justified because the
advisor's verdict overrides the analytical reviewers' deadlock and
drops back into the existing convergence flow — the advisor either
ends the loop (`converged-approve`) or triggers `bot-address-review.yml`
(`converged-changes`).

## Rails

Read `.claude/commands/_shared-rails.md` first. Highlights:

- Use `gh` for GitHub state and **read-only** `git`. No file writes,
  no edits, no commits.
- Output is **one PR comment** with the verdict + a convergence
  marker. No PR reviews, no code modifications.
- One-shot stage — fires once per terminal marker per SHA and exits.
  No iteration.

## Args

- `<pr-number>` — the PR with deadlocked A↔B exchange

## Steps

1. **Idempotency check.** Read PR comments via
   `gh pr view <pr-number> --json comments,headRefOid`. If any
   comment contains `wheels-bot:advisor:<pr>:<sha>` for the current
   head SHA, exit silently — already advised at this SHA.

2. **Confirm the deadlock.** Look for a comment containing
   `wheels-bot:review-b:<pr>:<sha>:terminal` for the current head
   SHA. That's the trigger marker. If no terminal marker is present
   for the current SHA, exit silently (this command shouldn't have
   fired).

3. **Read the full exchange.**
   - The PR diff via `gh pr diff <pr-number>`.
   - The PR title/body via `gh pr view <pr-number>` for original
     context (and the `Fixes #<issue>` link, if any — the original
     issue's framing matters).
   - All `wheels-bot[bot]` PR reviews on the current SHA: A's initial
     review and any response reviews
     (`wheels-bot:review-a-response:`).
   - All `wheels-bot[bot]` PR comments on the current SHA matching
     `wheels-bot:review-b:<pr>:<sha>:` — the full B critique chain in
     chronological order.

4. **Identify the deadlock.** Make a precise list of the SPECIFIC
   points where A and B disagreed and never resolved across rounds:
   - Findings A flagged but B persistently called false positives
     (and A defended).
   - Issues B raised but A persistently refuted (and B re-raised).
   - Verdict disagreements (A says `approve`, B says
     `request-changes`, or vice versa).

   For each disputed point, capture: A's position, B's position, and
   why neither yielded.

5. **Read the disputed code.** For each disputed point, `Read` the
   actual file at the cited line. Don't rely solely on quoted
   snippets in the exchange — the source of truth is the code on the
   PR's branch (you're checked out on its head SHA).

6. **Consult canonical references.** Before each ruling:
   - `.ai/wheels/<layer>/` for the layer in dispute (model, view,
     controller, etc.).
   - `CLAUDE.md` § "Critical Anti-Patterns" + § "Wheels Conventions" +
     § "Commit Message Conventions" — these are authoritative.
   - `.ai/wheels/cross-engine-compatibility.md` if the dispute
     touches Lucee/Adobe/BoxLang behavior.
   - Existing precedent: `Grep`/`Glob` for similar code elsewhere in
     the repo to see how this convention is handled when it's not
     contested.

7. **Rule on each disputed point.** For each, decide:
   - **A was right** (cite the evidence)
   - **B was right** (cite the evidence)
   - **Both partially right** (synthesize the actually-correct
     position)
   - **Neither was right** (the dispute itself was misframed; here's
     the real concern)

   Cite a concrete reference for each ruling — file:line, doc path,
   or both.

8. **Synthesize the verdict.** Roll up the per-point rulings into one
   final recommendation:

   - **`approve`** — disputed points were minor, or A and B were
     debating preferences rather than correctness. PR is fine to
     merge as-is. Use this when the per-point rulings are dominated
     by "both partially right" or "neither was right" outcomes.
   - **`changes`** — at least one disputed point clearly required a
     change (you ruled on a real correctness issue, anti-pattern, or
     security concern). Specify which findings address-review should
     act on (the ones you ruled in favor of the side requesting the
     change) and which to drop.

9. **Post the advisor comment** on the PR. Use
   `gh pr comment <pr-number> --body "<...>"`:

   ```
   ## Wheels Bot — Senior Advisor (deadlock resolution)

   Reviewer A and Reviewer B reached the 10-round inner-loop cap
   without converging. After re-reading the full exchange and the
   disputed code, here are the rulings on each contested point:

   ### Disputed points

   1. **<short title>** — A claimed `<A's position>`; B claimed
      `<B's position>`. **Ruling:** `<A right | B right | both
      partially | neither>`. **Evidence:** `<file:line | doc path |
      both>`. `<one-sentence reasoning>`.
   2. ...

   ### Verdict: `<approve | changes>`

   <one paragraph: synthesizing the rulings into the recommendation>

   <if verdict is `changes`, list the specific findings address-review
   should act on:>

   ### Findings for address-review to apply
   - **<finding>** at `<file:line>` — `<concrete action>`
   - ...

   <if verdict is `approve`, note that the disputed findings should be
   dropped and the PR is fine to merge as-is.>

   <!-- wheels-bot:advisor:<pr>:<sha> -->
   <CONVERGENCE_MARKER>
   ```

   Where `<CONVERGENCE_MARKER>` is:
   - `<!-- wheels-bot:converged-approve:<pr>:<sha> -->` if verdict is
     `approve`
   - `<!-- wheels-bot:converged-changes:<pr>:<sha> -->` if verdict is
     `changes` (triggers `bot-address-review.yml`)

10. **Self-check before posting.**
    - [ ] Each ruling cites a concrete file:line, doc path, or both —
      no vague handwaves.
    - [ ] Read the actual disputed code (not just exchange quotes).
    - [ ] Consulted `CLAUDE.md` and `.ai/wheels/` where applicable.
    - [ ] Verdict is one of `approve` or `changes` — not "kinda
      mostly", not equivocal.
    - [ ] Convergence marker is consistent with the verdict.
    - [ ] Advisor marker present.

    If any check fails, fix before posting. The advisor's verdict is
    authoritative within the convergence loop — get it right.
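
The gating in steps 1 and 2 and the marker pairing in step 9 can be sketched as two small functions. This is a hedged Python illustration, not the workflow's implementation: `should_advise` and `closing_markers` are hypothetical names, comment bodies are assumed to be already fetched as plain strings, and SHAs are assumed to be full-length so that a substring match cannot hit a longer SHA by accident.

```python
def should_advise(pr: int, head_sha: str, comment_bodies: list[str]) -> bool:
    """Steps 1-2: run only if B's terminal marker exists for this SHA
    and no advisor marker has been posted for it yet."""
    advised = f"wheels-bot:advisor:{pr}:{head_sha}"
    terminal = f"wheels-bot:review-b:{pr}:{head_sha}:terminal"
    if any(advised in body for body in comment_bodies):
        return False  # already advised at this SHA
    return any(terminal in body for body in comment_bodies)


def closing_markers(pr: int, head_sha: str, verdict: str) -> str:
    """Step 9: the advisor marker followed by the matching convergence marker."""
    if verdict not in ("approve", "changes"):
        # Mirrors the self-check: the verdict must be unequivocal.
        raise ValueError(f"verdict must be 'approve' or 'changes', got {verdict!r}")
    return (
        f"<!-- wheels-bot:advisor:{pr}:{head_sha} -->\n"
        f"<!-- wheels-bot:converged-{verdict}:{pr}:{head_sha} -->"
    )
```

Deriving the convergence marker from the verdict string, rather than writing it separately, keeps the two consistent by construction, which is what the self-check item on marker consistency asks for.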
