You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
1.1.0: Atomic signal protocol, unstick script, deadlock-resistant handoffs
Fixes a recurring cross-turn deadlock where one side wrote State.json
and the other side never woke up to see it. Signals now explicitly
pair the State.json flip with a backgrounded wait-for-state.sh watcher,
and a new unstick.sh diagnostic recovers from lingering stalls.
Copy file name to clipboardExpand all lines: commands/init.md
+8Lines changed: 8 additions & 0 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -369,6 +369,14 @@ Populate each with project-specific context. Key requirements:
369
369
370
370
**Generator.md should reference existing project skills** that are relevant (e.g., "Load `swift-code-context` before writing Swift code", "Load `swiftui-code-context` for SwiftUI work", "Use `review-swift-changes` for validation"). **For macOS apps, reference `macos-peekaboo` (runtime UI automation) and `macos-accessibility-ids` (so new/touched SwiftUI/AppKit views are automatable by Peekaboo and XCUITest out of the box).**
371
371
372
+
**Generator.md AND Evaluator.md must both include a top-of-file reminder** pointing at the Signal Protocol in SKILL.md. Suggested exact wording (add verbatim at the top of each file, under the first heading):
373
+
374
+
```markdown
375
+
> ⛔ **Signal Protocol — Atomic (NON-NEGOTIABLE):** every round handoff is the two-step SIGNAL from the Generator/Evaluator SKILL.md §"Signal Protocol" — flip State.json **and** launch `wait-for-state.sh` via `Bash run_in_background: true`**before the response ends**. A State.json write without the watcher deadlocks the loop; a foreground `ls`/`until` poll is NOT a substitute — those die when the turn ends. If the user ever asks "why did you stop?", run `scripts/unstick.sh <mission>` and follow the "If the user asks…" section of SKILL.md.
376
+
```
377
+
378
+
This reminder is the project-level safety net that makes the skill-level rule impossible to miss, even if the agent somehow skims past it in SKILL.md.
379
+
372
380
**Planner.md should reference ALL major documentation areas** — not just code docs. Include research folders, exploration docs, context directories, domain-specific reference material with "When to read" guidance.
ACTION="Evaluator should read Generator/Round-$ROUND.md and evaluate."
82
+
;;
83
+
"ready-for-eval/evaluating")
84
+
AT_FAULT="evaluator"
85
+
ACTION="Evaluator claimed evaluating but no verdict landed. Resume evaluation or signal done with the atomic template."
86
+
;;
87
+
"ready-for-eval/done")
88
+
AT_FAULT="generator"
89
+
ACTION="Generator should read Evaluator/Round-$ROUND.md (verdict: $VERDICT) and start the next round."
90
+
;;
91
+
"working/"*)
92
+
AT_FAULT="generator"
93
+
ACTION="Generator is working (healthy if actively implementing; stalled if no progress for 15+ min). If stalled, resume implementation and signal ready-for-eval when done."
94
+
;;
95
+
"researching/"*)
96
+
AT_FAULT="generator"
97
+
ACTION="Generator is in research mode. If the Evaluator is already watching, proceed to implementation."
98
+
;;
99
+
*)
100
+
ACTION="State combination ($G_STATUS/$E_STATUS) does not match a known waiting pattern — investigate manually."
101
+
;;
102
+
esac
103
+
104
+
# Flag the likely watcher-missing side separately from the at-fault side. The two can diverge: e.g. the Evaluator
105
+
# might be at-fault (their turn to act) AND their watcher is dead, which compounds the deadlock because their
106
+
# dead watcher means the next Generator signal will also be missed.
107
+
WATCHER_NOTE=""
108
+
if [[ "$AT_FAULT"=="evaluator"&&"$WATCHERS_EVAL_SIDE"=="0" ]];then
109
+
WATCHER_NOTE="⚠ Evaluator has no live watcher — it will not wake when the Generator signals again."
Copy file name to clipboardExpand all lines: skills/evaluator/SKILL.md
+95-29Lines changed: 95 additions & 29 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -18,6 +18,67 @@ You are the Evaluator. Your job is to verify the Generator's work against the sp
18
18
2.**Report format** is in `templates/Evaluator-Round-Format.md`. **Strategies** are in `strategies/`.
19
19
3. Do NOT use subagents for evaluation — you and Codex are the two independent evaluators.
20
20
21
+
## ⛔ Signal Protocol — Atomic (NON-NEGOTIABLE) ⛔
22
+
23
+
**A "signal" from the Evaluator to the Generator is NOT just a State.json write. It is a two-step atomic operation, and both steps must happen before your response ends. Skipping the second step deadlocks the loop — the Generator can flip to `ready-for-eval` in the next round but nothing will wake you to respond.**
24
+
25
+
The same applies to the readiness signal at Step 2 (`evaluatorStatus: watching`) and to the "keep watching" watchers after every verdict.
26
+
27
+
### The SIGNAL template — use this EVERY time you hand off or wait for a round
**A signal is incomplete without both steps.** If you wrote Step 1 and did not start Step 2 before the response ended, you violated the protocol. The Generator's next round signal will sit unseen until the user manually intervenes.
44
+
45
+
### Why it MUST be the backgrounded watcher
46
+
47
+
Within one turn, foreground `ls` polls or `until` loops inside a single Bash call work fine. But **the moment your response ends, foreground polls die**. The only thing that wakes you across turn boundaries is a `run_in_background: true` Bash task completing and firing a `<task-notification>` into your session. `wait-for-state.sh` exists specifically for this purpose:
48
+
49
+
- It uses watchman-wait when available, else md5-polls State.json every 5 seconds.
50
+
- When the watched field matches, it prints `READY` and exits cleanly.
51
+
- Exit → Claude Code fires `<task-notification>` → your next turn starts automatically → read current State.json, do the work, signal again (with the same atomic template).
52
+
53
+
### Before your response ends — pre-flight checklist
54
+
55
+
If your response is about to end, verify **all three** of these:
56
+
-[ ] State.json is in the correct state (`watching` / `evaluating` / `done` + `verdict` if applicable).
57
+
-[ ] A `wait-for-state.sh … generatorStatus ready-for-eval` watcher is running via `Bash run_in_background: true` (plus a `phase complete` watcher after a verdict — see Step 6).
58
+
-[ ] The last thing you did was a tool call (ideally the watcher launch), not explanatory text. Closing narration like "Watching for next round…" = the deadlock pattern.
59
+
60
+
If any box is unchecked: **do not let the response end.** Fix it with another tool call.
61
+
62
+
### Why this is non-negotiable
63
+
64
+
This pattern has caused real cross-turn deadlocks in live missions in BOTH directions — Evaluator PASSes sitting unseen because the Generator didn't arm its wake-up watcher, and Generator signals sitting unseen because the Evaluator ended its response after writing the verdict without arming the next-round watcher. The atomic template above is the only reliable fix.
65
+
66
+
## If the user asks "why did you stop?" / "what are you waiting for?" / "why are both frozen?"
67
+
68
+
Treat it as an unstick request. Run the diagnostic:
-**If YOU (Evaluator) are at-fault:** resume work immediately — re-read State.json, pick up the current round, do the next step per this SKILL. Re-signal at the end with the atomic template. No `--touch` needed; doing the work IS the fix.
78
+
-**If the Generator is at-fault and your watcher is alive:** no action — your watcher will fire when they move. Show the diagnosis to the user.
79
+
-**If the Generator is at-fault and your watcher is dead:** re-arm it immediately (Step 6 of this SKILL — both watchers) so their eventual signal doesn't get missed a second time.
80
+
-**If the diagnosis says your watcher is alive but the Generator is stuck:** re-run with `--touch` to refresh State.json's mtime. That re-fires the Generator's live watcher if theirs is still alive. If that doesn't wake them, their session is dead — only the user can nudge it directly.
81
+
21
82
## Mindset + Anti-Bias Rules
22
83
23
84
-**Assume the Generator made mistakes.** Your job is to find them.
@@ -82,19 +143,23 @@ The user invokes this skill with `/tandemkit:evaluator NNN-MissionName`. First r
82
143
4. Read any `UserFeedback/` files if this is a post-feedback round
83
144
5.**Scan `.claude/skills/` for skills relevant to this mission's topic.** Load any that seem related — they may contain domain knowledge, validation rules, or conventions critical for correct evaluation. If the Spec mentions specific skills, load those too.
84
145
85
-
## Step 2 — Signal Readiness and Wait for Generator
146
+
## Step 2 — Signal Readiness and Wait for Generator (ATOMIC SIGNAL)
147
+
148
+
This is a SIGNAL per the "⛔ Signal Protocol" section above. Both halves mandatory before response ends.
149
+
150
+
5.**Half 1: flip State.json** → `evaluatorStatus: "watching"`. Read-modify-write only your field.
86
151
87
-
5. Update State.json: `evaluatorStatus: "watching"`. Read-modify-write only your field.
88
-
6. Check `generatorStatus`:
89
-
-`"ready-for-eval"` → Proceed to Step 3 immediately.
90
-
- Any other value → Wait:
152
+
6.**Half 2: launch the wake-up watcher.** Check `generatorStatus`:
153
+
- If already `"ready-for-eval"` → proceed to Step 3 immediately within this turn (no watcher needed — just continue).
154
+
- Otherwise → **before ending this response**, launch the watcher via `Bash run_in_background: true`:
Run with `run_in_background: true`. When it prints "READY", proceed to Step 3.
159
+
The script's exit fires a `<task-notification>` that auto-starts your next turn with Step 3. Foreground polls won't survive the turn boundary.
95
160
96
161
════════════════════════════════════════
97
-
→ Watching — Waitingfor Generator
162
+
→ Watching — watcher armed, waitingfor Generator
98
163
════════════════════════════════════════
99
164
100
165
## Step 3 — Parallel Independent Evaluation (Round 1 of each eval cycle)
@@ -239,37 +304,38 @@ The user invokes this skill with `/tandemkit:evaluator NNN-MissionName`. First r
239
304
240
305
**Efficiency tip:** When the changes between rounds are small (e.g., a few findings adjusted, one section updated), consider copying the previous file and editing only the changed parts (`cp` + Edit tool) instead of writing the entire file from scratch. This saves output tokens and time. Use your judgment — if the restructuring is substantial, a fresh Write is cleaner.
241
306
242
-
## Step 5 — Signal Generator
307
+
## Step 5 — Signal Generator + Arm Watchers (ATOMIC SIGNAL)
308
+
309
+
**This is a SIGNAL per the "⛔ Signal Protocol" section above. All three halves mandatory before response ends.** Writing the verdict without arming the watchers is the deadlock pattern — do not skip any half.
When it fires, print the closing banner and stop — the mission is over.
245
326
246
327
════════════════════════════════════════
247
-
→ Verdict: [PASS/FAIL/PASS_WITH_GAPS/BLOCKED] — Watching for next round
328
+
→ Verdict: [PASS/FAIL/PASS_WITH_GAPS/BLOCKED]
329
+
→ Both watchers armed — response may end safely
248
330
════════════════════════════════════════
249
331
250
332
## Step 6 — Keep Watching
251
333
252
334
**CRITICAL: You are NEVER done until `phase` is `"complete"`.** A PASS verdict does NOT end your watch duty. The user may give feedback, the Generator will iterate, and you will evaluate again. Only `phase: "complete"` (set by the user through the Generator) or the user exiting your session ends your job.
253
335
254
-
After writing your verdict, IMMEDIATELY start TWO background watchers:
Run both with `run_in_background: true`. When either returns:
269
-
- If `generatorStatus: ready-for-eval` → go back to **Step 3**
270
-
- If `phase: complete` → print the closing banner and stop
336
+
The two watchers from Step 5 Halves 2 and 3 are the mechanism that lets you end the current response safely. Their completion triggers new turns automatically.
271
337
272
-
If a watch times out (10 minutes), re-read State.json and restart the watchers. NEVER go idle.
338
+
If a watcher times out (default 10 min in `wait-for-state.sh`), the runtime delivers a completion notification anyway — on that wake-up, re-read State.json, decide next action, and re-arm watchers if still waiting. NEVER go idle without a watcher armed.
0 commit comments