Commit 7465c7c

docs(workflows): teach repairable gate patterns (#32)

* docs(workflows): teach repairable gate patterns
* docs(workflows): add post-repair proof gates

1 parent c1bc425 commit 7465c7c

3 files changed

Lines changed: 233 additions & 38 deletions

prpm.json

Lines changed: 4 additions & 4 deletions
@@ -1,6 +1,6 @@
  {
  "name": "agent-workforce-skills",
- "version": "1.0.1",
+ "version": "1.0.2",
  "description": "Skills for multi-agent coordination - swarm patterns, workflow building, relay usage, and headless orchestration",
  "author": "khaliqgant",
  "organization": "agent-relay",
@@ -28,7 +28,7 @@
  },
  {
  "name": "writing-agent-relay-workflows",
- "version": "1.6.2",
+ "version": "1.6.3",
  "description": "Use when building multi-agent workflows with the relay broker-sdk - covers conversation-shape vs pipeline-shape coordination, WorkflowBuilder API, DAG step dependencies, agent definitions, output chaining via {{steps.X.output}}, verification gates, evidence-based completion, channels, swarm patterns, chat-native coordination recipes (Q/A, broadcast-ack, peer review, standup, hand-off), error handling, event listeners, step sizing, lead+workers team pattern, and parallel wave planning",
  "format": "claude",
  "subtype": "skill",
@@ -107,7 +107,7 @@
  },
  {
  "name": "relay-80-100-workflow",
- "version": "1.0.0",
+ "version": "1.0.1",
  "description": "Use when writing agent-relay workflows that must fully validate features end-to-end before merging - covers the 80-to-100 pattern with PGlite in-memory Postgres testing, mock sandbox patterns, test-fix-rerun loops, verify gates, and full lifecycle from implementation through passing tests to commit",
  "format": "claude",
  "subtype": "skill",
@@ -166,7 +166,7 @@
  "id": "agent-relay-starter",
  "name": "Agent Relay Starter",
  "description": "Essential skills for building multi-agent systems with Agent Relay - swarm pattern selection, workflow authoring, and trail debugging",
- "version": "1.0.3",
+ "version": "1.0.4",
  "category": "development",
  "tags": [
  "multi-agent",

skills/relay-80-100-workflow/SKILL.md

Lines changed: 83 additions & 18 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
---
22
name: relay-80-100-workflow
3-
description: Use when writing agent-relay workflows that must fully validate features end-to-end before merging. Covers the 80-to-100 pattern - going beyond "code compiles" to "feature works, tested E2E locally." Includes PGlite for in-memory Postgres testing, mock sandbox patterns, test-fix-rerun loops, verify gates after every edit, and the full lifecycle from implementation through passing tests to commit.
3+
description: Use when writing agent-relay workflows that must fully validate features end-to-end before merging. Covers the 80-to-100 pattern - going beyond "code compiles" to "feature works, tested E2E locally." Includes repair-before-failure validation gates, PGlite for in-memory Postgres testing, mock sandbox patterns, test-fix-rerun loops, verify gates after every edit, and the full lifecycle from implementation through passing tests to commit.
44
---
55

66
# Writing 80-to-100 Validated Workflows
@@ -26,9 +26,22 @@ implement → write tests → run tests → fix failures → re-run → build ch

  This means the commit at the end of the workflow represents code that is **proven working**, not just code that an agent wrote and claimed works.

+ ## Repair Before Failure
+
+ An 80-to-100 workflow should not stop merely because a test, typecheck, lint, schema, or E2E gate turns red. That red output is work for the agent team. Capture it, hand it to a repair owner, fix it, and rerun. Reserve workflow failure for cases the team cannot repair in the current run, such as missing credentials, wrong repository, exhausted repair budget, or an unsafe dirty worktree.
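
The split between a repairable red gate and a genuine workflow failure can be sketched as a small decision function. This is an illustration only: `Blocker`, `GateResult`, and `nextAction` are hypothetical names, not relay SDK exports, and the blocker list simply mirrors the examples named in the paragraph above.

```typescript
// Hypothetical sketch: none of these names come from the relay SDK.
// Blockers the agent team cannot repair within the current run.
type Blocker =
  | 'missing-credentials'
  | 'wrong-repository'
  | 'unsafe-dirty-worktree';

interface GateResult {
  exitCode: number;   // exit code of the deterministic gate command
  output: string;     // captured output handed to the repair owner
  blocker?: Blocker;  // set only when the failure is not repairable
}

type Action = 'continue' | 'repair' | 'fail';

// Decide what happens after a gate run, given the remaining repair attempts.
function nextAction(result: GateResult, repairsLeft: number): Action {
  if (result.exitCode === 0) return 'continue';     // green: move on
  if (result.blocker !== undefined) return 'fail';  // cannot be fixed in this run
  if (repairsLeft <= 0) return 'fail';              // repair budget exhausted
  return 'repair';                                  // red output is work for the team
}
```

Only the two `fail` branches end the workflow; every other red result is routed to a repair owner.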
+
+ Use this shape for every meaningful gate:
+
+ 1. `run-*`: deterministic command with `captureOutput: true` and `failOnError: false`.
+ 2. `fix-*`: agent step that reads `{{steps.run-*.output}}`, fixes source/tests/config, and reruns the command locally until green.
+ 3. `verify-*`: deterministic rerun, usually still `failOnError: false`, followed by a final repair step if red.
+ 4. `commit-if-green`: deterministic step that reruns the full acceptance command and commits only when every exit code is zero.
+
+ AgentWorkforce/relay#827 added repair-aware reliability to the SDK (`.reliable()` / `.repairable()` and repair-aware retry-mode workflows). Prefer those presets when available, but still model explicit repair owners when gate output needs domain-specific fixing.
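
The gate shape listed above can be modeled as plain data to make the naming and dependency chain concrete. This is a minimal sketch under stated assumptions: the field names mirror the step options used in this skill's examples, but the `Step` types and the `repairableGate` and `commitIfGreen` helpers are hypothetical and not part of the relay SDK.

```typescript
// Hypothetical plain-data model of the repairable gate shape.
interface DeterministicStep {
  name: string;
  type: 'deterministic';
  dependsOn: string[];
  command: string;
  captureOutput: boolean;
  failOnError: boolean;
}

interface AgentStep {
  name: string;
  agent: string;
  dependsOn: string[];
  task: string;
  verification: { type: 'exit_code' };
}

type Step = DeterministicStep | AgentStep;

// Build run-* -> fix-* -> verify-* -> fix-*-final for one gate command.
function repairableGate(id: string, command: string, owner: string, after: string[] = []): Step[] {
  const run = `run-${id}`;
  const fix = `fix-${id}`;
  const verify = `verify-${id}`;
  return [
    { name: run, type: 'deterministic', dependsOn: after, command, captureOutput: true, failOnError: false },
    {
      name: fix,
      agent: owner,
      dependsOn: [run],
      task: `Fix any failures and rerun until green:\n{{steps.${run}.output}}`,
      verification: { type: 'exit_code' },
    },
    { name: verify, type: 'deterministic', dependsOn: [fix], command, captureOutput: true, failOnError: false },
    {
      name: `${fix}-final`,
      agent: owner,
      dependsOn: [verify],
      task: `If the rerun is still red, fix and rerun until green:\n{{steps.${verify}.output}}`,
      verification: { type: 'exit_code' },
    },
  ];
}

// commit-if-green: joining with && means the commit command runs only when
// every acceptance command before it exited with zero.
function commitIfGreen(acceptance: string[], commit: string, after: string[]): DeterministicStep {
  return {
    name: 'commit-if-green',
    type: 'deterministic',
    dependsOn: after,
    command: [...acceptance, commit].join(' && '),
    captureOutput: true,
    failOnError: false,
  };
}
```

For example, `repairableGate('tests', 'npm test', 'tester')` yields steps named `run-tests`, `fix-tests`, `verify-tests`, and `fix-tests-final`, each depending on the one before it.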
+
  ## The Test-Fix-Rerun Pattern

- Every testable feature in a workflow should follow this three-step pattern:
+ Every testable feature in a workflow should follow this four-step pattern:

  ```typescript
  // Step 1: Run tests (allow failure — we expect issues on first run)
@@ -58,20 +71,31 @@ If there are failures:
  verification: { type: 'exit_code' },
  })

- // Step 3: Deterministic final run — this one MUST pass
+ // Step 3: Deterministic rerun — capture result for a final repair pass
  .step('run-tests-final', {
  type: 'deterministic',
  dependsOn: ['fix-tests'],
  command: 'npx tsx --test tests/my-feature.test.ts 2>&1',
  captureOutput: true,
- failOnError: true, // <-- Hard fail if tests still broken
+ failOnError: false,
+ })
+
+ // Step 4: Repair again if the rerun is still red
+ .step('fix-tests-final', {
+ agent: 'tester',
+ dependsOn: ['run-tests-final'],
+ task: `If the final test rerun passed, record the green evidence.
+ If it failed, fix the remaining issue and rerun until green:
+ {{steps.run-tests-final.output}}`,
+ verification: { type: 'exit_code' },
  })
  ```

- **Why three steps instead of one?**
+ **Why four steps instead of one?**
  - The first run captures output for the agent to diagnose
  - The agent step can iterate (read errors, fix, re-run) multiple times
- - The final deterministic run is the gate — no agent judgment, just pass/fail
+ - The final deterministic run is still evidence-based, but a repair agent sees it before the workflow stops
+ - The last repair step keeps the workflow aligned with the agent-team model instead of ending on a fixable failure
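
The control flow behind the four steps can be traced with a small in-memory model. This is a sketch under assumptions: `testFixRerun`, `runTests`, and `repair` are hypothetical stand-ins for the deterministic and agent steps, and the real workflow runs them as separate steps, not as one function.

```typescript
// Hypothetical in-memory model of run -> fix -> rerun -> final fix.
interface RunResult {
  exitCode: number;
  output: string;
}

function testFixRerun(
  runTests: () => RunResult,           // stands in for a deterministic test step
  repair: (evidence: string) => void,  // stands in for an agent repair step
): { green: boolean; evidence: string[] } {
  const evidence: string[] = [];

  // Step 1: run and capture output without failing the workflow.
  const first = runTests();
  evidence.push(first.output);

  // Step 2: the repair owner works from the captured output.
  if (first.exitCode !== 0) repair(first.output);

  // Step 3: deterministic rerun, still captured instead of hard-failing.
  const rerun = runTests();
  evidence.push(rerun.output);
  if (rerun.exitCode === 0) return { green: true, evidence };

  // Step 4: one more repair pass, then confirm with a fresh run.
  repair(rerun.output);
  const last = runTests();
  evidence.push(last.output);
  return { green: last.exitCode === 0, evidence };
}
```

The returned `evidence` array keeps every captured output, so the final state rests on recorded runs and not on an agent's claim.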

  ## PGlite: In-Memory Postgres for Database Testing

@@ -170,9 +194,15 @@ Never trust that an agent edited a file correctly. Add a deterministic verify ga
  dependsOn: ['edit-schema'],
  command: `if git diff --quiet packages/web/lib/db/schema.ts; then echo "NOT MODIFIED"; exit 1; fi
  grep "my_new_table" packages/web/lib/db/schema.ts >/dev/null && echo "OK" || (echo "MISSING"; exit 1)`,
- failOnError: true,
+ failOnError: false,
  captureOutput: true,
  })
+ .step('fix-schema-verification', {
+ agent: 'impl',
+ dependsOn: ['verify-schema'],
+ task: `Fix the schema edit if verification failed. Output:\n{{steps.verify-schema.output}}`,
+ verification: { type: 'exit_code' },
+ })
  ```

  **What to verify:**
@@ -268,6 +298,7 @@ const result = await workflow('my-feature')
  .channel('wf-my-feature')
  .maxConcurrency(3)
  .timeout(3_600_000)
+ .repairable()

  .agent('impl', { cli: 'claude', preset: 'worker', retries: 2 })
  .agent('tester', { cli: 'claude', preset: 'worker', retries: 2 })
@@ -293,9 +324,15 @@ Only edit this one file.`,
  type: 'deterministic',
  dependsOn: ['edit-target'],
  command: 'if git diff --quiet path/to/file.ts; then echo "NOT MODIFIED"; exit 1; fi; echo "OK"',
- failOnError: true,
+ failOnError: false,
  captureOutput: true,
  })
+ .step('fix-target-verification', {
+ agent: 'impl',
+ dependsOn: ['verify-target'],
+ task: `Fix the target edit if verification failed. Output:\n{{steps.verify-target.output}}`,
+ verification: { type: 'exit_code' },
+ })

  // ── Phase 3: Test infrastructure ─────────────────────────────────
  .step('install-pglite', {
@@ -311,7 +348,7 @@ Only edit this one file.`,
  })
  .step('create-tests', {
  agent: 'tester',
- dependsOn: ['create-test-helpers', 'verify-target'],
+ dependsOn: ['create-test-helpers', 'fix-target-verification'],
  task: 'Create tests/my-feature.test.ts with <test descriptions>...',
  verification: { type: 'file_exists', value: 'tests/my-feature.test.ts' },
  })
@@ -335,13 +372,19 @@ Only edit this one file.`,
  dependsOn: ['fix-tests'],
  command: 'npx tsx --test tests/my-feature.test.ts 2>&1',
  captureOutput: true,
- failOnError: true,
+ failOnError: false,
+ })
+ .step('fix-tests-final', {
+ agent: 'tester',
+ dependsOn: ['run-tests-final'],
+ task: `If the final test rerun is red, fix and rerun until green. Output:\n{{steps.run-tests-final.output}}`,
+ verification: { type: 'exit_code' },
  })

  // ── Phase 5: Build + regression ──────────────────────────────────
  .step('build-check', {
  type: 'deterministic',
- dependsOn: ['run-tests-final'],
+ dependsOn: ['fix-tests-final'],
  command: 'out=$(npx tsc --noEmit 2>&1); code=$?; echo "$out" | tail -20; echo "EXIT: $code"',
  captureOutput: true,
  failOnError: false,
@@ -370,7 +413,28 @@ Only edit this one file.`,
  .step('commit', {
  type: 'deterministic',
  dependsOn: ['fix-regressions'],
- command: 'git add <files> && git commit -m "feat: ..."',
+ command: [
+ 'npx tsx --test tests/my-feature.test.ts',
+ 'npm test',
+ 'git add <files>',
+ 'git commit -m "feat: ..."',
+ ].join(' && '),
+ captureOutput: true,
+ failOnError: false,
+ })
+ .step('repair-commit', {
+ agent: 'impl',
+ dependsOn: ['commit'],
+ task: `If commit failed, fix the blocker, rerun the feature and regression tests, and create the commit.
+ If commit passed, confirm the commit subject.
+ Output:
+ {{steps.commit.output}}`,
+ verification: { type: 'exit_code' },
+ })
+ .step('verify-commit-created', {
+ type: 'deterministic',
+ dependsOn: ['repair-commit'],
+ command: 'git log -1 --pretty=%s | grep -q "^feat: " && echo "COMMIT_OK" || (echo "COMMIT_MISSING"; exit 1)',
  captureOutput: true,
  failOnError: true,
  })
@@ -386,21 +450,22 @@ Only edit this one file.`,
  | Tests exist | `file_exists` verification on test file |
  | Tests actually run | Deterministic step executes them |
  | Test failures get fixed | Agent step reads output, fixes, re-runs |
- | Final test run is hard-gated | `failOnError: true` on last test step |
+ | Final test run is repairable | Deterministic rerun captures output, then a repair owner gets one more pass |
  | Build passes | `npx tsc --noEmit` deterministic step |
  | No regressions | Existing test suite runs after changes |
- | Every edit is verified | `git diff --quiet` + grep after each agent edit |
- | Commit only happens after all gates | `dependsOn` chains to final verification |
+ | Every edit is verified and repairable | `git diff --quiet` + grep after each agent edit, followed by a fix step |
+ | Commit only happens after green evidence | Final commit step reruns acceptance checks and commits only on zero exit codes |

  ## Common Anti-Patterns

  | Anti-pattern | Why it fails | Fix |
  |-------------|-------------|-----|
  | Tests written but never executed | Agent claims they pass, they don't | Add deterministic `run-tests` step |
- | Single `failOnError: true` test run | First failure kills workflow, no chance to fix | Use the three-step test-fix-rerun pattern |
+ | Single `failOnError: true` test run | First failure kills workflow, no chance to fix | Use repairable run-fix-rerun-final-fix loops |
  | No regression test | New feature works, old features break | Run `npm test` after build check |
  | Agent asked to "write and run tests" in one step | Agent writes tests, runs them, they fail, it edits, output is garbled | Separate write/run/fix into distinct steps |
  | PGlite DDL doesn't match Drizzle schema | Tests pass on wrong schema | Derive DDL from schema.ts or test with real migration |
- | `failOnError: false` on final test run | Broken tests get committed | Always `failOnError: true` on the gate step |
+ | Final test output not handed to an agent | Broken tests can stop the run or get ignored | Add a final repair owner before commit |
  | Testing only happy path | Edge cases break in prod | Specify edge case tests in the task prompt |
- | No verify gate after agent edits | Agent exits 0 without writing anything | Add `git diff --quiet` check after every edit |
+ | No verify gate after agent edits | Agent exits 0 without writing anything | Add `git diff --quiet` check after every edit, then route failures to a repair step |
+ | Committing after `failOnError: false` without checking exits | Broken work can be committed because the shell step returned successfully | In `commit-if-green`, record each exit code and skip commit unless all are zero |
