Add /shipLane command and portable ship-lane playbook

Introduces an autonomous PR-to-merge driver that runs automate → finalize
once, then polls CI and review comments on a self-paced 12-min cadence,
fixing valid comments and failing tests in place. Prefers TeamCreate agent
teams when available, falls back to parallel Agent calls otherwise. Opens
the PR via `ade prs create` when possible so it shows up in ADE's PR
tracking; falls back to `gh pr create` only after the agent has genuinely
exhausted the ADE path via `--help`-driven discovery.

Also narrows /automate to run only the new and affected tests (not the
full suite), and makes /finalize's 8-shard parallel run explicit so shards
don't get chained serially.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

.claude/commands/automate.md
+20 -39 (20 additions & 39 deletions)
@@ -35,7 +35,7 @@ Phase 3: Parallel test writing (agents)
 ├── desktop-tester-1..N (desktop app tests)
 └── mcp-tester (mcp server tests, if applicable)
 Phase 4: Test reality check (lead, after all testers done)
-Phase 5: Full test run (lead)
+Phase 5: Scoped test run (new + affected) (lead)
 Phase 6: CI verification (lead)
 Phase 7: Summary (lead)
 ```
@@ -247,60 +247,40 @@ If issues are found, fix them directly.

 ---

-## Phase 5: Full Test Run
+## Phase 5: Scoped Test Run

-After reality check passes, run ALL created tests to confirm everything passes together.
+Verify the tests **this command just wrote** pass. Do NOT run the full suite — that is `/finalize`'s job, and running it here doubles the wait with no new signal.

-### 5a. Desktop tests (all new test files)
+### 5a. New test files together

-Run new test files together first:
+Run every test file created in Phase 3 in a single invocation:

 ```bash
 cd apps/desktop && npx vitest run [space-separated list of all new test files]
 ```

-### 5b. Desktop tests (full sharded run — match CI)
+All new tests must pass. If any fail, fix in place and re-run only the failing files.

-Run the full suite the same way CI does — sharded 8-way. Run all 8 shards in parallel:
+### 5b. Affected existing tests

-```bash
-cd apps/desktop && npx vitest run --shard=1/8
-cd apps/desktop && npx vitest run --shard=2/8
-cd apps/desktop && npx vitest run --shard=3/8
-cd apps/desktop && npx vitest run --shard=4/8
-cd apps/desktop && npx vitest run --shard=5/8
-cd apps/desktop && npx vitest run --shard=6/8
-cd apps/desktop && npx vitest run --shard=7/8
-cd apps/desktop && npx vitest run --shard=8/8
-```
-
-Or run a specific workspace project:
-
-```bash
-cd apps/desktop && npx vitest run --project unit-main
-cd apps/desktop && npx vitest run --project unit-renderer
-cd apps/desktop && npx vitest run --project unit-shared
-```
-
-### 5c. MCP server tests (if applicable)
-
-```bash
-cd apps/mcp-server && npm test
-```
-
-### 5d. Run affected existing tests
-
-If code changes could break existing tests (e.g., changed a service function's signature), run those existing test files too:
+If the branch's source changes could break existing tests (e.g., changed a service function's signature, renamed an exported type, altered shared contracts), run those existing test files — NOT the full suite:

 ```bash
 cd apps/desktop && npx vitest run [affected existing test files]
 ```

+Scope "affected" narrowly — direct importers of touched modules and their test siblings. Do not expand to "everything in the same feature folder."
+
 **If tests fail:**
 - Check if it's a flaky test (retry once)
 - If a specific test fails consistently, fix it and re-run only that file
 - Do NOT re-run all tests — only the failed ones

+### 5c. Not this command's job
+
+- **Full sharded suite run:** `/finalize` runs all 8 shards (and `test-ade-cli`) the same way CI does. Skip it here.
+- **Build / typecheck / lint:** also deferred to `/finalize`.

.claude/commands/finalize.md
+28 -12 (28 additions & 12 deletions)
@@ -285,35 +285,51 @@ cd apps/web && npm run typecheck
 cd apps/desktop && npm run lint
 ```

-### 3e. Desktop tests (sharded — match CI exactly)
+### 3e. Desktop tests — full suite, sharded 8-way, run in PARALLEL

-Shard like CI (8 shards in parallel) to avoid timeout. The workspace has 3 projects (`unit-main`, `unit-renderer`, `unit-shared`) — sharding runs across all of them automatically:
+`/finalize` is the gate that runs the whole test suite. Run **all 8 shards concurrently** — not sequentially. Running them serially takes 8× longer and masks real CI wall-clock behavior.
+
+The command must be identical to `.github/workflows/ci.yml` (job `test-desktop`, matrix shard 1–8, step at line 139):
+
+```
+- run: cd apps/desktop && npx vitest run --shard=${{ matrix.shard }}/8
+```
+
+Locally that maps to 8 parallel Bash invocations in a single tool-call round:

 ```bash
-cd apps/desktop && npx vitest run --shard=1/8
-cd apps/desktop && npx vitest run --shard=2/8
-cd apps/desktop && npx vitest run --shard=3/8
-cd apps/desktop && npx vitest run --shard=4/8
-cd apps/desktop && npx vitest run --shard=5/8
-cd apps/desktop && npx vitest run --shard=6/8
-cd apps/desktop && npx vitest run --shard=7/8
-cd apps/desktop && npx vitest run --shard=8/8
+cd apps/desktop && npx vitest run --shard=1/8  # shard 1 of 8
+cd apps/desktop && npx vitest run --shard=2/8  # shard 2 of 8
+cd apps/desktop && npx vitest run --shard=3/8  # shard 3 of 8
+cd apps/desktop && npx vitest run --shard=4/8  # shard 4 of 8
+cd apps/desktop && npx vitest run --shard=5/8  # shard 5 of 8
+cd apps/desktop && npx vitest run --shard=6/8  # shard 6 of 8
+cd apps/desktop && npx vitest run --shard=7/8  # shard 7 of 8
+cd apps/desktop && npx vitest run --shard=8/8  # shard 8 of 8
 ```

-Or run specific projects when you only need a subset:
+Issue these as 8 concurrent Bash tool calls in a single message (one call per shard). Do not chain them with `&&` or `;` or run them one at a time. The workspace has 3 projects (`unit-main`, `unit-renderer`, `unit-shared`) — sharding distributes across all three automatically.
+
+If a shard fails, re-run **only that shard** (or, better, only the specific failing test file inside it). Never re-run all 8 shards to verify a one-file fix.
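
Outside the agent harness (for example, from a plain terminal), the same 8-way concurrency can be approximated with background jobs. The sketch below is an illustration under stated assumptions and is not part of the commit: `run_shards` and `vitest_shard` are hypothetical helper names, and writing per-shard logs to a temporary directory is this sketch's own choice. It starts every shard in the background, waits for each one, and reports which shard numbers failed so that only those need a re-run.

```bash
# Hypothetical helper (not in the commit): run <total> shards concurrently.
# Usage: run_shards <total> <command...>; the command receives "<i> <total>".
run_shards() {
  total="$1"
  shift
  log_dir="$(mktemp -d)"   # per-shard logs; location is this sketch's choice
  pids=""
  i=1
  while [ "$i" -le "$total" ]; do
    # Start each shard in the background so all of them run at once.
    "$@" "$i" "$total" > "$log_dir/shard-$i.log" 2>&1 &
    pids="$pids $!"
    i=$((i + 1))
  done

  failed=""
  i=1
  for pid in $pids; do
    # wait returns the exit status of that specific background job.
    wait "$pid" || failed="$failed $i"
    i=$((i + 1))
  done

  echo "logs: $log_dir" >&2
  echo "failed shards:${failed:- none}"
  [ -z "$failed" ]
}

# The shard command from the diff, wrapped so it accepts "<i> <total>".
vitest_shard() {
  (cd apps/desktop && npx vitest run --shard="$1/$2")
}

# Example (not executed here): run_shards 8 vitest_shard
```

The function returns non-zero when any shard fails, so it can gate a later step; the failed shard numbers on stdout are the ones to re-run individually.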
+
+Workspace-project subsets exist for debugging only; they are NOT a substitute for the sharded run in `/finalize`:

 ```bash
 cd apps/desktop && npx vitest run --project unit-main # ~150+ main-process tests
 cd apps/desktop && npx vitest run --project unit-renderer # ~85+ renderer tests
 cd apps/desktop && npx vitest run --project unit-shared # ~7 shared/preload tests
 ```

-### 3f. ADE CLI tests
+### 3f. ADE CLI tests — separate CI job, run alongside the 8 shards
+
+CI runs `test-ade-cli` as its own parallel job (`.github/workflows/ci.yml:156`). Locally, include it in the same parallel tool-call round as the 8 desktop shards — it's effectively a 9th concurrent invocation, not something to run after:

 ```bash
 cd apps/ade-cli && npm test
 ```

+Do NOT run apps/mcp-server tests — the MCP server was removed; the agent-facing surface lives in `apps/ade-cli`.

+description: 'Autonomously drive a lane through CI + review until merged (automate → finalize → poll/fix loop, self-paced wake-ups, max 5 iterations)'
+---
+
+# Ship Lane Command
+
+Drive the current lane from "work is ready" to "merged on main" without manual shepherding.
+
+**Usage:**
+- `/shipLane` — auto-detects state (existing PR on current branch, or needs initial push)
+- `/shipLane <pr-number>` — operate on a specific PR (useful if you checked out a different branch mid-loop)
+
+**Arguments:** $ARGUMENTS
+
+---
+
+## Source of truth
+
+**Follow the playbook at `docs/playbooks/ship-lane.md`.** All phase logic, state schema, commands, decision rules, and bot-ping rules live there. This wrapper only defines how Claude Code's team + wake-up primitives map onto the playbook.
+
+If you are re-invoked by a scheduled wake-up, read `.ade/shipLane/<sanitized-branch>.json` first. If `status == running`, skip Phase 0 and go straight to Phase 1.
+
+---
+
+## Execution mode: autonomous
+
+This command runs end-to-end without user interaction. Do NOT:
+- Ask the user to confirm, choose, or approve anything.
+- Pause between phases to request direction.
+- Stop on non-fatal warnings — log them and continue.
+- Ask whether to apply a fix — apply, verify, commit.
+
+The only user-visible output is the per-iteration summary and the final Phase 5 exit summary.
+
+---
+
+## Concurrency: TeamCreate is MANDATORY
+
+Check the available tools. If `TeamCreate` is in scope, you MUST use it. Do not fall back to `Agent` calls when a team is available.
+
+### Team composition
+
+Create one team at the start of the invocation, reuse it across iterations.
+
+```
+ship-lane team
+├── lead (this session's main agent)
+├── poll-agent — runs every iteration, returns structured summary only
+├── rebase-agent — spawned only when behindMain or conflicts exist
+├── ci-fix-agent — spawned only when CI failures exist
+├── review-fix-agent — spawned only when new valid comments exist
+└── conflict-resolver — spawned by rebase-agent for >5-file conflicts
+```
+
+Initial team setup should also create:
+- `automate-agent` — invoked once in Phase 0 (only when there is no existing PR)
+- `finalize-agent` — invoked once in Phase 0 (only when there is no existing PR)
+
+### Delegation rules
+
+- The lead NEVER reads raw CI logs or full comment threads. It reads the poll-agent's structured summary (see playbook §1.3).
+- Fix agents get minimum scope: failing test paths + error snippets, or comment bodies + file anchors.
+- Fix agents edit files directly; they do not commit.
+- The lead commits and pushes after verifying `git diff`.
+- Rebase-agent runs alone when active — no concurrent file edits from other agents.
+
+### Fallback (TeamCreate not available)
+
+If `TeamCreate` is genuinely not in scope for this session:
+
+- Use parallel `Agent` tool calls for independent work (poll, ci-fix + review-fix in the same iteration).
+- Use serial `Agent` calls for rebase (must run alone) and Phase 0 setup (automate then finalize).
+- Same delegation rules apply — keep the lead's context clean by summarizing sub-agent output aggressively.
+
+---
+
+## Scheduling wake-ups
+
+Use `ScheduleWakeup` at the end of each iteration (playbook §5.3) with the same command re-invocation as the `prompt`:
+
+```
+ScheduleWakeup({
+  delaySeconds: <270 | 720 | 1800 per playbook>,
+  reason: "shipLane iter <N>: <CI running | waiting on review | just pushed>",
+  prompt: "/shipLane $ARGUMENTS"
+})
+```
+
+Pass `$ARGUMENTS` through so a PR-number argument is preserved across wake-ups.
+
+Do NOT schedule a wake if `status` is `done-clean`, `done-max`, or `blocked` — print the summary and stop.
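
The status rules in this file can be read as a small dispatch on the state file's `status` field. The sketch below is illustrative only: `next_action` and its action labels are hypothetical names, and treating any other value (including a missing state file) as "start at Phase 0" is an assumption, since the full state schema lives in the playbook rather than in this wrapper.

```bash
# Hypothetical dispatch on the state file's `status` value (names are illustrative).
next_action() {
  case "$1" in
    running)
      # Re-invoked by a wake-up mid-loop: skip Phase 0, go straight to Phase 1.
      echo "resume-phase-1" ;;
    done-clean|done-max|blocked)
      # Terminal states: print the summary and schedule no further wake-up.
      echo "summarize-and-stop" ;;
    *)
      # Assumption: no state file or an unrecognized status means a fresh start.
      echo "start-phase-0" ;;
  esac
}

# Example: with jq installed, the status could be read like this
# (STATE_FILE is a placeholder for .ade/shipLane/<sanitized-branch>.json):
#   next_action "$(jq -r '.status // empty' "$STATE_FILE")"
```

Keeping the decision in one function means a wake-up re-invocation and a first invocation go through the same code path, differing only in the status they read.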
+
+---
+
+## Phase 0 safety rails (Claude Code specific)
+
+Before running `automate-agent` and `finalize-agent` in Phase 0:
+
+1. Confirm `$ARGUMENTS` is empty OR matches a PR number on the current branch. If the PR number is for a different branch, `git checkout` to that branch first.
+2. Confirm `git status` is clean of foreign changes you don't expect. If the working tree has staged changes, commit them with `ship: checkpoint before automate/finalize` so the automate/finalize pipeline runs against a known baseline.
+3. Confirm `origin` is a GitHub remote (`git remote get-url origin`) — `gh pr create` needs it.
+
+If any rail fails, exit `blocked` with a clear reason in the state file and stop.
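
The mechanical parts of these rails could be sketched in shell as follows. This is a partial illustration under assumptions, not the playbook's implementation: the function names are hypothetical, rail 1 is only checked for "empty or numeric" (matching the number to a PR on the current branch is left out), the "foreign changes" judgment in rail 2 is not automated, and the GitHub check only recognizes `github.com` URLs, so a GitHub Enterprise host would need its own pattern.

```bash
# Rail 1 (partial): $ARGUMENTS must be empty or look like a PR number.
is_valid_pr_arg() {
  case "$1" in
    "") return 0 ;;
    *[!0-9]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Rail 3: origin must point at GitHub (common HTTPS and SSH URL forms only).
is_github_remote() {
  case "$1" in
    https://github.com/*|git@github.com:*|ssh://git@github.com/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical wrapper tying the rails together; prints the first failure reason.
check_rails() {
  is_valid_pr_arg "$1" || { echo "blocked: argument is not a PR number"; return 1; }
  # Rail 2: `git diff --cached --quiet` exits non-zero when something is staged,
  # in which case the staged changes are committed as a checkpoint.
  git diff --cached --quiet \
    || git commit -m "ship: checkpoint before automate/finalize" \
    || return 1
  is_github_remote "$(git remote get-url origin)" \
    || { echo "blocked: origin is not a GitHub remote"; return 1; }
  echo "rails ok"
}
```

A caller would run `check_rails "$ARGUMENTS"` inside the repository and record the printed reason in the state file when it returns non-zero.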
+
+---
+
+## References
+
+- `docs/playbooks/ship-lane.md` — full phase logic (source of truth).
+- `.claude/commands/automate.md` — invoked by `automate-agent` in Phase 0.
+- `.claude/commands/finalize.md` — invoked by `finalize-agent` in Phase 0.
+- `.github/workflows/ci.yml` — CI job names and shard count (`8`) that the local fallback tests mirror.

AGENTS.md
+4 -0 (4 additions & 0 deletions)
@@ -7,6 +7,10 @@
 - The ADE CLI lives in `apps/ade-cli` and shares core services with the desktop app.
 - State is primarily stored under `.ade/` inside the active project, with runtime metadata in SQLite and machine-local files under `.ade/secrets`, `.ade/cache`, and `.ade/artifacts`.

+## Playbooks
+
+- `docs/playbooks/ship-lane.md` — autonomous PR-to-merge driver (automate → finalize → poll-fix loop). Any agent CLI can follow it directly; Claude Code wraps it as `/shipLane`.
+
 ## Working norms

 - Preserve existing desktop app patterns before introducing new abstractions.