Renumber auto-pipeline integration gate as Step 3 (not 2.5)
Bumps review-pipeline to Step 4. Diagram, intro paragraph, cross-step
references, and Common Mistakes table updated accordingly.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
`.claude/skills/auto-pipeline/SKILL.md`: 14 additions & 14 deletions
@@ -7,7 +7,7 @@ description: Use when you want to take a Backlog issue all the way to Final revi
Take **one** Backlog issue all the way from quality gate to **Final review** without human intervention. The merge step itself is still left to the human (see `/final-review`).
-This skill is an **orchestrator**: it never runs the heavy work itself. Each phase is delegated to a fresh-context subagent. Most phases invoke an existing skill (`check-issue`, `fix-issue`, `run-pipeline`, `review-pipeline`); Phase 2.5 is owned by the orchestrator and runs raw `cargo test --workspace` + `make paper` to catch breakage the per-item sub-skills cannot see. The only thing the main agent does directly is:
+This skill is an **orchestrator**: it never runs the heavy work itself. Each phase is delegated to a fresh-context subagent. Most phases invoke an existing skill (`check-issue`, `fix-issue`, `run-pipeline`, `review-pipeline`); Phase 3 is owned by the orchestrator and runs raw `cargo test --workspace` + `make paper` to catch breakage the per-item sub-skills cannot see. The only thing the main agent does directly is:
1. pick the issue,
2. read structured reports from subagents,
@@ -68,8 +68,8 @@ digraph auto_pipeline {
"Move to OnHold + comment" [shape=box, style=filled, fillcolor="#ffcccc"];
"Phase 4: review-pipeline (subagent)" -> "Final report";
}
```
@@ -331,7 +331,7 @@ Return ONLY this JSON shape:
When the subagent returns:
-- **`outcome == "success"`** → continue to Step 2.5.
+- **`outcome == "success"`** → continue to Step 3.
- **`outcome == "failure"`** → STOP. The `run-pipeline` skill already moves the card to OnHold and posts a diagnostic comment, so we do not duplicate. Print:
```
@@ -344,9 +344,9 @@ When the subagent returns:
Do NOT call codex to rescue here — implementation failures are CI/code-shape problems that need human eyes.
-The per-item sub-skills only test the new item in isolation, so cross-crate regressions (e.g. a relaxed model validator breaking pre-existing CLI tests) and paper-compile errors (orphan bib keys, math-mode typos like `intersect` vs Typst's `inter`) slip through Phase 2 and Phase 3. CI catches both, but in batch mode (many issues on one branch) breakage accumulates silently. Running this gate after every Phase 2 success closes the loop.
+The per-item sub-skills only test the new item in isolation, so cross-crate regressions (e.g. a relaxed model validator breaking pre-existing CLI tests) and paper-compile errors (orphan bib keys, math-mode typos like `intersect` vs Typst's `inter`) slip through Phase 2 and the per-item structural review. CI catches both, but in batch mode (many issues on one branch) breakage accumulates silently. Running this gate after every Phase 2 success closes the loop.
Dispatch a fresh subagent (`subagent_type=general-purpose`, not invoking any existing skill):
@@ -359,10 +359,10 @@ Do not modify files. Return ONLY:
"first_failure": "<first failing test or typst error, or empty>"}
```
-- Both `pass` → continue to Step 3.
-- Either `fail` → hand the `first_failure` to `codex:codex-rescue` for a fix-it pass (CI-class problems are usually small: deleting a stale test, fixing a typo'd bib key, swapping `intersect` for `inter`). After codex returns, re-run Step 2.5 once. If still failing, park on OnHold.
+- Both `pass` → continue to Step 4.
+- Either `fail` → hand the `first_failure` to `codex:codex-rescue` for a fix-it pass (CI-class problems are usually small: deleting a stale test, fixing a typo'd bib key, swapping `intersect` for `inter`). After codex returns, re-run Step 3 once. If still failing, park on OnHold.
Dispatch the existing `review-pipeline` skill against the PR:
@@ -404,4 +404,4 @@ Auto-pipeline complete:
| Letting the codex subagent edit GitHub | The orchestrator owns all `gh issue edit` calls — codex only returns text |
| Treating implementation failures as substantive issue problems | Step 2 failures go straight to a stop; they are not eligible for codex rescue |
| Picking from a non-Backlog column when no issue number is given | Auto-pick must read from Backlog only — never from OnHold, Ready, or elsewhere |
-| Skipping Step 2.5 because Phase 2 reported `success`| Phase 2 success is scoped to the new item's own tests; workspace-wide regressions and paper-compile bugs are only visible from `make check` + `make paper`. |
+| Skipping Step 3 because Phase 2 reported `success`| Phase 2 success is scoped to the new item's own tests; workspace-wide regressions and paper-compile bugs are only visible from `make check` + `make paper`. |
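For reviewers checking the renumbered cross-references, the Step 3 control flow described in the diff (both checks pass, continue to Step 4; either fails, one codex rescue pass and one re-run, then OnHold) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not code from the skill. Only `first_failure` and the `pass`/`fail` outcomes appear in the diff. The report keys `cargo_test` and `make_paper`, the function names, and the return labels are hypothetical.

```python
from typing import Callable, Dict

# Hypothetical report shape: the diff only shows the "first_failure" field;
# the "cargo_test" and "make_paper" keys are assumed names for the two checks.
Report = Dict[str, str]


def all_pass(report: Report) -> bool:
    """True only when both (assumed) check fields report "pass"."""
    return report.get("cargo_test") == "pass" and report.get("make_paper") == "pass"


def integration_gate(
    run_gate: Callable[[], Report],
    rescue: Callable[[str], None],
) -> str:
    """Sketch of Step 3: run the gate, allow one rescue pass, re-run once.

    Returns "step4" to continue to review-pipeline, or "onhold" to park the issue.
    `run_gate` stands in for the fresh subagent that runs the workspace tests
    and the paper build; `rescue` stands in for the codex fix-it pass.
    """
    report = run_gate()
    if all_pass(report):
        return "step4"

    # Either check failed: hand the first failure to the rescue pass.
    rescue(report.get("first_failure", ""))

    # Re-run the gate exactly once; a second failure parks the issue.
    report = run_gate()
    return "step4" if all_pass(report) else "onhold"
```

The single re-run is deliberate in this sketch: it mirrors the "re-run Step 3 once" wording, so a persistent failure ends on OnHold rather than looping.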