Skip to content

v1.39.1.0 feat: EXIT PLAN MODE GATE for plan-mode review skills#1512

Merged
garrytan merged 6 commits into
mainfrom
garrytan/yokohama-v1
May 15, 2026
Merged

v1.39.1.0 feat: EXIT PLAN MODE GATE for plan-mode review skills#1512
garrytan merged 6 commits into
mainfrom
garrytan/yokohama-v1

Conversation

@garrytan
Copy link
Copy Markdown
Owner

Summary

Adds a terminal blocking gate that fires before every ExitPlanMode call in plan-mode review skills, reframing the call as a precondition-bearing contract (verify ## GSTACK REVIEW REPORT is the file's terminal heading) rather than a soft permission to defer.

Gate placement:

  • Terminal ## heading at EOF of all four /plan-*-review SKILL.md files
  • Inside Step 2A of /codex SKILL.md (review mode only — Step 2B/2C are not plan-touching)

Preamble retoned: the old ## Plan Status Footer ("PLAN MODE EXCEPTION — always allowed") was soft permission language injected into every gstack skill via {{PREAMBLE}}. It now reads as a neutral forward reference, so operational skills (/ship, /qa, /review, /health) don't inherit review-report rules they can't satisfy.

Test Coverage

New EXIT PLAN MODE GATE placement describe block in test/gen-skill-docs.test.ts:3097 (2 tests, 460 tests total in the file). Asserts:

  1. Gate is the terminal ## heading in all 4 plan-* SKILL.md files (with fenced-code-block stripping — PLAN_FILE_REVIEW_REPORT's embedded ## GSTACK REVIEW REPORT example would false-match a naive regex).
  2. Codex SKILL.md contains the gate body via toContain (mid-file per design).

test/host-config.test.ts golden baselines updated for claude-ship, codex-ship, factory-ship SKILL.md to capture the new preamble wording.

bun test test/gen-skill-docs.test.ts test/host-config.test.ts → 460 pass, 0 fail.

Pre-Landing Review

Three review decisions resolved interactively via /plan-eng-review:

  • D1=A: 4 plan-* reviews + codex (autoplan, office-hours deferred — different artifact shapes)
  • D2=B → D4=A: tone preamble down to neutral forward reference (codex cross-model finding — strengthening it would bleed into operational skills)
  • D3=A: add automated test in test/gen-skill-docs.test.ts

Outside voice (codex): 11 findings raised; 6 factual fixes folded into plan before push (line numbers were not terminal, regex would false-match inside code fences, REVIEW_SKILLS constant was missing plan-devex, gate check 4 conflicted with PLAN_FILE_REVIEW_REPORT's skip-silently clause, sequencing ordering, hardcoded paths flagged but deferred). 2 scope tensions surfaced as D4+D5 for user decision. 3 lower-priority items moved to TODOs/NOT-in-scope.

Plan Completion

Plan file at ~/.claude/plans/system-instruction-you-are-working-unified-coral.md. All implementation items DONE:

  • generateExitPlanModeGate resolver — scripts/resolvers/review.ts:161
  • EXIT_PLAN_MODE_GATE registry entry — scripts/resolvers/index.ts:42
  • Gate appended at EOF of 4 plan-* templates
  • Gate inserted into codex Step 2A
  • Preamble retoned to neutral forward reference
  • Test added with fence-stripping + fresh skill list
  • Golden baselines updated for ship/SKILL.md across claude/codex/factory hosts
  • All 8 hosts regenerated

UNVERIFIABLE: none. NOT DONE: none.

Test plan

  • bun test test/gen-skill-docs.test.ts — 389 cases pass
  • bun test test/host-config.test.ts — 73 cases pass (golden baselines green)
  • bun test full suite — exit 0 (verbose eval output suppresses summary line but exit code confirms green)
  • Manual: gate is the terminal ## heading in plan-eng/ceo/design/devex SKILL.md
  • Manual: codex/SKILL.md contains gate mid-file (Step 2A)

🤖 Generated with Claude Code

garrytan and others added 2 commits May 14, 2026 20:50
Add a terminal BLOCKING checklist that verifies the plan file ends with
`## GSTACK REVIEW REPORT` before ExitPlanMode is called. Lives at EOF of all
four plan-* review skills (eng/ceo/design/devex) and inside codex Step 2A.
Tones down the preamble's "Plan Status Footer" to a neutral forward reference
so review-report rules don't bleed into operational skills (/ship /qa /review).

Single source of truth: `generateExitPlanModeGate` in scripts/resolvers/review.ts,
registered as EXIT_PLAN_MODE_GATE in scripts/resolvers/index.ts. New test in
test/gen-skill-docs.test.ts strips fenced code blocks before matching `## `
headings and asserts the gate is the terminal heading in all four plan-* review
SKILL.md files. Codex's SKILL.md uses toContain (mid-file by design — Step 2B/2C
are not plan-touching modes).

Decisions locked via /plan-eng-review + /codex outside-voice:
- D1=A: 4 plan-* reviews + codex (autoplan, office-hours deferred)
- D2=B → D4=A: tone preamble down to neutral forward reference
- D3=A: add automated test in test/gen-skill-docs.test.ts
- D5=B: keep codex gate inside Step 2A (mid-file acceptable per gate self-gating)

Codex pre-merge findings folded in: line numbers obsolete (use EOF), test regex
must strip fences, fresh skill list (not stale REVIEW_SKILLS constant), gate
check 4 short-circuits when no plan file in context.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 15, 2026

E2E Evals: ✅ PASS

58/58 tests passed | $7.37 total cost | 12 parallel runners

Suite Result Status Cost
e2e-browse 4/4 $0.19
e2e-deploy 6/6 $1.26
e2e-design 4/4 $0.56
e2e-plan 8/8 $1.86
e2e-qa-workflow 3/3 $1.09
e2e-review 6/6 $1.49
e2e-workflow 3/3 $0.44
llm-judge 24/24 $0.48

12x ubicloud-standard-8 (Docker: pre-baked toolchain + deps) | wall clock ≈ slowest suite

garrytan and others added 4 commits May 14, 2026 21:25
# Conflicts:
#	CHANGELOG.md
#	VERSION
#	package.json
# Conflicts:
#	CHANGELOG.md
#	VERSION
#	package.json
The three `{ git rev-parse HEAD 2>/dev/null || true; } > path/.version`
brace groups in the build script regressed when v1.38.0.0 merged into this
branch (resolved with --ours during conflict). Bun on Windows can't parse
brace groups in this position; the v1.38.0.0 invariant requires `(...)`
subshells. Windows CI test `package.json build scripts — POSIX shell compat`
caught it.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
# Conflicts:
#	CHANGELOG.md
#	VERSION
#	package.json
@garrytan garrytan merged commit f589770 into main May 15, 2026
23 checks passed
anbangr added a commit to anbangr/gstack that referenced this pull request May 16, 2026
….0 (release-daemon)

Both my fork and upstream chose v1.39.1.0 in parallel:
  - Fork (this PR #37 at 548baca): release-daemon PATH fix + doctor subcommand + migration
  - Upstream (PR garrytan#1512 at f589770): EXIT PLAN MODE GATE blocking checklist

Per user choice (B), kept my v1.39.1.0 entry as the fork's headline release
and folded upstream's release notes in as a sub-section "Upstream sync:
ExitPlanMode gate" inside the same v1.39.1.0 entry. The runtime at
v1.39.1.0 now contains both feature sets. A disclosure note at the top of
the entry calls out the parallel-version situation explicitly. Future fork
ships will bump past v1.39.1.0 cleanly.

Conflicts resolved:
  - CHANGELOG.md: kept my v1.39.1.0 heading + body; added upstream content
    as an "### Upstream sync" sub-section within the same entry.
  - scripts/resolvers/index.ts: kept fork-local `generateBuildCliCandidates`,
    added upstream's `generateExitPlanModeGate` import + resolver entry.
  - test/gen-skill-docs.test.ts: took upstream's preamble assertion rewrite
    (new behavior: no `NO REVIEWS YET` in operational skills, `EXIT PLAN
    MODE GATE` reference instead) — verified the regenerated office-hours
    SKILL.md matches the new test.

Side effects of upstream's preamble retoning: 53 generated SKILL.md files
regenerated by `bun run gen:skill-docs --host all`. Diff is mechanical:
"Plan Status Footer" section in every skill's preamble now reads as a
neutral forward reference to EXIT PLAN MODE GATE instead of imposing
review-report rules on operational skills.

Test status:
  - `bun test`: 1 failure, identical to the pre-existing P0 TODO from PR
    #37 (Step 4.8 fork overlay test). 8 other pre-existing failures
    apparently fixed by upstream changes (down from 9 to 1).
  - cli.test.ts (release-daemon work): all 202 tests pass.
  - migration-v1.39.1.0.test.ts: all 8 scenarios pass (6 darwin + 2 linux
    skipped on darwin host).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant