Skip to content

Commit 6e1625c

Browse files
garrytanclaude
andauthored
v1.25.0.0 fix: AskUserQuestion resolves to host MCP variant when native is disallowed (#1287)
* test(harness): plumb extraArgs and auto_decided outcome through PTY runner runPlanSkillObservation now accepts extraArgs that pass through to launchClaudePty (which already supported them at the lower level), and exposes a new 'auto_decided' outcome detected via isAutoDecidedVisible when the AUTO_DECIDE preamble template fires (Auto-decided ... (your preference)). Both pieces are needed for the v1.21+ AskUserQuestion-blocked regression tests in the next commit. Detection order is deliberate: 'asked' (rendered numbered list) wins over 'auto_decided' (text only, no list), which wins over 'plan_ready' so the auto-decide evidence isn't masked by a downstream plan-mode confirmation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(e2e): add AskUserQuestion-blocked regression cases for 6 plan-mode skills Conductor launches Claude Code with --disallowedTools AskUserQuestion --permission-mode default --permission-prompt-tool stdio (verified by inspecting the live conductor claude process via ps -p ... -o args=). Native AskUserQuestion is removed from the model's tool registry; without fallback guidance the plan-mode skills (plan-ceo-review, plan-eng-review, plan-design-review, plan-devex-review, autoplan, office-hours) silently proceed and never surface decisions to the user. Adds 6 gate-tier real-PTY regression cases: - 4 inline test cases inside the existing plan-X-review-plan-mode.test files, each exercising the same skill with extraArgs ['--disallowedTools', 'AskUserQuestion'] and asserting outcome === 'asked'. plan-design-review keeps the ['asked', 'plan_ready'] envelope (legitimate short-circuit on no-UI-scope) but explicitly fails on 'auto_decided'. - 2 standalone test files for autoplan + office-hours (which had no prior plan-mode test). autoplan asserts the FIRST non-auto-decided gate fires (Phase 1 premise confirmation) — autoplan auto-decides intermediate questions BY DESIGN. Touchfile entries: - autoplan-auto-mode + office-hours-auto-mode added to E2E_TOUCHFILES + E2E_TIERS (gate) - existing plan-X-review-plan-mode entries gain question-tuning.ts and generate-ask-user-format.ts touchfile deps so AUTO_DECIDE-related resolver changes correctly invalidate the regression tests - touchfiles.test.ts count updated 18 -> 19 to cover the autoplan touchfile dependency on plan-ceo-review/** Filenames retain `auto-mode` for branch-history continuity. Auto-mode (the AUTO_DECIDE preamble path when QUESTION_TUNING=true) is a related but distinct silencing mechanism; both share the same fix surface in the preamble. These tests are expected to FAIL on this branch until the fix lands. The failure is the receipt for the regression. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(preamble): teach the model to prefer mcp__*__AskUserQuestion when registered When a host launches Claude Code with --disallowedTools AskUserQuestion (Conductor does this by default — verified via ps on the live conductor claude process), the native AskUserQuestion tool is removed from the model's tool registry. Skill templates that say "call AskUserQuestion" silently fail in that environment: the model can't ask, the user never sees the question, the skill auto-proceeds without input. The fix is preamble guidance, not a skill-template change: generate-ask-user-format.ts: new "Tool resolution" section at the top of the AskUserQuestion Format block. Tells the model that "AskUserQuestion" can resolve to two tools at runtime — the host MCP variant (e.g. mcp__conductor__AskUserQuestion, registered when the host injects it) and the native tool — and to PREFER any mcp__*__AskUserQuestion variant. Same questions/options shape; same decision-brief format. If neither variant is callable, fall back to writing a "## Decisions to confirm" section into the plan file plus ExitPlanMode (the native plan-mode confirmation surfaces it). Never silently auto-decide. generate-completion-status.ts: the plan-mode-info block (preamble position 1) now explicitly notes that AskUserQuestion satisfies plan mode's end-of-turn requirement for "any variant" and points at the Tool resolution section for the fallback path. This puts the resolution rule in front of every tier-≥2 skill via the preamble, so plan-mode review skills (plan-ceo-review, plan-eng-review, plan-design-review, plan-devex-review, autoplan, office-hours) all gain the fix without per-template surgery. Includes regenerated SKILL.md files for all 41 skills + the 3 host-ship golden fixtures used by test/host-config.test.ts. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(periodic): AUTO_DECIDE opt-in preserved under Conductor flags Periodic-tier eval that exercises the legitimate /plan-tune AUTO_DECIDE path under the same flags Conductor uses (--disallowedTools AskUserQuestion). Confirms the new Tool resolution preamble doesn't trip opt-in users: when the user has set a never-ask preference for a question, the model should auto-pick (outcome 'auto_decided' or 'plan_ready') rather than surface the prompt. Setup runs in an isolated GSTACK_HOME tmpdir — never touches the user's real ~/.gstack state. Writes question_tuning=true + a never-ask preference for plan-ceo-review-mode (source: 'plan-tune', which bypasses the inline-user origin gate). Spawns claude with --disallowedTools AskUserQuestion in plan mode, runs /plan-ceo-review, asserts outcome is NOT 'asked' (i.e., the model honored the preference). Periodic tier because AUTO_DECIDE behavior depends on the model adhering to the QUESTION_TUNING preamble injection — non-deterministic, weekly cron is the right cadence rather than CI gating. Touchfiles cover the AUTO_DECIDE-bearing resolvers + the question-tuning binaries the test setup invokes. touchfiles.test.ts count updates 19 -> 20 because auto-decide-preserved also depends on plan-ceo-review/**. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * v1.21.0.0: AskUserQuestion resolves to host MCP variant when native is disallowed MINOR scale per scale-aware bumps in CLAUDE.md: substantial coordinated multi-file change (preamble fix + new test infrastructure + 6 gate-tier regression cases + 1 periodic eval) and a user-visible regression fix that affects every plan-mode review skill running under Conductor's default flag set. User originally targeted v1.21.2.0; landing as v1.21.0.0 since this is the first 1.21.x release on main and there's no prior 1.21.0.0/1.21.1.0 to skip past. Adjust at /ship time if a different number is preferred. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(harness): fix detection order + whitespace-tolerant pattern matching Two bugs surfaced when validating the v1.21 fix end-to-end: 1. PlanSkillObservation outcome detection ran 'asked' (any numbered options list) BEFORE 'plan_ready'. Plan-mode's "Ready to execute?" confirmation IS a numbered options list (1=auto, 2=manual, ...), so any skill that successfully reached the native confirmation got misclassified as 'asked'. Reorder: 'auto_decided' (most specific, requires AUTO_DECIDE annotation) > 'plan_ready' (next, requires the "ready to execute" stem) > 'asked' (any remaining numbered list). 2. isPlanReadyVisible and isAutoDecidedVisible regexes only matched spaced forms ("ready to execute", "(your preference)"). stripAnsi removes cursor-positioning escapes (`\x1b[40C`) entirely instead of replacing them with spaces, so the same text can render as "readytoexecute" or "(yourpreference)". Both detectors now test the spaced form first, fall through to a whitespace-collapsed comparison. Inline unit smoke confirms both forms match. Updates to the 5 strict 'asked' regression test cases (plan-ceo, plan-eng, plan-devex, autoplan, office-hours): with the detection order corrected, the model's plan-file fallback flow legitimately lands at 'plan_ready' instead of 'asked'. Pass envelope expanded to ['asked', 'plan_ready'] (matching plan-design-review's existing pattern). Failure signals tightened to include 'auto_decided' (catches AUTO_DECIDE without opt-in) plus the standard silent_write/exited/timeout. plan-design was already on this contract from v1.21's first commit, no change needed. The expanded envelope is correct: under --disallowedTools AskUserQuestion the Tool resolution preamble routes the question through plan-mode's native "Ready to execute?" surface — the user still sees the decision, just via the plan-file flow rather than a numbered prompt. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(harness): require ## Decisions section under --disallowedTools plan_ready Adversarial review (during /ship Step 11) found that the previous gate-test envelope ['asked', 'plan_ready'] for the AskUserQuestion-blocked regression cases accepted the bug they exist to catch: a model that silently skips Step 0 entirely (writes a plan with no questions, no `## Decisions to confirm` section, just ExitPlanModes) reaches plan_ready and passes. The fix tightens the contract in two layers: 1. Harness: PlanSkillObservation gains a `planFile?: string` field populated when outcome is plan_ready. extractPlanFilePath() walks the visible TTY buffer for "Plan saved to:", "Plan file:", or ".claude/plans/<name>.md" patterns and resolves tilde to absolute. planFileHasDecisionsSection() reads the resolved file and returns true if it contains a `## Decisions` heading (any form: "to confirm", "needed", etc.). 2. Tests: 5 of 6 regression cases now require, when outcome is plan_ready, that obs.planFile is set AND planFileHasDecisionsSection returns true. Otherwise the test fails with a "Step 0 was silently skipped" diagnosis. plan-design-review remains the sole exception — it legitimately short-circuits to plan_ready on no-UI-scope branches and we have no deterministic way to distinguish that from a silent skip. This closes the loophole the adversarial review identified. The fix preamble flow already tells the model to write `## Decisions to confirm` when neither AUQ variant is callable — now the test verifies the model actually did it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(harness): anchor extractPlanFilePath path captures on /Users|~|/home|/var|/tmp Adversarial-tightened gate sweep surfaced a real bug in the path extraction: stripAnsi collapses whitespace via cursor-positioning escape removal, so "yet at /Users/..." in the visible buffer becomes "yetat/Users/..." with no space between. The previous fallback pattern `(~?\/?\S*\.claude\/plans\/[\w-]+\.md)` greedily matched non-whitespace characters BEFORE the path, producing `yetat/Users/garrytan/.claude/...` which then fails fs.readFileSync. Fix: every regex now requires the path to START at a known path-anchor: `~/`, `/Users/`, `/home/`, `/var/`, `/tmp/`, or `./`. Earlier non-whitespace runs can't be glommed in. Verified against the failing fixture (`yetat/Users/...`) plus the four canonical render forms ("Plan saved to:", "Plan file:", `·`-decorated ctrl-g hint, and the bare fallback). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 0570ef9 commit 6e1625c

60 files changed

Lines changed: 1119 additions & 73 deletions

File tree

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

AGENTS.md

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -35,6 +35,8 @@ Invoke them by name (e.g., `/office-hours`).
3535
| `/devex-review` | Live developer experience audit (TTHW measured against the real flow). |
3636
| `/qa` | Open a real browser, find bugs, fix them, re-verify. |
3737
| `/qa-only` | Same methodology as /qa but report only — no code changes. |
38+
| `/scrape` | Pull data from a web page. First call prototypes; codified call runs in ~200ms. |
39+
| `/skillify` | Codify the most recent successful `/scrape` flow into a permanent browser-skill. |
3840

3941
### Release + deploy
4042

CHANGELOG.md

Lines changed: 72 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,77 @@
11
# Changelog
22

3+
## [1.25.0.0] - 2026-05-01
4+
5+
## **Plan-mode skills surface every decision again, even when the host disallows AskUserQuestion.**
6+
7+
Conductor launches Claude Code with `--disallowedTools AskUserQuestion --permission-mode default --permission-prompt-tool stdio` (verified by inspecting the live conductor claude process via `ps`). The native AskUserQuestion tool is removed from the model's tool registry, so when a plan-mode skill instructs the model to "call AskUserQuestion," the call silently fails: the model can't ask, the user never sees the question, and the skill auto-proceeds without input. The whole interactive premise of `/plan-ceo-review`, `/plan-eng-review`, `/plan-design-review`, `/plan-devex-review`, `/autoplan`, and `/office-hours` was broken in any Conductor session.
8+
9+
The fix is preamble guidance, not skill-template surgery. A new `Tool resolution` section in `scripts/resolvers/preamble/generate-ask-user-format.ts` tells the model to check its tool list and prefer any `mcp__*__AskUserQuestion` variant (e.g. `mcp__conductor__AskUserQuestion`) over the native tool. Hosts that disable native AskUserQuestion register their own MCP variant; the variant takes the same questions/options shape and the host renders the prompt through its own UI surface. If neither variant is callable, the model falls back to writing a `## Decisions to confirm` section into the plan file and calling ExitPlanMode — plan-mode's native "Ready to execute?" confirmation surfaces the decisions through TTY UI. **Never silently auto-decide.**
10+
11+
Six gate-tier real-PTY regression tests reproduce the exact Conductor flag set (`extraArgs: ['--disallowedTools', 'AskUserQuestion']`) for every plan-mode skill, plus a periodic-tier eval that protects the legitimate `/plan-tune` AUTO_DECIDE opt-in path from being broken by the fix. The harness gains a new `'auto_decided'` outcome and whitespace-tolerant detectors that survive TTY cursor-positioning escape sequences (which `stripAnsi` removes without leaving spaces, collapsing "ready to execute" to "readytoexecute").
12+
13+
### What you can now do
14+
15+
- **Use plan-mode review skills in Conductor.** Open a Conductor workspace, run `/plan-ceo-review` against a plan, and the scope-mode question actually appears for you to answer. Same for `/plan-eng-review`, `/plan-design-review`, `/plan-devex-review`, `/autoplan`'s premise gate, and `/office-hours`.
16+
- **Stay in control under `--disallowedTools` without writing template overrides.** The Tool resolution section sits at preamble position 1 in every tier-≥2 skill; new hosts that disable native AUQ via the same pattern get the fix transparently as long as they register an MCP variant.
17+
- **Opt-in to AUTO_DECIDE without losing the regression guard.** `/plan-tune` users who set `never-ask` for specific questions keep auto-pick under Conductor flags; the periodic-tier `auto-decide-preserved` eval protects this path.
18+
19+
### The numbers that matter
20+
21+
Source: `ps -p <conductor-claude-pid> -o args=` for the regression mechanism (verified primary source). 6 new gate-tier regression cases + 1 periodic-tier AUTO_DECIDE eval; coverage in `test/skill-e2e-plan-{ceo,eng,design,devex}-plan-mode.test.ts` (parameterized inline) + `test/skill-e2e-{autoplan,office-hours}-auto-mode.test.ts` (standalone) + `test/skill-e2e-auto-decide-preserved.test.ts` (periodic).
22+
23+
| Surface | Shape |
24+
|---|---|
25+
| Skills that regain interactivity in Conductor | 6 (`/plan-ceo-review`, `/plan-eng-review`, `/plan-design-review`, `/plan-devex-review`, `/autoplan`, `/office-hours`) |
26+
| New gate-tier regression test cases | 6 (one per skill; `--disallowedTools AskUserQuestion` parameterized) |
27+
| New periodic-tier eval | 1 (`auto-decide-preserved`, protects `/plan-tune` opt-in path) |
28+
| New `ClassifyResult` outcome | `auto_decided` — TTY shows "Auto-decided … (your preference)" |
29+
| New `runPlanSkillObservation` parameter | `extraArgs?: string[]` — plumbs raw flags to spawned `claude` |
30+
| Preamble resolvers touched | 2 (`generate-ask-user-format.ts`, `generate-completion-status.ts`) |
31+
| SKILL.md files regenerated | 41 |
32+
| `classifyVisible` branch order | `silent_write``auto_decided``plan_ready``asked` (each more specific than the next) |
33+
| Whitespace-tolerant detectors | `isPlanReadyVisible`, `isAutoDecidedVisible` (defeats stripAnsi cursor-positioning collapse) |
34+
| Verified by | `ps -p <conductor-claude-pid> -o args=` showing `--disallowedTools AskUserQuestion --permission-mode default` |
35+
36+
### What this means for builders
37+
38+
If you ran `/plan-ceo-review` or any plan-mode review skill in Conductor before this release, the skill silently produced a plan you didn't shape — the scope-mode question, expansion proposals, and per-section STOPs never reached you. After upgrading, the skill stops for every gate the template defines. The fix is in the preamble, so you don't update skill templates yourself — just upgrade gstack and the next plan review you run honors your input.
39+
40+
If you opted into auto-deciding specific questions via `/plan-tune`, the periodic eval guards that path. The fix is "prefer MCP variant when registered," not "force every question to surface" — your `never-ask` preferences still auto-pick, the AUTO_DECIDE annotation still renders, nothing changes for opt-in users.
41+
42+
The gstack-side regression test surface now mirrors what real users hit. Each plan-mode test file gained a second `test()` block that sets `extraArgs: ['--disallowedTools', 'AskUserQuestion']` and asserts the AskUserQuestion still surfaces. Builds on v1.21.1.0's `classifyVisible()` extraction — the new auto-decided branch slots in cleanly between silent_write and plan_ready.
43+
44+
### Itemized changes
45+
46+
#### Added — Tool resolution preamble
47+
48+
- `scripts/resolvers/preamble/generate-ask-user-format.ts` gets a new `### Tool resolution (read first)` section at the top of the AskUserQuestion Format block. Tells the model: AskUserQuestion can resolve to two tools at runtime (host MCP variant or native); prefer any `mcp__*__AskUserQuestion` variant in the tool list over native; hosts may disable native via `--disallowedTools AskUserQuestion` (Conductor does this by default); same questions/options shape and decision-brief format applies to the MCP variant. Includes a fallback path when neither variant is callable: write the decision into the plan file as `## Decisions to confirm` + ExitPlanMode.
49+
- `scripts/resolvers/preamble/generate-completion-status.ts` (the plan-mode-info block at preamble position 1) updated to point at the Tool resolution section: AskUserQuestion satisfies plan mode's end-of-turn requirement for "any variant," with the plan-file fallback for the no-variant case.
50+
51+
#### Added — regression tests
52+
53+
- 4 inline `test()` blocks added to `test/skill-e2e-plan-{ceo,eng,design,devex}-plan-mode.test.ts`. Each spawns claude with `extraArgs: ['--disallowedTools', 'AskUserQuestion']` and asserts the skill still surfaces the question — pass envelope `['asked', 'plan_ready']` (the latter covers the plan-file fallback flow), failure signals are `'auto_decided'` (caught explicitly) plus the standard silent_write/exited/timeout.
54+
- `test/skill-e2e-autoplan-auto-mode.test.ts` (new). Asserts autoplan's first non-auto-decided gate (Phase 1 premise confirmation) still surfaces. Autoplan auto-decides intermediate questions BY DESIGN, so the test scopes to gates the user MUST see.
55+
- `test/skill-e2e-office-hours-auto-mode.test.ts` (new). Asserts office-hours' startup-vs-builder mode AskUserQuestion still surfaces.
56+
- `test/skill-e2e-auto-decide-preserved.test.ts` (new, periodic-tier). Sets up an isolated `GSTACK_HOME` tmpdir, writes `question_tuning=true` + a `never-ask` preference for `plan-ceo-review-mode` (source `'plan-tune'`), runs `/plan-ceo-review` under `--disallowedTools AskUserQuestion`, asserts outcome is NOT `'asked'` (the model honored the opt-in).
57+
58+
#### Changed — PTY harness
59+
60+
- `test/helpers/claude-pty-runner.ts`: `runPlanSkillObservation` accepts new optional `extraArgs?: string[]` (plumbs straight through to `launchClaudePty`, which already supported the field). `ClassifyResult` gains `'auto_decided'` outcome plus `isAutoDecidedVisible(visible)` detector that matches the AUTO_DECIDE preamble template (`Auto-decided … (your preference)`). `classifyVisible` branch order extended to `silent_write → auto_decided → plan_ready → asked` so an upstream auto-decide isn't masked by a downstream plan-mode confirmation.
61+
- Whitespace-tolerant detection: `isPlanReadyVisible` and `isAutoDecidedVisible` now test both spaced and whitespace-collapsed forms of their target phrases. `stripAnsi` removes cursor-positioning escapes (`\x1b[40C`) without replacing them with spaces, so "ready to execute" can come through as "readytoexecute" — the spaced regex would miss it.
62+
63+
#### Changed — touchfiles
64+
65+
- `test/helpers/touchfiles.ts`: existing `plan-X-review-plan-mode` entries gain `scripts/resolvers/question-tuning.ts` and `scripts/resolvers/preamble/generate-ask-user-format.ts` as touchfile dependencies, so AUTO_DECIDE-bearing resolver changes correctly invalidate the regression cases.
66+
- New entries: `autoplan-auto-mode` (gate), `office-hours-auto-mode` (gate), `auto-decide-preserved` (periodic).
67+
- `test/touchfiles.test.ts`: count of tests selected by `plan-ceo-review/SKILL.md` updates from 19 to 21 to cover the new entries that depend on `plan-ceo-review/**`.
68+
69+
#### For contributors
70+
71+
- The PTY harness's `auto_decided` outcome is a defense-in-depth signal: it fires on the AUTO_DECIDE preamble template wording, which is non-deterministic. Treat it as evidence of a regression, not a hard contract.
72+
- The Tool resolution section is the surgical fix site for any future host that disables native AUQ similarly. The pattern: register a `mcp__<host>__AskUserQuestion` MCP tool; the gstack preamble already tells the model to prefer it. No skill-template changes needed per-host.
73+
- `auto-decide-preserved` runs in an isolated `GSTACK_HOME` tmpdir to avoid mutating the developer's real `~/.gstack` state. When debugging, set `GSTACK_HOME` manually to a scratch dir and run the same setup the test does (`gstack-config set question_tuning true`, then `gstack-question-preference --write`).
74+
375
## [1.24.0.0] - 2026-04-30
476

577
## **Cross-platform hardening. Mac + Linux full, curated Windows lane added.**
@@ -168,7 +240,6 @@ For `/plan-ceo-review` specifically: any future preamble slim-down or template e
168240
- The runner change is additive and the existing sibling smokes (`plan-eng`, `plan-design`, `plan-devex`, `plan-mode-no-op`) keep their loose `['asked', 'plan_ready']` assertion. Their behavior is unchanged.
169241
- Post-merge follow-ups captured in `TODOS.md`: per-finding AskUserQuestion count assertion (V2), env-driven gstack-config overrides (so `QUESTION_TUNING=false` actually isolates the test), path-confusion hardening on `SANCTIONED_WRITE_SUBSTRINGS`.
170242

171-
172243
## [1.20.0.0] - 2026-04-28
173244

174245
## **Browser-skills land. `/scrape <intent>` first call drives the page; second call runs the codified script in 200ms.**

SKILL.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -107,7 +107,7 @@ In plan mode, allowed because they inform the plan: `$B`, `$D`, `codex exec`/`co
107107

108108
## Skill Invocation During Plan Mode
109109

110-
If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion satisfies plan mode's end-of-turn requirement. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode.
110+
If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, fall back to writing the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode — never silently auto-decide. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode.
111111

112112
If `PROACTIVE` is `"false"`, do not auto-invoke or proactively suggest skills. If a skill seems useful, ask: "I think /skillname might help here — want me to run it?"
113113

VERSION

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1 +1 @@
1-
1.24.0.0
1+
1.25.0.0

autoplan/SKILL.md

Lines changed: 11 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -116,7 +116,7 @@ In plan mode, allowed because they inform the plan: `$B`, `$D`, `codex exec`/`co
116116

117117
## Skill Invocation During Plan Mode
118118

119-
If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion satisfies plan mode's end-of-turn requirement. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode.
119+
If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, fall back to writing the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode — never silently auto-decide. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode.
120120

121121
If `PROACTIVE` is `"false"`, do not auto-invoke or proactively suggest skills. If a skill seems useful, ask: "I think /skillname might help here — want me to run it?"
122122

@@ -281,6 +281,16 @@ AI orchestrator (e.g., OpenClaw). In spawned sessions:
281281

282282
## AskUserQuestion Format
283283

284+
### Tool resolution (read first)
285+
286+
"AskUserQuestion" can resolve to two tools at runtime: the **host MCP variant** (e.g. `mcp__conductor__AskUserQuestion` — appears in your tool list when the host registers it) or the **native** Claude Code tool.
287+
288+
**Rule:** if any `mcp__*__AskUserQuestion` variant is in your tool list, prefer it. Hosts may disable native AUQ via `--disallowedTools AskUserQuestion` (Conductor does, by default) and route through their MCP variant; calling native there silently fails. Same questions/options shape; same decision-brief format applies.
289+
290+
**Fallback when neither variant is callable:** in plan mode, write the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode (the native "Ready to execute?" surfaces it). Outside plan mode, output the brief as prose and stop. **Never silently auto-decide** — only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking.
291+
292+
### Format
293+
284294
Every AskUserQuestion is a decision brief and must be sent as tool_use, not prose.
285295

286296
```

benchmark-models/SKILL.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -109,7 +109,7 @@ In plan mode, allowed because they inform the plan: `$B`, `$D`, `codex exec`/`co
109109

110110
## Skill Invocation During Plan Mode
111111

112-
If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion satisfies plan mode's end-of-turn requirement. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode.
112+
If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, fall back to writing the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode — never silently auto-decide. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode.
113113

114114
If `PROACTIVE` is `"false"`, do not auto-invoke or proactively suggest skills. If a skill seems useful, ask: "I think /skillname might help here — want me to run it?"
115115

benchmark/SKILL.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -109,7 +109,7 @@ In plan mode, allowed because they inform the plan: `$B`, `$D`, `codex exec`/`co
109109

110110
## Skill Invocation During Plan Mode
111111

112-
If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion satisfies plan mode's end-of-turn requirement. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode.
112+
If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, fall back to writing the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode — never silently auto-decide. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode.
113113

114114
If `PROACTIVE` is `"false"`, do not auto-invoke or proactively suggest skills. If a skill seems useful, ask: "I think /skillname might help here — want me to run it?"
115115

0 commit comments

Comments
 (0)