diff --git a/CHANGELOG.md b/CHANGELOG.md index d0051169e4..7dbc82f998 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,5 +1,59 @@ # Changelog +## [1.48.0.0] - 2026-05-26 + +## **Agents stop dropping AskUserQuestion options when there are 5+.** A new canonical preamble rule + runtime gate makes Conductor's 4-option cap a split-or-batch decision, not a silent trim. + +The failure mode looked like this in real transcripts: + +> "I'm hitting Conductor's limit of 4 options in the AUQ, so I need to cut one. E4 is the biggest lift and probably beyond scope for v0.42 anyway. Trimming: E4. Moving to TODOs without asking. Re-firing with 4." + +The agent unilaterally cut an option from the user's decision set. This release names that as a bug and gives the agent two compliant shapes for any 5+ option scenario: batch into ≤4-groups when the options are coherent alternatives (version bumps, layout variants), or split into N sequential per-option calls when the options are independent scope items (which platforms to ship, which TODOs to land). Default-to-split when unsure. The inline preamble subsection is intentionally compressed; full reference with worked examples, Hold/dependency semantics, and final-summary validation lives in `docs/askuserquestion-split.md` — loaded on demand when N>4. + +A two-layer defense protects the option set from being silently auto-approved: per-option `question_id`s of shape `-split-` are unique per option (preferences can't leak across the chain), and the runtime checker `bin/gstack-question-preference --check` refuses `never-ask` on any `*-split-*` id with an explanatory note. The user's option set is sacred — the whole point of splitting is restoring user sovereignty over the decision space. + +### The numbers that matter + +Source: this branch's 4 commits + the post-merge regen against `main` at v1.47.0.0. Net diff: 57 files, ~2800 lines (most of it mechanical SKILL.md regen across all 41 tier-2+ skills, ~34 lines added per skill from the inline subsection injection). + +| Capability | Before this PR | After this PR | +|---|---|---| +| 5+ options for ONE decision | drop one to fit cap, hope user doesn't notice | split into N per-option calls OR batch into ≤4-groups, name the rule in the preamble | +| Per-option call shape | n/a (couldn't reliably produce one) | `D.k` header, Include / Defer / Cut / Hold buckets, kind-note (no completeness score), recommendation per option | +| Hold semantics | undefined (chain might queue, might stop, agent-dependent) | stop chain immediately, resume on user "continue" | +| Final summary | n/a | `D.final` validates dependencies, reprompts conflicts, confirms assembled scope | +| `D.0` meta-question (N>6) | n/a | tool-call meta-question first: proceed / narrow / batch | +| AUTO_DECIDE on split per-option calls | possible if user tuned the pattern via /plan-tune | runtime checker forces `ASK_NORMALLY` on any `*-split-*` id, with explanatory note | +| Regression coverage | n/a | 6 inline-contract tests + 7 runtime-gate tests + 1 periodic-tier E2E behavior test (5-option scope fixture) | +| Inline preamble bytes per SKILL.md | n/a | ~1.6 KB (vs ~4 KB if the full rule were inlined; deeper reference loaded on demand) | + +### What this means for builders + +Next time you give an agent a scope question with 5+ candidates, you'll see one of three shapes: a single batched AskUserQuestion with ≤4 buckets (if the options are coherent alternatives), N sequential per-option calls each with Include/Defer/Cut/Hold (if they're independent), or a `D.0` meta-question first asking whether to proceed/narrow/batch (if N>6). You'll never see a silent trim. If you've ever wired up a /plan-tune `never-ask` preference, it now refuses to apply to split chains — the runtime check forces ASK_NORMALLY with a note explaining why. + +### Itemized changes + +#### Added +- New canonical preamble subsection in `scripts/resolvers/preamble/generate-ask-user-format.ts`: "Handling 5+ options — split, never drop". Inline-compressed (~1.6 KB per tier-2+ skill); full reference at `docs/askuserquestion-split.md`. +- `docs/askuserquestion-split.md`: ~200-line deep reference covering both compliant shapes (batched / split), per-option call mechanics with `D.k` numbering, Hold-means-stop semantics, final-summary dependency validation with conflict reprompt, `D.0` meta-question for N>6, `-split-` question_id format, and the two-layer AUTO_DECIDE defense. +- Runtime AUTO_DECIDE gate in `bin/gstack-question-preference --check`: detects any `question_id` matching `*-split-*` and forces `ASK_NORMALLY` regardless of stored `never-ask` or `ask-only-for-one-way` preferences. Emits an explanatory note so the user knows their preference was bypassed and why. +- `test/skill-e2e-plan-ceo-split-overflow.test.ts`: periodic-tier regression test using a 5-option chat-platform integration fixture. Floor 4 review-phase AUQs (N-1 tolerance). Catches the original drop-to-fit-4 failure mode. + +#### Changed +- Test baseline anchor rebased v1.44.1 → v1.47.0.0 in `test/skill-size-budget.test.ts`. Main's v1.46 (catalog tokens) + v1.47 (/spec) growth pushed the v1.44.1 anchor past the 5% ratchet; rebasing absorbs that growth at HEAD. Historical `parity-baseline-v1.44.1.json` and `parity-baseline-v1.46.0.0.json` retained in `test/fixtures/` for reference. +- 3 self-check items added to the AskUserQuestion preamble checklist (split-not-drop, dependency-check-before-chain, Hold-stops-immediately). +- 41 tier-2+ skills regenerated to inherit the new subsection (~34 lines each via the preamble resolver). +- 3 golden ship fixtures refreshed for the new preamble. + +#### Fixed +- Orphan `12.` numbered-list prefix on the existing CJK rule in `generate-ask-user-format.ts` — refactoring artifact, no items 1-11 above it. Removed. +- `docs/skills.md` missing `/spec` table row (pre-existing miss from PR #1698/#1733 that landed `/spec` to main without updating the doc inventory). Added. + +#### For contributors +- 6 resolver tests pin the inline-subsection contract (4-option cap text, Include/Defer/Cut/Hold buckets, D-numbering shape, AUTO_DECIDE runtime gate reference, docs pointer, orphan-12 regression). +- 7 runtime-gate tests in `test/gstack-question-preference.test.ts` cover the carve-out: no-pref baseline, never-ask override, explanatory note text, ask-only-for-one-way override, always-ask (no note), non-split id containing "split" word (negative regex specificity), multi-skill split id formats. +- `parity-baseline-v1.47.0.0.json` captured via `bun run scripts/capture-baseline.ts --tag v1.47.0.0`. + ## [1.47.0.0] - 2026-05-26 ## **`/spec` ships: turn vague intent into a precise, executable spec in five phases.** Pipe the spec into a spawned Claude Code agent, dedupe against existing issues, archive locally for the team corpus, and let `/ship` close the source issue on merge. diff --git a/VERSION b/VERSION index d074bee30d..01934fdf4c 100644 --- a/VERSION +++ b/VERSION @@ -1 +1 @@ -1.47.0.0 +1.48.0.0 diff --git a/autoplan/SKILL.md b/autoplan/SKILL.md index b088d6443e..0e77d81968 100644 --- a/autoplan/SKILL.md +++ b/autoplan/SKILL.md @@ -342,7 +342,36 @@ Effort both-scales: when an option involves effort, label both human-team and CC Net line closes the tradeoff. Per-skill instructions may add stricter rules. -12. **Non-ASCII characters — write directly, never \u-escape.** When any +### Handling 5+ options — split, never drop + +AskUserQuestion caps every call at **4 options**. With 5+ real options, NEVER +drop, merge, or silently defer one to fit. Pick a compliant shape: + +- **Batch into ≤4-groups** — for coherent alternatives (e.g. version bumps, + layout variants). One call, 5th surfaced only if first 4 don't fit. +- **Split per-option** — for independent scope items (e.g. "ship E1..E6?"). + Fire N sequential calls, one per option. Default to this when unsure. + +Per-option call shape: `D.k` header (e.g. D3.1..D3.5), ELI10 per option, +Recommendation, kind-note (no completeness score — Include/Defer/Cut/Hold are +decision actions), and 4 buckets: +**A) Include**, **B) Defer**, **C) Cut**, **D) Hold** (stop chain, discuss). + +After the chain, fire `D.final` to validate the assembled set (reprompt +dependency conflicts) and confirm shipping it. Use `D.revise-` to +revise one option without re-running the chain. + +For N>6, fire a `D.0` meta-AskUserQuestion first (proceed / narrow / batch). + +question_ids for split chains: `-split-` (kebab-case ASCII, +≤64 chars, `-2`/`-3` suffix on collision). The runtime checker +(`bin/gstack-question-preference`) refuses `never-ask` on any `*-split-*` id, +so split chains are never AUTO_DECIDE-eligible — the user's option set is sacred. + +**Full rule + worked examples + Hold/dependency semantics:** see +`docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. + +**Non-ASCII characters — write directly, never \u-escape.** When any string field (question, option label, option description) contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit the literal UTF-8 characters in the JSON string. **Never escape them @@ -375,6 +404,9 @@ Before calling AskUserQuestion, verify: - [ ] Net line closes the decision - [ ] You are calling the tool, not writing prose - [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped +- [ ] If you had 5+ options, you split (or batched into ≤4-groups) — did NOT drop any +- [ ] If you split, you checked dependencies between options before firing the chain +- [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue) ## Artifacts Sync (skill start) diff --git a/bin/gstack-question-preference b/bin/gstack-question-preference index b660742e35..b8c5665af9 100755 --- a/bin/gstack-question-preference +++ b/bin/gstack-question-preference @@ -68,6 +68,21 @@ do_check() { return; } + // Split-chain carve-out: per-option calls in N-option splits emit + // question_ids of the form -split-. These are + // NEVER AUTO_DECIDE-eligible regardless of stored preferences — the + // whole point of splitting is restoring user sovereignty over the + // option set. See scripts/resolvers/preamble/generate-ask-user-format.ts + // \"Handling 5+ options — split, never drop\" for the surrounding + // mechanism that generates these ids. + if (/-split-/.test(qid)) { + console.log('ASK_NORMALLY'); + if (pref === 'never-ask' || pref === 'ask-only-for-one-way') { + console.log('NOTE: split-chain per-option calls always ASK_NORMALLY; your ' + pref + ' preference does not apply to options inside a sequential split.'); + } + return; + } + switch (pref) { case 'never-ask': console.log('AUTO_DECIDE'); diff --git a/canary/SKILL.md b/canary/SKILL.md index cdc92daf6a..2693319be6 100644 --- a/canary/SKILL.md +++ b/canary/SKILL.md @@ -334,7 +334,36 @@ Effort both-scales: when an option involves effort, label both human-team and CC Net line closes the tradeoff. Per-skill instructions may add stricter rules. -12. **Non-ASCII characters — write directly, never \u-escape.** When any +### Handling 5+ options — split, never drop + +AskUserQuestion caps every call at **4 options**. With 5+ real options, NEVER +drop, merge, or silently defer one to fit. Pick a compliant shape: + +- **Batch into ≤4-groups** — for coherent alternatives (e.g. version bumps, + layout variants). One call, 5th surfaced only if first 4 don't fit. +- **Split per-option** — for independent scope items (e.g. "ship E1..E6?"). + Fire N sequential calls, one per option. Default to this when unsure. + +Per-option call shape: `D.k` header (e.g. D3.1..D3.5), ELI10 per option, +Recommendation, kind-note (no completeness score — Include/Defer/Cut/Hold are +decision actions), and 4 buckets: +**A) Include**, **B) Defer**, **C) Cut**, **D) Hold** (stop chain, discuss). + +After the chain, fire `D.final` to validate the assembled set (reprompt +dependency conflicts) and confirm shipping it. Use `D.revise-` to +revise one option without re-running the chain. + +For N>6, fire a `D.0` meta-AskUserQuestion first (proceed / narrow / batch). + +question_ids for split chains: `-split-` (kebab-case ASCII, +≤64 chars, `-2`/`-3` suffix on collision). The runtime checker +(`bin/gstack-question-preference`) refuses `never-ask` on any `*-split-*` id, +so split chains are never AUTO_DECIDE-eligible — the user's option set is sacred. + +**Full rule + worked examples + Hold/dependency semantics:** see +`docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. + +**Non-ASCII characters — write directly, never \u-escape.** When any string field (question, option label, option description) contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit the literal UTF-8 characters in the JSON string. **Never escape them @@ -367,6 +396,9 @@ Before calling AskUserQuestion, verify: - [ ] Net line closes the decision - [ ] You are calling the tool, not writing prose - [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped +- [ ] If you had 5+ options, you split (or batched into ≤4-groups) — did NOT drop any +- [ ] If you split, you checked dependencies between options before firing the chain +- [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue) ## Artifacts Sync (skill start) diff --git a/codex/SKILL.md b/codex/SKILL.md index 7d6a4d21a2..24331dde34 100644 --- a/codex/SKILL.md +++ b/codex/SKILL.md @@ -337,7 +337,36 @@ Effort both-scales: when an option involves effort, label both human-team and CC Net line closes the tradeoff. Per-skill instructions may add stricter rules. -12. **Non-ASCII characters — write directly, never \u-escape.** When any +### Handling 5+ options — split, never drop + +AskUserQuestion caps every call at **4 options**. With 5+ real options, NEVER +drop, merge, or silently defer one to fit. Pick a compliant shape: + +- **Batch into ≤4-groups** — for coherent alternatives (e.g. version bumps, + layout variants). One call, 5th surfaced only if first 4 don't fit. +- **Split per-option** — for independent scope items (e.g. "ship E1..E6?"). + Fire N sequential calls, one per option. Default to this when unsure. + +Per-option call shape: `D.k` header (e.g. D3.1..D3.5), ELI10 per option, +Recommendation, kind-note (no completeness score — Include/Defer/Cut/Hold are +decision actions), and 4 buckets: +**A) Include**, **B) Defer**, **C) Cut**, **D) Hold** (stop chain, discuss). + +After the chain, fire `D.final` to validate the assembled set (reprompt +dependency conflicts) and confirm shipping it. Use `D.revise-` to +revise one option without re-running the chain. + +For N>6, fire a `D.0` meta-AskUserQuestion first (proceed / narrow / batch). + +question_ids for split chains: `-split-` (kebab-case ASCII, +≤64 chars, `-2`/`-3` suffix on collision). The runtime checker +(`bin/gstack-question-preference`) refuses `never-ask` on any `*-split-*` id, +so split chains are never AUTO_DECIDE-eligible — the user's option set is sacred. + +**Full rule + worked examples + Hold/dependency semantics:** see +`docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. + +**Non-ASCII characters — write directly, never \u-escape.** When any string field (question, option label, option description) contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit the literal UTF-8 characters in the JSON string. **Never escape them @@ -370,6 +399,9 @@ Before calling AskUserQuestion, verify: - [ ] Net line closes the decision - [ ] You are calling the tool, not writing prose - [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped +- [ ] If you had 5+ options, you split (or batched into ≤4-groups) — did NOT drop any +- [ ] If you split, you checked dependencies between options before firing the chain +- [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue) ## Artifacts Sync (skill start) diff --git a/context-restore/SKILL.md b/context-restore/SKILL.md index 72eb2995b9..22e499dd25 100644 --- a/context-restore/SKILL.md +++ b/context-restore/SKILL.md @@ -338,7 +338,36 @@ Effort both-scales: when an option involves effort, label both human-team and CC Net line closes the tradeoff. Per-skill instructions may add stricter rules. -12. **Non-ASCII characters — write directly, never \u-escape.** When any +### Handling 5+ options — split, never drop + +AskUserQuestion caps every call at **4 options**. With 5+ real options, NEVER +drop, merge, or silently defer one to fit. Pick a compliant shape: + +- **Batch into ≤4-groups** — for coherent alternatives (e.g. version bumps, + layout variants). One call, 5th surfaced only if first 4 don't fit. +- **Split per-option** — for independent scope items (e.g. "ship E1..E6?"). + Fire N sequential calls, one per option. Default to this when unsure. + +Per-option call shape: `D.k` header (e.g. D3.1..D3.5), ELI10 per option, +Recommendation, kind-note (no completeness score — Include/Defer/Cut/Hold are +decision actions), and 4 buckets: +**A) Include**, **B) Defer**, **C) Cut**, **D) Hold** (stop chain, discuss). + +After the chain, fire `D.final` to validate the assembled set (reprompt +dependency conflicts) and confirm shipping it. Use `D.revise-` to +revise one option without re-running the chain. + +For N>6, fire a `D.0` meta-AskUserQuestion first (proceed / narrow / batch). + +question_ids for split chains: `-split-` (kebab-case ASCII, +≤64 chars, `-2`/`-3` suffix on collision). The runtime checker +(`bin/gstack-question-preference`) refuses `never-ask` on any `*-split-*` id, +so split chains are never AUTO_DECIDE-eligible — the user's option set is sacred. + +**Full rule + worked examples + Hold/dependency semantics:** see +`docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. + +**Non-ASCII characters — write directly, never \u-escape.** When any string field (question, option label, option description) contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit the literal UTF-8 characters in the JSON string. **Never escape them @@ -371,6 +400,9 @@ Before calling AskUserQuestion, verify: - [ ] Net line closes the decision - [ ] You are calling the tool, not writing prose - [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped +- [ ] If you had 5+ options, you split (or batched into ≤4-groups) — did NOT drop any +- [ ] If you split, you checked dependencies between options before firing the chain +- [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue) ## Artifacts Sync (skill start) diff --git a/context-save/SKILL.md b/context-save/SKILL.md index cc9b79f65c..f41551d78c 100644 --- a/context-save/SKILL.md +++ b/context-save/SKILL.md @@ -337,7 +337,36 @@ Effort both-scales: when an option involves effort, label both human-team and CC Net line closes the tradeoff. Per-skill instructions may add stricter rules. -12. **Non-ASCII characters — write directly, never \u-escape.** When any +### Handling 5+ options — split, never drop + +AskUserQuestion caps every call at **4 options**. With 5+ real options, NEVER +drop, merge, or silently defer one to fit. Pick a compliant shape: + +- **Batch into ≤4-groups** — for coherent alternatives (e.g. version bumps, + layout variants). One call, 5th surfaced only if first 4 don't fit. +- **Split per-option** — for independent scope items (e.g. "ship E1..E6?"). + Fire N sequential calls, one per option. Default to this when unsure. + +Per-option call shape: `D.k` header (e.g. D3.1..D3.5), ELI10 per option, +Recommendation, kind-note (no completeness score — Include/Defer/Cut/Hold are +decision actions), and 4 buckets: +**A) Include**, **B) Defer**, **C) Cut**, **D) Hold** (stop chain, discuss). + +After the chain, fire `D.final` to validate the assembled set (reprompt +dependency conflicts) and confirm shipping it. Use `D.revise-` to +revise one option without re-running the chain. + +For N>6, fire a `D.0` meta-AskUserQuestion first (proceed / narrow / batch). + +question_ids for split chains: `-split-` (kebab-case ASCII, +≤64 chars, `-2`/`-3` suffix on collision). The runtime checker +(`bin/gstack-question-preference`) refuses `never-ask` on any `*-split-*` id, +so split chains are never AUTO_DECIDE-eligible — the user's option set is sacred. + +**Full rule + worked examples + Hold/dependency semantics:** see +`docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. + +**Non-ASCII characters — write directly, never \u-escape.** When any string field (question, option label, option description) contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit the literal UTF-8 characters in the JSON string. **Never escape them @@ -370,6 +399,9 @@ Before calling AskUserQuestion, verify: - [ ] Net line closes the decision - [ ] You are calling the tool, not writing prose - [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped +- [ ] If you had 5+ options, you split (or batched into ≤4-groups) — did NOT drop any +- [ ] If you split, you checked dependencies between options before firing the chain +- [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue) ## Artifacts Sync (skill start) diff --git a/cso/SKILL.md b/cso/SKILL.md index 60f319397d..3e39ce4c57 100644 --- a/cso/SKILL.md +++ b/cso/SKILL.md @@ -340,7 +340,36 @@ Effort both-scales: when an option involves effort, label both human-team and CC Net line closes the tradeoff. Per-skill instructions may add stricter rules. -12. **Non-ASCII characters — write directly, never \u-escape.** When any +### Handling 5+ options — split, never drop + +AskUserQuestion caps every call at **4 options**. With 5+ real options, NEVER +drop, merge, or silently defer one to fit. Pick a compliant shape: + +- **Batch into ≤4-groups** — for coherent alternatives (e.g. version bumps, + layout variants). One call, 5th surfaced only if first 4 don't fit. +- **Split per-option** — for independent scope items (e.g. "ship E1..E6?"). + Fire N sequential calls, one per option. Default to this when unsure. + +Per-option call shape: `D.k` header (e.g. D3.1..D3.5), ELI10 per option, +Recommendation, kind-note (no completeness score — Include/Defer/Cut/Hold are +decision actions), and 4 buckets: +**A) Include**, **B) Defer**, **C) Cut**, **D) Hold** (stop chain, discuss). + +After the chain, fire `D.final` to validate the assembled set (reprompt +dependency conflicts) and confirm shipping it. Use `D.revise-` to +revise one option without re-running the chain. + +For N>6, fire a `D.0` meta-AskUserQuestion first (proceed / narrow / batch). + +question_ids for split chains: `-split-` (kebab-case ASCII, +≤64 chars, `-2`/`-3` suffix on collision). The runtime checker +(`bin/gstack-question-preference`) refuses `never-ask` on any `*-split-*` id, +so split chains are never AUTO_DECIDE-eligible — the user's option set is sacred. + +**Full rule + worked examples + Hold/dependency semantics:** see +`docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. + +**Non-ASCII characters — write directly, never \u-escape.** When any string field (question, option label, option description) contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit the literal UTF-8 characters in the JSON string. **Never escape them @@ -373,6 +402,9 @@ Before calling AskUserQuestion, verify: - [ ] Net line closes the decision - [ ] You are calling the tool, not writing prose - [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped +- [ ] If you had 5+ options, you split (or batched into ≤4-groups) — did NOT drop any +- [ ] If you split, you checked dependencies between options before firing the chain +- [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue) ## Artifacts Sync (skill start) diff --git a/design-consultation/SKILL.md b/design-consultation/SKILL.md index 14bc0dbb70..235026d2f7 100644 --- a/design-consultation/SKILL.md +++ b/design-consultation/SKILL.md @@ -360,7 +360,36 @@ Effort both-scales: when an option involves effort, label both human-team and CC Net line closes the tradeoff. Per-skill instructions may add stricter rules. -12. **Non-ASCII characters — write directly, never \u-escape.** When any +### Handling 5+ options — split, never drop + +AskUserQuestion caps every call at **4 options**. With 5+ real options, NEVER +drop, merge, or silently defer one to fit. Pick a compliant shape: + +- **Batch into ≤4-groups** — for coherent alternatives (e.g. version bumps, + layout variants). One call, 5th surfaced only if first 4 don't fit. +- **Split per-option** — for independent scope items (e.g. "ship E1..E6?"). + Fire N sequential calls, one per option. Default to this when unsure. + +Per-option call shape: `D.k` header (e.g. D3.1..D3.5), ELI10 per option, +Recommendation, kind-note (no completeness score — Include/Defer/Cut/Hold are +decision actions), and 4 buckets: +**A) Include**, **B) Defer**, **C) Cut**, **D) Hold** (stop chain, discuss). + +After the chain, fire `D.final` to validate the assembled set (reprompt +dependency conflicts) and confirm shipping it. Use `D.revise-` to +revise one option without re-running the chain. + +For N>6, fire a `D.0` meta-AskUserQuestion first (proceed / narrow / batch). + +question_ids for split chains: `-split-` (kebab-case ASCII, +≤64 chars, `-2`/`-3` suffix on collision). The runtime checker +(`bin/gstack-question-preference`) refuses `never-ask` on any `*-split-*` id, +so split chains are never AUTO_DECIDE-eligible — the user's option set is sacred. + +**Full rule + worked examples + Hold/dependency semantics:** see +`docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. + +**Non-ASCII characters — write directly, never \u-escape.** When any string field (question, option label, option description) contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit the literal UTF-8 characters in the JSON string. **Never escape them @@ -393,6 +422,9 @@ Before calling AskUserQuestion, verify: - [ ] Net line closes the decision - [ ] You are calling the tool, not writing prose - [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped +- [ ] If you had 5+ options, you split (or batched into ≤4-groups) — did NOT drop any +- [ ] If you split, you checked dependencies between options before firing the chain +- [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue) ## Artifacts Sync (skill start) diff --git a/design-html/SKILL.md b/design-html/SKILL.md index b51bb27c14..70b87ff7e0 100644 --- a/design-html/SKILL.md +++ b/design-html/SKILL.md @@ -341,7 +341,36 @@ Effort both-scales: when an option involves effort, label both human-team and CC Net line closes the tradeoff. Per-skill instructions may add stricter rules. -12. **Non-ASCII characters — write directly, never \u-escape.** When any +### Handling 5+ options — split, never drop + +AskUserQuestion caps every call at **4 options**. With 5+ real options, NEVER +drop, merge, or silently defer one to fit. Pick a compliant shape: + +- **Batch into ≤4-groups** — for coherent alternatives (e.g. version bumps, + layout variants). One call, 5th surfaced only if first 4 don't fit. +- **Split per-option** — for independent scope items (e.g. "ship E1..E6?"). + Fire N sequential calls, one per option. Default to this when unsure. + +Per-option call shape: `D.k` header (e.g. D3.1..D3.5), ELI10 per option, +Recommendation, kind-note (no completeness score — Include/Defer/Cut/Hold are +decision actions), and 4 buckets: +**A) Include**, **B) Defer**, **C) Cut**, **D) Hold** (stop chain, discuss). + +After the chain, fire `D.final` to validate the assembled set (reprompt +dependency conflicts) and confirm shipping it. Use `D.revise-` to +revise one option without re-running the chain. + +For N>6, fire a `D.0` meta-AskUserQuestion first (proceed / narrow / batch). + +question_ids for split chains: `-split-` (kebab-case ASCII, +≤64 chars, `-2`/`-3` suffix on collision). The runtime checker +(`bin/gstack-question-preference`) refuses `never-ask` on any `*-split-*` id, +so split chains are never AUTO_DECIDE-eligible — the user's option set is sacred. + +**Full rule + worked examples + Hold/dependency semantics:** see +`docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. + +**Non-ASCII characters — write directly, never \u-escape.** When any string field (question, option label, option description) contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit the literal UTF-8 characters in the JSON string. **Never escape them @@ -374,6 +403,9 @@ Before calling AskUserQuestion, verify: - [ ] Net line closes the decision - [ ] You are calling the tool, not writing prose - [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped +- [ ] If you had 5+ options, you split (or batched into ≤4-groups) — did NOT drop any +- [ ] If you split, you checked dependencies between options before firing the chain +- [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue) ## Artifacts Sync (skill start) diff --git a/design-review/SKILL.md b/design-review/SKILL.md index 1d8bd80f0a..33c43ceb56 100644 --- a/design-review/SKILL.md +++ b/design-review/SKILL.md @@ -338,7 +338,36 @@ Effort both-scales: when an option involves effort, label both human-team and CC Net line closes the tradeoff. Per-skill instructions may add stricter rules. -12. **Non-ASCII characters — write directly, never \u-escape.** When any +### Handling 5+ options — split, never drop + +AskUserQuestion caps every call at **4 options**. With 5+ real options, NEVER +drop, merge, or silently defer one to fit. Pick a compliant shape: + +- **Batch into ≤4-groups** — for coherent alternatives (e.g. version bumps, + layout variants). One call, 5th surfaced only if first 4 don't fit. +- **Split per-option** — for independent scope items (e.g. "ship E1..E6?"). + Fire N sequential calls, one per option. Default to this when unsure. + +Per-option call shape: `D.k` header (e.g. D3.1..D3.5), ELI10 per option, +Recommendation, kind-note (no completeness score — Include/Defer/Cut/Hold are +decision actions), and 4 buckets: +**A) Include**, **B) Defer**, **C) Cut**, **D) Hold** (stop chain, discuss). + +After the chain, fire `D.final` to validate the assembled set (reprompt +dependency conflicts) and confirm shipping it. Use `D.revise-` to +revise one option without re-running the chain. + +For N>6, fire a `D.0` meta-AskUserQuestion first (proceed / narrow / batch). + +question_ids for split chains: `-split-` (kebab-case ASCII, +≤64 chars, `-2`/`-3` suffix on collision). The runtime checker +(`bin/gstack-question-preference`) refuses `never-ask` on any `*-split-*` id, +so split chains are never AUTO_DECIDE-eligible — the user's option set is sacred. + +**Full rule + worked examples + Hold/dependency semantics:** see +`docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. + +**Non-ASCII characters — write directly, never \u-escape.** When any string field (question, option label, option description) contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit the literal UTF-8 characters in the JSON string. **Never escape them @@ -371,6 +400,9 @@ Before calling AskUserQuestion, verify: - [ ] Net line closes the decision - [ ] You are calling the tool, not writing prose - [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped +- [ ] If you had 5+ options, you split (or batched into ≤4-groups) — did NOT drop any +- [ ] If you split, you checked dependencies between options before firing the chain +- [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue) ## Artifacts Sync (skill start) diff --git a/design-shotgun/SKILL.md b/design-shotgun/SKILL.md index 82d1bb7b2e..71f1a02564 100644 --- a/design-shotgun/SKILL.md +++ b/design-shotgun/SKILL.md @@ -355,7 +355,36 @@ Effort both-scales: when an option involves effort, label both human-team and CC Net line closes the tradeoff. Per-skill instructions may add stricter rules. -12. **Non-ASCII characters — write directly, never \u-escape.** When any +### Handling 5+ options — split, never drop + +AskUserQuestion caps every call at **4 options**. With 5+ real options, NEVER +drop, merge, or silently defer one to fit. Pick a compliant shape: + +- **Batch into ≤4-groups** — for coherent alternatives (e.g. version bumps, + layout variants). One call, 5th surfaced only if first 4 don't fit. +- **Split per-option** — for independent scope items (e.g. "ship E1..E6?"). + Fire N sequential calls, one per option. Default to this when unsure. + +Per-option call shape: `D.k` header (e.g. D3.1..D3.5), ELI10 per option, +Recommendation, kind-note (no completeness score — Include/Defer/Cut/Hold are +decision actions), and 4 buckets: +**A) Include**, **B) Defer**, **C) Cut**, **D) Hold** (stop chain, discuss). + +After the chain, fire `D.final` to validate the assembled set (reprompt +dependency conflicts) and confirm shipping it. Use `D.revise-` to +revise one option without re-running the chain. + +For N>6, fire a `D.0` meta-AskUserQuestion first (proceed / narrow / batch). + +question_ids for split chains: `-split-` (kebab-case ASCII, +≤64 chars, `-2`/`-3` suffix on collision). The runtime checker +(`bin/gstack-question-preference`) refuses `never-ask` on any `*-split-*` id, +so split chains are never AUTO_DECIDE-eligible — the user's option set is sacred. + +**Full rule + worked examples + Hold/dependency semantics:** see +`docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. + +**Non-ASCII characters — write directly, never \u-escape.** When any string field (question, option label, option description) contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit the literal UTF-8 characters in the JSON string. **Never escape them @@ -388,6 +417,9 @@ Before calling AskUserQuestion, verify: - [ ] Net line closes the decision - [ ] You are calling the tool, not writing prose - [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped +- [ ] If you had 5+ options, you split (or batched into ≤4-groups) — did NOT drop any +- [ ] If you split, you checked dependencies between options before firing the chain +- [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue) ## Artifacts Sync (skill start) diff --git a/devex-review/SKILL.md b/devex-review/SKILL.md index 4398af3868..a15ed78796 100644 --- a/devex-review/SKILL.md +++ b/devex-review/SKILL.md @@ -340,7 +340,36 @@ Effort both-scales: when an option involves effort, label both human-team and CC Net line closes the tradeoff. Per-skill instructions may add stricter rules. -12. **Non-ASCII characters — write directly, never \u-escape.** When any +### Handling 5+ options — split, never drop + +AskUserQuestion caps every call at **4 options**. With 5+ real options, NEVER +drop, merge, or silently defer one to fit. Pick a compliant shape: + +- **Batch into ≤4-groups** — for coherent alternatives (e.g. version bumps, + layout variants). One call, 5th surfaced only if first 4 don't fit. +- **Split per-option** — for independent scope items (e.g. "ship E1..E6?"). + Fire N sequential calls, one per option. Default to this when unsure. + +Per-option call shape: `D.k` header (e.g. D3.1..D3.5), ELI10 per option, +Recommendation, kind-note (no completeness score — Include/Defer/Cut/Hold are +decision actions), and 4 buckets: +**A) Include**, **B) Defer**, **C) Cut**, **D) Hold** (stop chain, discuss). + +After the chain, fire `D.final` to validate the assembled set (reprompt +dependency conflicts) and confirm shipping it. Use `D.revise-` to +revise one option without re-running the chain. + +For N>6, fire a `D.0` meta-AskUserQuestion first (proceed / narrow / batch). + +question_ids for split chains: `-split-` (kebab-case ASCII, +≤64 chars, `-2`/`-3` suffix on collision). The runtime checker +(`bin/gstack-question-preference`) refuses `never-ask` on any `*-split-*` id, +so split chains are never AUTO_DECIDE-eligible — the user's option set is sacred. + +**Full rule + worked examples + Hold/dependency semantics:** see +`docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. + +**Non-ASCII characters — write directly, never \u-escape.** When any string field (question, option label, option description) contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit the literal UTF-8 characters in the JSON string. **Never escape them @@ -373,6 +402,9 @@ Before calling AskUserQuestion, verify: - [ ] Net line closes the decision - [ ] You are calling the tool, not writing prose - [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped +- [ ] If you had 5+ options, you split (or batched into ≤4-groups) — did NOT drop any +- [ ] If you split, you checked dependencies between options before firing the chain +- [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue) ## Artifacts Sync (skill start) diff --git a/docs/askuserquestion-split.md b/docs/askuserquestion-split.md new file mode 100644 index 0000000000..ec2f880cef --- /dev/null +++ b/docs/askuserquestion-split.md @@ -0,0 +1,216 @@ +# AskUserQuestion split rule — full reference + +Inline summary lives in the canonical preamble (`scripts/resolvers/preamble/generate-ask-user-format.ts`). +That subsection is intentionally compressed because it injects into every +tier-2+ skill's `SKILL.md`. This file is the deep reference the inline +guidance points to — load it when N>4 options come up and you need +worked examples or the full Hold / dependency / final-summary semantics. + +## The bug this prevents + +Pre-rule failure mode (transcript verbatim from the user complaint that +motivated this): + +> "I'm hitting Conductor's limit of 4 options in the AUQ, so I need to +> cut one. E4 (the detect-mappings codegen) is the biggest lift and +> probably beyond scope for v0.42 anyway — users can hand-author their +> mapping rules for the 9 clusters. I'll drop that and keep E1, E2, E3, +> and E5..." +> +> "Conductor caps at 4 options. Trimming: E4 (detect-mappings codegen) +> is the largest-effort item and a natural v0.43+ follow-up — moving it +> to TODOS.md without asking. Re-firing with 4." + +The agent unilaterally cut a real option without user input. The option +set is the user's decision space; shrinking it silently is the bug. + +## Which shape: batched vs. split + +Two compliant shapes. Pick by reading the options: + +1. **Batched into ≤4-groups** — the options are coherent alternatives, + one will be picked. Examples: "major / minor / patch / micro" for a + version bump, "5 layout variants where the user picks one", "which + framework: rspec / minitest / cucumber / none". Batch the top 4 into + one AskUserQuestion; surface the 5th as a follow-up if none of the + first 4 fit. This is the lower-friction path when applicable. + +2. **Split per-option** — the options are independent scope items, each + carrying its own include/defer/cut decision. Examples: "E1..E6, which + do we ship?", "5 candidate integrations for Q3", "8 TODOs surfaced by + the audit — which do we land?". Fire N sequential AskUserQuestion + calls, one per option. + +**Default to split per-option when unsure.** Batching wrong options +together — shoehorning orthogonal scope items into one question — is +the same failure mode as dropping. + +## Split per-option mechanics + +### Before the chain + +Check for dependencies between options. If E3 requires E1, or E5 +conflicts with E2, surface that in the per-option ELI10: + +> "Cutting this orphans E3 — they're linked." + +Without dependency surfacing, the chain produces incoherent picked sets +(user picks Include for E3 + Cut for E1, ships an unbuildable scope). + +### D-numbering + +- Parent decision: `D` where N is the global question counter. +- Each per-option call: `D.k` for k=1..K children. +- Final summary: `D.final`. +- Single-option revise: `D.revise-`. + +Example chain for 5 options at parent D3: + +``` +D3.1 → D3.2 → D3.3 → D3.4 → D3.5 → D3.final +``` + +### Per-option call shape + +For each option Eₖ, fire an AskUserQuestion with: + +- `D.k` header (e.g. D3.1, D3.2 ... D3.5) +- ELI10 of just this option's scope, cost, and any dependency it carries +- Recommendation: Include / Defer / Cut, with concrete reason +- 4 buckets per option: + - **A) Include** in this scope (recommended/not) + - **B) Defer** to follow-up (TODOs / next version) + - **C) Cut** entirely + - **D) Hold** — stop the chain, discuss before deciding +- Note: options differ in kind, not coverage — no completeness score. + (Include/Defer/Cut/Hold are decision actions, so the existing format + rule applies: omit `Completeness: N/10` and use the kind-note instead.) + +### Hold means stop, not queue + +When the user picks Hold on any per-option call, **stop the chain +immediately**. Do not continue asking later options behind the Hold — +the user wants to discuss the picked option first. After discussion, +the user resumes by saying "continue" or naming the next option to ask +about. + +Wrong behavior: queue E4 and E5 behind a Hold on E3, then fire them +later with stale context. Right behavior: stop, let the user reset the +parent decision, resume from where they left off. + +### Final summary + +After the chain resolves (without Hold), fire `D.final` to confirm +and validate the assembled set. + +**Step 1 — validate dependencies.** If the picked set is incoherent +(e.g. E3 picked Include but its required E1 was Cut), do NOT silently +accept. Re-prompt the conflict as a single AskUserQuestion: + +> "E3 needs E1 but you cut E1. Revise: +> A) keep E1 +> B) cut E3 too +> C) leave as-is and accept the broken state" + +**Step 2 — confirm the assembled set.** If coherent: + +> "Here's the assembled set: E1, E2, E5. Ship this scope? +> A) Ship this scope (recommended) +> B) Revise one option (you pick which) +> C) Cut more" + +**Step 3 — targeted revise.** If the user picks B, ask which option to +revise, then fire ONE per-option AskUserQuestion at `D.revise-` +to update just that option. Do **not** re-run the whole chain. + +## Sizing rules + +- **N ≤ 4**: use the normal single AskUserQuestion form. Don't split. +- **N = 5 or 6**: split (or batch if a clean grouping exists). +- **N > 6**: BEFORE the chain, fire a meta-AskUserQuestion at `D.0`: + + > "About to ask N per-option questions. Options: + > A) Proceed with the full split (recommended only if every option is + > independent) + > B) Narrow scope first — I'll propose a smaller set + > C) Batch into groups of 4 instead" + + This is itself an AskUserQuestion tool call, not prose — it counts as + the first prompt in the chain, not a violation of the "tool not prose" + rule. + +## question_id rules for split chains + +Each per-option AskUserQuestion emits a unique `question_id` of the +form `-split-` where `` is the option's +key kebab-cased (lowercase, hyphens, ASCII only). + +Examples: +- `plan-ceo-review-split-e4-detect-mappings` +- `ship-split-rspec` +- `plan-eng-review-split-add-coverage-test` + +**Collision handling.** If two options would produce the same slug, +suffix with `-2`, `-3`, etc. + +**Length.** Total length must be ≤64 chars (validated by +`bin/gstack-question-preference --write`). Truncate the option slug if +needed, preserving the `-split-` prefix. + +## AUTO_DECIDE behavior with split chains + +Two-layer defense. + +**Layer 1 — mechanism.** Each per-option `question_id` is unique to its +option, so preferences set on one option's id cannot leak across the +chain. A `never-ask` on `ship-split-rspec` does not silently approve +`ship-split-minitest`. + +**Layer 2 — runtime enforcement.** `bin/gstack-question-preference +--check` detects any id matching `*-split-*` (the canonical slug pattern +emitted by split chains) and forces `ASK_NORMALLY` even when a +`never-ask` or `ask-only-for-one-way` preference exists for that exact +id. The check emits an explanatory note when this override fires: + +> "split-chain per-option calls always ASK_NORMALLY; your never-ask +> preference does not apply to options inside a sequential split." + +**Result.** Split-chain per-option calls are NEVER AUTO_DECIDE-eligible. +This is a runtime contract, not just collision-resistance by id +uniqueness. The user's option set is sacred — restoring user +sovereignty over the decision space is the entire point of splitting. + +## Interaction with per-skill rules + +This rule **overrides any per-skill "batch decisions" guidance**. +Per-skill templates that explicitly require one-issue-per-call (e.g. +`plan-eng-review`) are already compatible — they're a stricter special +case of this rule. + +## Worked example: 5 platform integrations + +Fixture used by `test/skill-e2e-plan-ceo-split-overflow.test.ts`. A plan +has 5 independent chat-platform candidates: + +- E1) Slack DM bot (~2 weeks, ~40% of asks) +- E2) Discord guild bot (~3 weeks, ~15%) +- E3) Microsoft Teams (~4 weeks, ~5%) +- E4) Telegram (~1 week, ~8%) +- E5) Mattermost (~2 weeks, ~3%) + +User wants individual decisions per candidate, not a bundled pick. The +agent should: + +1. Recognize this is a 5-option independent-scope decision → split. +2. Check dependencies (none here — each platform is standalone). +3. Fire `D3.1` through `D3.5`, one per platform, with Include / Defer / + Cut / Hold buckets and an effort+demand-grounded recommendation per + option. +4. After the chain, fire `D3.final` summarizing the assembled scope + (e.g. "Ship E1 + E4 — Slack and Telegram pull most demand for least + build cost. Defer the rest. A) Ship / B) Revise / C) Cut more"). + +Pre-fix failure shape (the bug): agent constructs a single +AskUserQuestion with E1..E4 as four options, drops E5 with prose like +"E5 is the smallest revenue segment, moving to TODOs". The user never +got to weigh in on E5. Floor-of-4 in the E2E test catches this. diff --git a/docs/skills.md b/docs/skills.md index 3749fd89c3..1ef0f6ae9c 100644 --- a/docs/skills.md +++ b/docs/skills.md @@ -5,6 +5,7 @@ Detailed guides for every gstack skill — philosophy, workflow, and examples. | Skill | Your specialist | What they do | |-------|----------------|--------------| | [`/office-hours`](#office-hours) | **YC Office Hours** | Start here. Six forcing questions that reframe your product before you write code. Pushes back on your framing, challenges premises, generates implementation alternatives. Design doc feeds into every downstream skill. | +| [`/spec`](#spec) | **Spec Author** | Turn vague intent into a precise, executable spec in five phases. Backlog-ready output that downstream skills can pick up. Optional agent spawn at the end. | | [`/plan-ceo-review`](#plan-ceo-review) | **CEO / Founder** | Rethink the problem. Find the 10-star product hiding inside the request. Four modes: Expansion, Selective Expansion, Hold Scope, Reduction. | | [`/plan-eng-review`](#plan-eng-review) | **Eng Manager** | Lock in architecture, data flow, diagrams, edge cases, and tests. Forces hidden assumptions into the open. | | [`/plan-design-review`](#plan-design-review) | **Senior Designer** | Interactive plan-mode design review. Rates each dimension 0-10, explains what a 10 looks like, fixes the plan. Works in plan mode. | diff --git a/document-generate/SKILL.md b/document-generate/SKILL.md index e7b3c477c2..cb89b4ee5d 100644 --- a/document-generate/SKILL.md +++ b/document-generate/SKILL.md @@ -340,7 +340,36 @@ Effort both-scales: when an option involves effort, label both human-team and CC Net line closes the tradeoff. Per-skill instructions may add stricter rules. -12. **Non-ASCII characters — write directly, never \u-escape.** When any +### Handling 5+ options — split, never drop + +AskUserQuestion caps every call at **4 options**. With 5+ real options, NEVER +drop, merge, or silently defer one to fit. Pick a compliant shape: + +- **Batch into ≤4-groups** — for coherent alternatives (e.g. version bumps, + layout variants). One call, 5th surfaced only if first 4 don't fit. +- **Split per-option** — for independent scope items (e.g. "ship E1..E6?"). + Fire N sequential calls, one per option. Default to this when unsure. + +Per-option call shape: `D.k` header (e.g. D3.1..D3.5), ELI10 per option, +Recommendation, kind-note (no completeness score — Include/Defer/Cut/Hold are +decision actions), and 4 buckets: +**A) Include**, **B) Defer**, **C) Cut**, **D) Hold** (stop chain, discuss). + +After the chain, fire `D.final` to validate the assembled set (reprompt +dependency conflicts) and confirm shipping it. Use `D.revise-` to +revise one option without re-running the chain. + +For N>6, fire a `D.0` meta-AskUserQuestion first (proceed / narrow / batch). + +question_ids for split chains: `-split-` (kebab-case ASCII, +≤64 chars, `-2`/`-3` suffix on collision). The runtime checker +(`bin/gstack-question-preference`) refuses `never-ask` on any `*-split-*` id, +so split chains are never AUTO_DECIDE-eligible — the user's option set is sacred. + +**Full rule + worked examples + Hold/dependency semantics:** see +`docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. + +**Non-ASCII characters — write directly, never \u-escape.** When any string field (question, option label, option description) contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit the literal UTF-8 characters in the JSON string. **Never escape them @@ -373,6 +402,9 @@ Before calling AskUserQuestion, verify: - [ ] Net line closes the decision - [ ] You are calling the tool, not writing prose - [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped +- [ ] If you had 5+ options, you split (or batched into ≤4-groups) — did NOT drop any +- [ ] If you split, you checked dependencies between options before firing the chain +- [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue) ## Artifacts Sync (skill start) diff --git a/document-release/SKILL.md b/document-release/SKILL.md index f080efe678..3fc606e8ac 100644 --- a/document-release/SKILL.md +++ b/document-release/SKILL.md @@ -338,7 +338,36 @@ Effort both-scales: when an option involves effort, label both human-team and CC Net line closes the tradeoff. Per-skill instructions may add stricter rules. -12. **Non-ASCII characters — write directly, never \u-escape.** When any +### Handling 5+ options — split, never drop + +AskUserQuestion caps every call at **4 options**. With 5+ real options, NEVER +drop, merge, or silently defer one to fit. Pick a compliant shape: + +- **Batch into ≤4-groups** — for coherent alternatives (e.g. version bumps, + layout variants). One call, 5th surfaced only if first 4 don't fit. +- **Split per-option** — for independent scope items (e.g. "ship E1..E6?"). + Fire N sequential calls, one per option. Default to this when unsure. + +Per-option call shape: `D.k` header (e.g. D3.1..D3.5), ELI10 per option, +Recommendation, kind-note (no completeness score — Include/Defer/Cut/Hold are +decision actions), and 4 buckets: +**A) Include**, **B) Defer**, **C) Cut**, **D) Hold** (stop chain, discuss). + +After the chain, fire `D.final` to validate the assembled set (reprompt +dependency conflicts) and confirm shipping it. Use `D.revise-` to +revise one option without re-running the chain. + +For N>6, fire a `D.0` meta-AskUserQuestion first (proceed / narrow / batch). + +question_ids for split chains: `-split-` (kebab-case ASCII, +≤64 chars, `-2`/`-3` suffix on collision). The runtime checker +(`bin/gstack-question-preference`) refuses `never-ask` on any `*-split-*` id, +so split chains are never AUTO_DECIDE-eligible — the user's option set is sacred. + +**Full rule + worked examples + Hold/dependency semantics:** see +`docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. + +**Non-ASCII characters — write directly, never \u-escape.** When any string field (question, option label, option description) contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit the literal UTF-8 characters in the JSON string. **Never escape them @@ -371,6 +400,9 @@ Before calling AskUserQuestion, verify: - [ ] Net line closes the decision - [ ] You are calling the tool, not writing prose - [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped +- [ ] If you had 5+ options, you split (or batched into ≤4-groups) — did NOT drop any +- [ ] If you split, you checked dependencies between options before firing the chain +- [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue) ## Artifacts Sync (skill start) diff --git a/health/SKILL.md b/health/SKILL.md index ff043ccbdb..ef63acaf65 100644 --- a/health/SKILL.md +++ b/health/SKILL.md @@ -336,7 +336,36 @@ Effort both-scales: when an option involves effort, label both human-team and CC Net line closes the tradeoff. Per-skill instructions may add stricter rules. -12. **Non-ASCII characters — write directly, never \u-escape.** When any +### Handling 5+ options — split, never drop + +AskUserQuestion caps every call at **4 options**. With 5+ real options, NEVER +drop, merge, or silently defer one to fit. Pick a compliant shape: + +- **Batch into ≤4-groups** — for coherent alternatives (e.g. version bumps, + layout variants). One call, 5th surfaced only if first 4 don't fit. +- **Split per-option** — for independent scope items (e.g. "ship E1..E6?"). + Fire N sequential calls, one per option. Default to this when unsure. + +Per-option call shape: `D.k` header (e.g. D3.1..D3.5), ELI10 per option, +Recommendation, kind-note (no completeness score — Include/Defer/Cut/Hold are +decision actions), and 4 buckets: +**A) Include**, **B) Defer**, **C) Cut**, **D) Hold** (stop chain, discuss). + +After the chain, fire `D.final` to validate the assembled set (reprompt +dependency conflicts) and confirm shipping it. Use `D.revise-` to +revise one option without re-running the chain. + +For N>6, fire a `D.0` meta-AskUserQuestion first (proceed / narrow / batch). + +question_ids for split chains: `-split-` (kebab-case ASCII, +≤64 chars, `-2`/`-3` suffix on collision). The runtime checker +(`bin/gstack-question-preference`) refuses `never-ask` on any `*-split-*` id, +so split chains are never AUTO_DECIDE-eligible — the user's option set is sacred. + +**Full rule + worked examples + Hold/dependency semantics:** see +`docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. + +**Non-ASCII characters — write directly, never \u-escape.** When any string field (question, option label, option description) contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit the literal UTF-8 characters in the JSON string. **Never escape them @@ -369,6 +398,9 @@ Before calling AskUserQuestion, verify: - [ ] Net line closes the decision - [ ] You are calling the tool, not writing prose - [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped +- [ ] If you had 5+ options, you split (or batched into ≤4-groups) — did NOT drop any +- [ ] If you split, you checked dependencies between options before firing the chain +- [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue) ## Artifacts Sync (skill start) diff --git a/investigate/SKILL.md b/investigate/SKILL.md index be0731b5c8..f1d12dd1e6 100644 --- a/investigate/SKILL.md +++ b/investigate/SKILL.md @@ -375,7 +375,36 @@ Effort both-scales: when an option involves effort, label both human-team and CC Net line closes the tradeoff. Per-skill instructions may add stricter rules. -12. **Non-ASCII characters — write directly, never \u-escape.** When any +### Handling 5+ options — split, never drop + +AskUserQuestion caps every call at **4 options**. With 5+ real options, NEVER +drop, merge, or silently defer one to fit. Pick a compliant shape: + +- **Batch into ≤4-groups** — for coherent alternatives (e.g. version bumps, + layout variants). One call, 5th surfaced only if first 4 don't fit. +- **Split per-option** — for independent scope items (e.g. "ship E1..E6?"). + Fire N sequential calls, one per option. Default to this when unsure. + +Per-option call shape: `D.k` header (e.g. D3.1..D3.5), ELI10 per option, +Recommendation, kind-note (no completeness score — Include/Defer/Cut/Hold are +decision actions), and 4 buckets: +**A) Include**, **B) Defer**, **C) Cut**, **D) Hold** (stop chain, discuss). + +After the chain, fire `D.final` to validate the assembled set (reprompt +dependency conflicts) and confirm shipping it. Use `D.revise-` to +revise one option without re-running the chain. + +For N>6, fire a `D.0` meta-AskUserQuestion first (proceed / narrow / batch). + +question_ids for split chains: `-split-` (kebab-case ASCII, +≤64 chars, `-2`/`-3` suffix on collision). The runtime checker +(`bin/gstack-question-preference`) refuses `never-ask` on any `*-split-*` id, +so split chains are never AUTO_DECIDE-eligible — the user's option set is sacred. + +**Full rule + worked examples + Hold/dependency semantics:** see +`docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. + +**Non-ASCII characters — write directly, never \u-escape.** When any string field (question, option label, option description) contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit the literal UTF-8 characters in the JSON string. **Never escape them @@ -408,6 +437,9 @@ Before calling AskUserQuestion, verify: - [ ] Net line closes the decision - [ ] You are calling the tool, not writing prose - [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped +- [ ] If you had 5+ options, you split (or batched into ≤4-groups) — did NOT drop any +- [ ] If you split, you checked dependencies between options before firing the chain +- [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue) ## Artifacts Sync (skill start) diff --git a/ios-clean/SKILL.md b/ios-clean/SKILL.md index 85352e406b..f925bc9486 100644 --- a/ios-clean/SKILL.md +++ b/ios-clean/SKILL.md @@ -338,7 +338,36 @@ Effort both-scales: when an option involves effort, label both human-team and CC Net line closes the tradeoff. Per-skill instructions may add stricter rules. -12. **Non-ASCII characters — write directly, never \u-escape.** When any +### Handling 5+ options — split, never drop + +AskUserQuestion caps every call at **4 options**. With 5+ real options, NEVER +drop, merge, or silently defer one to fit. Pick a compliant shape: + +- **Batch into ≤4-groups** — for coherent alternatives (e.g. version bumps, + layout variants). One call, 5th surfaced only if first 4 don't fit. +- **Split per-option** — for independent scope items (e.g. "ship E1..E6?"). + Fire N sequential calls, one per option. Default to this when unsure. + +Per-option call shape: `D.k` header (e.g. D3.1..D3.5), ELI10 per option, +Recommendation, kind-note (no completeness score — Include/Defer/Cut/Hold are +decision actions), and 4 buckets: +**A) Include**, **B) Defer**, **C) Cut**, **D) Hold** (stop chain, discuss). + +After the chain, fire `D.final` to validate the assembled set (reprompt +dependency conflicts) and confirm shipping it. Use `D.revise-` to +revise one option without re-running the chain. + +For N>6, fire a `D.0` meta-AskUserQuestion first (proceed / narrow / batch). + +question_ids for split chains: `-split-` (kebab-case ASCII, +≤64 chars, `-2`/`-3` suffix on collision). The runtime checker +(`bin/gstack-question-preference`) refuses `never-ask` on any `*-split-*` id, +so split chains are never AUTO_DECIDE-eligible — the user's option set is sacred. + +**Full rule + worked examples + Hold/dependency semantics:** see +`docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. + +**Non-ASCII characters — write directly, never \u-escape.** When any string field (question, option label, option description) contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit the literal UTF-8 characters in the JSON string. **Never escape them @@ -371,6 +400,9 @@ Before calling AskUserQuestion, verify: - [ ] Net line closes the decision - [ ] You are calling the tool, not writing prose - [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped +- [ ] If you had 5+ options, you split (or batched into ≤4-groups) — did NOT drop any +- [ ] If you split, you checked dependencies between options before firing the chain +- [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue) ## Artifacts Sync (skill start) diff --git a/ios-design-review/SKILL.md b/ios-design-review/SKILL.md index f5d8900ab9..76f9629f98 100644 --- a/ios-design-review/SKILL.md +++ b/ios-design-review/SKILL.md @@ -340,7 +340,36 @@ Effort both-scales: when an option involves effort, label both human-team and CC Net line closes the tradeoff. Per-skill instructions may add stricter rules. -12. **Non-ASCII characters — write directly, never \u-escape.** When any +### Handling 5+ options — split, never drop + +AskUserQuestion caps every call at **4 options**. With 5+ real options, NEVER +drop, merge, or silently defer one to fit. Pick a compliant shape: + +- **Batch into ≤4-groups** — for coherent alternatives (e.g. version bumps, + layout variants). One call, 5th surfaced only if first 4 don't fit. +- **Split per-option** — for independent scope items (e.g. "ship E1..E6?"). + Fire N sequential calls, one per option. Default to this when unsure. + +Per-option call shape: `D.k` header (e.g. D3.1..D3.5), ELI10 per option, +Recommendation, kind-note (no completeness score — Include/Defer/Cut/Hold are +decision actions), and 4 buckets: +**A) Include**, **B) Defer**, **C) Cut**, **D) Hold** (stop chain, discuss). + +After the chain, fire `D.final` to validate the assembled set (reprompt +dependency conflicts) and confirm shipping it. Use `D.revise-` to +revise one option without re-running the chain. + +For N>6, fire a `D.0` meta-AskUserQuestion first (proceed / narrow / batch). + +question_ids for split chains: `-split-` (kebab-case ASCII, +≤64 chars, `-2`/`-3` suffix on collision). The runtime checker +(`bin/gstack-question-preference`) refuses `never-ask` on any `*-split-*` id, +so split chains are never AUTO_DECIDE-eligible — the user's option set is sacred. + +**Full rule + worked examples + Hold/dependency semantics:** see +`docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. + +**Non-ASCII characters — write directly, never \u-escape.** When any string field (question, option label, option description) contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit the literal UTF-8 characters in the JSON string. **Never escape them @@ -373,6 +402,9 @@ Before calling AskUserQuestion, verify: - [ ] Net line closes the decision - [ ] You are calling the tool, not writing prose - [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped +- [ ] If you had 5+ options, you split (or batched into ≤4-groups) — did NOT drop any +- [ ] If you split, you checked dependencies between options before firing the chain +- [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue) ## Artifacts Sync (skill start) diff --git a/ios-fix/SKILL.md b/ios-fix/SKILL.md index 82fea9321b..11d7a3f1b1 100644 --- a/ios-fix/SKILL.md +++ b/ios-fix/SKILL.md @@ -341,7 +341,36 @@ Effort both-scales: when an option involves effort, label both human-team and CC Net line closes the tradeoff. Per-skill instructions may add stricter rules. -12. **Non-ASCII characters — write directly, never \u-escape.** When any +### Handling 5+ options — split, never drop + +AskUserQuestion caps every call at **4 options**. With 5+ real options, NEVER +drop, merge, or silently defer one to fit. Pick a compliant shape: + +- **Batch into ≤4-groups** — for coherent alternatives (e.g. version bumps, + layout variants). One call, 5th surfaced only if first 4 don't fit. +- **Split per-option** — for independent scope items (e.g. "ship E1..E6?"). + Fire N sequential calls, one per option. Default to this when unsure. + +Per-option call shape: `D.k` header (e.g. D3.1..D3.5), ELI10 per option, +Recommendation, kind-note (no completeness score — Include/Defer/Cut/Hold are +decision actions), and 4 buckets: +**A) Include**, **B) Defer**, **C) Cut**, **D) Hold** (stop chain, discuss). + +After the chain, fire `D.final` to validate the assembled set (reprompt +dependency conflicts) and confirm shipping it. Use `D.revise-` to +revise one option without re-running the chain. + +For N>6, fire a `D.0` meta-AskUserQuestion first (proceed / narrow / batch). + +question_ids for split chains: `-split-` (kebab-case ASCII, +≤64 chars, `-2`/`-3` suffix on collision). The runtime checker +(`bin/gstack-question-preference`) refuses `never-ask` on any `*-split-*` id, +so split chains are never AUTO_DECIDE-eligible — the user's option set is sacred. + +**Full rule + worked examples + Hold/dependency semantics:** see +`docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. + +**Non-ASCII characters — write directly, never \u-escape.** When any string field (question, option label, option description) contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit the literal UTF-8 characters in the JSON string. **Never escape them @@ -374,6 +403,9 @@ Before calling AskUserQuestion, verify: - [ ] Net line closes the decision - [ ] You are calling the tool, not writing prose - [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped +- [ ] If you had 5+ options, you split (or batched into ≤4-groups) — did NOT drop any +- [ ] If you split, you checked dependencies between options before firing the chain +- [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue) ## Artifacts Sync (skill start) diff --git a/ios-qa/SKILL.md b/ios-qa/SKILL.md index 6e3e1f73da..1080896c57 100644 --- a/ios-qa/SKILL.md +++ b/ios-qa/SKILL.md @@ -344,7 +344,36 @@ Effort both-scales: when an option involves effort, label both human-team and CC Net line closes the tradeoff. Per-skill instructions may add stricter rules. -12. **Non-ASCII characters — write directly, never \u-escape.** When any +### Handling 5+ options — split, never drop + +AskUserQuestion caps every call at **4 options**. With 5+ real options, NEVER +drop, merge, or silently defer one to fit. Pick a compliant shape: + +- **Batch into ≤4-groups** — for coherent alternatives (e.g. version bumps, + layout variants). One call, 5th surfaced only if first 4 don't fit. +- **Split per-option** — for independent scope items (e.g. "ship E1..E6?"). + Fire N sequential calls, one per option. Default to this when unsure. + +Per-option call shape: `D.k` header (e.g. D3.1..D3.5), ELI10 per option, +Recommendation, kind-note (no completeness score — Include/Defer/Cut/Hold are +decision actions), and 4 buckets: +**A) Include**, **B) Defer**, **C) Cut**, **D) Hold** (stop chain, discuss). + +After the chain, fire `D.final` to validate the assembled set (reprompt +dependency conflicts) and confirm shipping it. Use `D.revise-` to +revise one option without re-running the chain. + +For N>6, fire a `D.0` meta-AskUserQuestion first (proceed / narrow / batch). + +question_ids for split chains: `-split-` (kebab-case ASCII, +≤64 chars, `-2`/`-3` suffix on collision). The runtime checker +(`bin/gstack-question-preference`) refuses `never-ask` on any `*-split-*` id, +so split chains are never AUTO_DECIDE-eligible — the user's option set is sacred. + +**Full rule + worked examples + Hold/dependency semantics:** see +`docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. + +**Non-ASCII characters — write directly, never \u-escape.** When any string field (question, option label, option description) contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit the literal UTF-8 characters in the JSON string. **Never escape them @@ -377,6 +406,9 @@ Before calling AskUserQuestion, verify: - [ ] Net line closes the decision - [ ] You are calling the tool, not writing prose - [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped +- [ ] If you had 5+ options, you split (or batched into ≤4-groups) — did NOT drop any +- [ ] If you split, you checked dependencies between options before firing the chain +- [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue) ## Artifacts Sync (skill start) diff --git a/ios-sync/SKILL.md b/ios-sync/SKILL.md index c8319de120..2e0f703afa 100644 --- a/ios-sync/SKILL.md +++ b/ios-sync/SKILL.md @@ -338,7 +338,36 @@ Effort both-scales: when an option involves effort, label both human-team and CC Net line closes the tradeoff. Per-skill instructions may add stricter rules. -12. **Non-ASCII characters — write directly, never \u-escape.** When any +### Handling 5+ options — split, never drop + +AskUserQuestion caps every call at **4 options**. With 5+ real options, NEVER +drop, merge, or silently defer one to fit. Pick a compliant shape: + +- **Batch into ≤4-groups** — for coherent alternatives (e.g. version bumps, + layout variants). One call, 5th surfaced only if first 4 don't fit. +- **Split per-option** — for independent scope items (e.g. "ship E1..E6?"). + Fire N sequential calls, one per option. Default to this when unsure. + +Per-option call shape: `D.k` header (e.g. D3.1..D3.5), ELI10 per option, +Recommendation, kind-note (no completeness score — Include/Defer/Cut/Hold are +decision actions), and 4 buckets: +**A) Include**, **B) Defer**, **C) Cut**, **D) Hold** (stop chain, discuss). + +After the chain, fire `D.final` to validate the assembled set (reprompt +dependency conflicts) and confirm shipping it. Use `D.revise-` to +revise one option without re-running the chain. + +For N>6, fire a `D.0` meta-AskUserQuestion first (proceed / narrow / batch). + +question_ids for split chains: `-split-` (kebab-case ASCII, +≤64 chars, `-2`/`-3` suffix on collision). The runtime checker +(`bin/gstack-question-preference`) refuses `never-ask` on any `*-split-*` id, +so split chains are never AUTO_DECIDE-eligible — the user's option set is sacred. + +**Full rule + worked examples + Hold/dependency semantics:** see +`docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. + +**Non-ASCII characters — write directly, never \u-escape.** When any string field (question, option label, option description) contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit the literal UTF-8 characters in the JSON string. **Never escape them @@ -371,6 +400,9 @@ Before calling AskUserQuestion, verify: - [ ] Net line closes the decision - [ ] You are calling the tool, not writing prose - [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped +- [ ] If you had 5+ options, you split (or batched into ≤4-groups) — did NOT drop any +- [ ] If you split, you checked dependencies between options before firing the chain +- [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue) ## Artifacts Sync (skill start) diff --git a/land-and-deploy/SKILL.md b/land-and-deploy/SKILL.md index 47a86674a4..8bfec441c5 100644 --- a/land-and-deploy/SKILL.md +++ b/land-and-deploy/SKILL.md @@ -333,7 +333,36 @@ Effort both-scales: when an option involves effort, label both human-team and CC Net line closes the tradeoff. Per-skill instructions may add stricter rules. -12. **Non-ASCII characters — write directly, never \u-escape.** When any +### Handling 5+ options — split, never drop + +AskUserQuestion caps every call at **4 options**. With 5+ real options, NEVER +drop, merge, or silently defer one to fit. Pick a compliant shape: + +- **Batch into ≤4-groups** — for coherent alternatives (e.g. version bumps, + layout variants). One call, 5th surfaced only if first 4 don't fit. +- **Split per-option** — for independent scope items (e.g. "ship E1..E6?"). + Fire N sequential calls, one per option. Default to this when unsure. + +Per-option call shape: `D.k` header (e.g. D3.1..D3.5), ELI10 per option, +Recommendation, kind-note (no completeness score — Include/Defer/Cut/Hold are +decision actions), and 4 buckets: +**A) Include**, **B) Defer**, **C) Cut**, **D) Hold** (stop chain, discuss). + +After the chain, fire `D.final` to validate the assembled set (reprompt +dependency conflicts) and confirm shipping it. Use `D.revise-` to +revise one option without re-running the chain. + +For N>6, fire a `D.0` meta-AskUserQuestion first (proceed / narrow / batch). + +question_ids for split chains: `-split-` (kebab-case ASCII, +≤64 chars, `-2`/`-3` suffix on collision). The runtime checker +(`bin/gstack-question-preference`) refuses `never-ask` on any `*-split-*` id, +so split chains are never AUTO_DECIDE-eligible — the user's option set is sacred. + +**Full rule + worked examples + Hold/dependency semantics:** see +`docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. + +**Non-ASCII characters — write directly, never \u-escape.** When any string field (question, option label, option description) contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit the literal UTF-8 characters in the JSON string. **Never escape them @@ -366,6 +395,9 @@ Before calling AskUserQuestion, verify: - [ ] Net line closes the decision - [ ] You are calling the tool, not writing prose - [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped +- [ ] If you had 5+ options, you split (or batched into ≤4-groups) — did NOT drop any +- [ ] If you split, you checked dependencies between options before firing the chain +- [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue) ## Artifacts Sync (skill start) diff --git a/landing-report/SKILL.md b/landing-report/SKILL.md index d09cc6716b..442c28d7f9 100644 --- a/landing-report/SKILL.md +++ b/landing-report/SKILL.md @@ -334,7 +334,36 @@ Effort both-scales: when an option involves effort, label both human-team and CC Net line closes the tradeoff. Per-skill instructions may add stricter rules. -12. **Non-ASCII characters — write directly, never \u-escape.** When any +### Handling 5+ options — split, never drop + +AskUserQuestion caps every call at **4 options**. With 5+ real options, NEVER +drop, merge, or silently defer one to fit. Pick a compliant shape: + +- **Batch into ≤4-groups** — for coherent alternatives (e.g. version bumps, + layout variants). One call, 5th surfaced only if first 4 don't fit. +- **Split per-option** — for independent scope items (e.g. "ship E1..E6?"). + Fire N sequential calls, one per option. Default to this when unsure. + +Per-option call shape: `D.k` header (e.g. D3.1..D3.5), ELI10 per option, +Recommendation, kind-note (no completeness score — Include/Defer/Cut/Hold are +decision actions), and 4 buckets: +**A) Include**, **B) Defer**, **C) Cut**, **D) Hold** (stop chain, discuss). + +After the chain, fire `D.final` to validate the assembled set (reprompt +dependency conflicts) and confirm shipping it. Use `D.revise-` to +revise one option without re-running the chain. + +For N>6, fire a `D.0` meta-AskUserQuestion first (proceed / narrow / batch). + +question_ids for split chains: `-split-` (kebab-case ASCII, +≤64 chars, `-2`/`-3` suffix on collision). The runtime checker +(`bin/gstack-question-preference`) refuses `never-ask` on any `*-split-*` id, +so split chains are never AUTO_DECIDE-eligible — the user's option set is sacred. + +**Full rule + worked examples + Hold/dependency semantics:** see +`docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. + +**Non-ASCII characters — write directly, never \u-escape.** When any string field (question, option label, option description) contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit the literal UTF-8 characters in the JSON string. **Never escape them @@ -367,6 +396,9 @@ Before calling AskUserQuestion, verify: - [ ] Net line closes the decision - [ ] You are calling the tool, not writing prose - [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped +- [ ] If you had 5+ options, you split (or batched into ≤4-groups) — did NOT drop any +- [ ] If you split, you checked dependencies between options before firing the chain +- [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue) ## Artifacts Sync (skill start) diff --git a/learn/SKILL.md b/learn/SKILL.md index 16473ccfe5..3eb54e696d 100644 --- a/learn/SKILL.md +++ b/learn/SKILL.md @@ -336,7 +336,36 @@ Effort both-scales: when an option involves effort, label both human-team and CC Net line closes the tradeoff. Per-skill instructions may add stricter rules. -12. **Non-ASCII characters — write directly, never \u-escape.** When any +### Handling 5+ options — split, never drop + +AskUserQuestion caps every call at **4 options**. With 5+ real options, NEVER +drop, merge, or silently defer one to fit. Pick a compliant shape: + +- **Batch into ≤4-groups** — for coherent alternatives (e.g. version bumps, + layout variants). One call, 5th surfaced only if first 4 don't fit. +- **Split per-option** — for independent scope items (e.g. "ship E1..E6?"). + Fire N sequential calls, one per option. Default to this when unsure. + +Per-option call shape: `D.k` header (e.g. D3.1..D3.5), ELI10 per option, +Recommendation, kind-note (no completeness score — Include/Defer/Cut/Hold are +decision actions), and 4 buckets: +**A) Include**, **B) Defer**, **C) Cut**, **D) Hold** (stop chain, discuss). + +After the chain, fire `D.final` to validate the assembled set (reprompt +dependency conflicts) and confirm shipping it. Use `D.revise-` to +revise one option without re-running the chain. + +For N>6, fire a `D.0` meta-AskUserQuestion first (proceed / narrow / batch). + +question_ids for split chains: `-split-` (kebab-case ASCII, +≤64 chars, `-2`/`-3` suffix on collision). The runtime checker +(`bin/gstack-question-preference`) refuses `never-ask` on any `*-split-*` id, +so split chains are never AUTO_DECIDE-eligible — the user's option set is sacred. + +**Full rule + worked examples + Hold/dependency semantics:** see +`docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. + +**Non-ASCII characters — write directly, never \u-escape.** When any string field (question, option label, option description) contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit the literal UTF-8 characters in the JSON string. **Never escape them @@ -369,6 +398,9 @@ Before calling AskUserQuestion, verify: - [ ] Net line closes the decision - [ ] You are calling the tool, not writing prose - [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped +- [ ] If you had 5+ options, you split (or batched into ≤4-groups) — did NOT drop any +- [ ] If you split, you checked dependencies between options before firing the chain +- [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue) ## Artifacts Sync (skill start) diff --git a/office-hours/SKILL.md b/office-hours/SKILL.md index 417f6099a3..bfa14d6bd3 100644 --- a/office-hours/SKILL.md +++ b/office-hours/SKILL.md @@ -371,7 +371,36 @@ Effort both-scales: when an option involves effort, label both human-team and CC Net line closes the tradeoff. Per-skill instructions may add stricter rules. -12. **Non-ASCII characters — write directly, never \u-escape.** When any +### Handling 5+ options — split, never drop + +AskUserQuestion caps every call at **4 options**. With 5+ real options, NEVER +drop, merge, or silently defer one to fit. Pick a compliant shape: + +- **Batch into ≤4-groups** — for coherent alternatives (e.g. version bumps, + layout variants). One call, 5th surfaced only if first 4 don't fit. +- **Split per-option** — for independent scope items (e.g. "ship E1..E6?"). + Fire N sequential calls, one per option. Default to this when unsure. + +Per-option call shape: `D.k` header (e.g. D3.1..D3.5), ELI10 per option, +Recommendation, kind-note (no completeness score — Include/Defer/Cut/Hold are +decision actions), and 4 buckets: +**A) Include**, **B) Defer**, **C) Cut**, **D) Hold** (stop chain, discuss). + +After the chain, fire `D.final` to validate the assembled set (reprompt +dependency conflicts) and confirm shipping it. Use `D.revise-` to +revise one option without re-running the chain. + +For N>6, fire a `D.0` meta-AskUserQuestion first (proceed / narrow / batch). + +question_ids for split chains: `-split-` (kebab-case ASCII, +≤64 chars, `-2`/`-3` suffix on collision). The runtime checker +(`bin/gstack-question-preference`) refuses `never-ask` on any `*-split-*` id, +so split chains are never AUTO_DECIDE-eligible — the user's option set is sacred. + +**Full rule + worked examples + Hold/dependency semantics:** see +`docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. + +**Non-ASCII characters — write directly, never \u-escape.** When any string field (question, option label, option description) contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit the literal UTF-8 characters in the JSON string. **Never escape them @@ -404,6 +433,9 @@ Before calling AskUserQuestion, verify: - [ ] Net line closes the decision - [ ] You are calling the tool, not writing prose - [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped +- [ ] If you had 5+ options, you split (or batched into ≤4-groups) — did NOT drop any +- [ ] If you split, you checked dependencies between options before firing the chain +- [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue) ## Artifacts Sync (skill start) diff --git a/open-gstack-browser/SKILL.md b/open-gstack-browser/SKILL.md index d356ac19c0..ef01414de8 100644 --- a/open-gstack-browser/SKILL.md +++ b/open-gstack-browser/SKILL.md @@ -333,7 +333,36 @@ Effort both-scales: when an option involves effort, label both human-team and CC Net line closes the tradeoff. Per-skill instructions may add stricter rules. -12. **Non-ASCII characters — write directly, never \u-escape.** When any +### Handling 5+ options — split, never drop + +AskUserQuestion caps every call at **4 options**. With 5+ real options, NEVER +drop, merge, or silently defer one to fit. Pick a compliant shape: + +- **Batch into ≤4-groups** — for coherent alternatives (e.g. version bumps, + layout variants). One call, 5th surfaced only if first 4 don't fit. +- **Split per-option** — for independent scope items (e.g. "ship E1..E6?"). + Fire N sequential calls, one per option. Default to this when unsure. + +Per-option call shape: `D.k` header (e.g. D3.1..D3.5), ELI10 per option, +Recommendation, kind-note (no completeness score — Include/Defer/Cut/Hold are +decision actions), and 4 buckets: +**A) Include**, **B) Defer**, **C) Cut**, **D) Hold** (stop chain, discuss). + +After the chain, fire `D.final` to validate the assembled set (reprompt +dependency conflicts) and confirm shipping it. Use `D.revise-` to +revise one option without re-running the chain. + +For N>6, fire a `D.0` meta-AskUserQuestion first (proceed / narrow / batch). + +question_ids for split chains: `-split-` (kebab-case ASCII, +≤64 chars, `-2`/`-3` suffix on collision). The runtime checker +(`bin/gstack-question-preference`) refuses `never-ask` on any `*-split-*` id, +so split chains are never AUTO_DECIDE-eligible — the user's option set is sacred. + +**Full rule + worked examples + Hold/dependency semantics:** see +`docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. + +**Non-ASCII characters — write directly, never \u-escape.** When any string field (question, option label, option description) contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit the literal UTF-8 characters in the JSON string. **Never escape them @@ -366,6 +395,9 @@ Before calling AskUserQuestion, verify: - [ ] Net line closes the decision - [ ] You are calling the tool, not writing prose - [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped +- [ ] If you had 5+ options, you split (or batched into ≤4-groups) — did NOT drop any +- [ ] If you split, you checked dependencies between options before firing the chain +- [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue) ## Artifacts Sync (skill start) diff --git a/package.json b/package.json index 0ce03ad066..eb77faa516 100644 --- a/package.json +++ b/package.json @@ -1,6 +1,6 @@ { "name": "gstack", - "version": "1.47.0.0", + "version": "1.48.0.0", "description": "Garry's Stack — Claude Code skills + fast headless browser. One repo, one install, entire AI engineering workflow.", "license": "MIT", "type": "module", diff --git a/pair-agent/SKILL.md b/pair-agent/SKILL.md index 0a88d64d27..baa1553b76 100644 --- a/pair-agent/SKILL.md +++ b/pair-agent/SKILL.md @@ -335,7 +335,36 @@ Effort both-scales: when an option involves effort, label both human-team and CC Net line closes the tradeoff. Per-skill instructions may add stricter rules. -12. **Non-ASCII characters — write directly, never \u-escape.** When any +### Handling 5+ options — split, never drop + +AskUserQuestion caps every call at **4 options**. With 5+ real options, NEVER +drop, merge, or silently defer one to fit. Pick a compliant shape: + +- **Batch into ≤4-groups** — for coherent alternatives (e.g. version bumps, + layout variants). One call, 5th surfaced only if first 4 don't fit. +- **Split per-option** — for independent scope items (e.g. "ship E1..E6?"). + Fire N sequential calls, one per option. Default to this when unsure. + +Per-option call shape: `D.k` header (e.g. D3.1..D3.5), ELI10 per option, +Recommendation, kind-note (no completeness score — Include/Defer/Cut/Hold are +decision actions), and 4 buckets: +**A) Include**, **B) Defer**, **C) Cut**, **D) Hold** (stop chain, discuss). + +After the chain, fire `D.final` to validate the assembled set (reprompt +dependency conflicts) and confirm shipping it. Use `D.revise-` to +revise one option without re-running the chain. + +For N>6, fire a `D.0` meta-AskUserQuestion first (proceed / narrow / batch). + +question_ids for split chains: `-split-` (kebab-case ASCII, +≤64 chars, `-2`/`-3` suffix on collision). The runtime checker +(`bin/gstack-question-preference`) refuses `never-ask` on any `*-split-*` id, +so split chains are never AUTO_DECIDE-eligible — the user's option set is sacred. + +**Full rule + worked examples + Hold/dependency semantics:** see +`docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. + +**Non-ASCII characters — write directly, never \u-escape.** When any string field (question, option label, option description) contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit the literal UTF-8 characters in the JSON string. **Never escape them @@ -368,6 +397,9 @@ Before calling AskUserQuestion, verify: - [ ] Net line closes the decision - [ ] You are calling the tool, not writing prose - [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped +- [ ] If you had 5+ options, you split (or batched into ≤4-groups) — did NOT drop any +- [ ] If you split, you checked dependencies between options before firing the chain +- [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue) ## Artifacts Sync (skill start) diff --git a/plan-ceo-review/SKILL.md b/plan-ceo-review/SKILL.md index 0d525995a9..526bb0e2e3 100644 --- a/plan-ceo-review/SKILL.md +++ b/plan-ceo-review/SKILL.md @@ -365,7 +365,36 @@ Effort both-scales: when an option involves effort, label both human-team and CC Net line closes the tradeoff. Per-skill instructions may add stricter rules. -12. **Non-ASCII characters — write directly, never \u-escape.** When any +### Handling 5+ options — split, never drop + +AskUserQuestion caps every call at **4 options**. With 5+ real options, NEVER +drop, merge, or silently defer one to fit. Pick a compliant shape: + +- **Batch into ≤4-groups** — for coherent alternatives (e.g. version bumps, + layout variants). One call, 5th surfaced only if first 4 don't fit. +- **Split per-option** — for independent scope items (e.g. "ship E1..E6?"). + Fire N sequential calls, one per option. Default to this when unsure. + +Per-option call shape: `D.k` header (e.g. D3.1..D3.5), ELI10 per option, +Recommendation, kind-note (no completeness score — Include/Defer/Cut/Hold are +decision actions), and 4 buckets: +**A) Include**, **B) Defer**, **C) Cut**, **D) Hold** (stop chain, discuss). + +After the chain, fire `D.final` to validate the assembled set (reprompt +dependency conflicts) and confirm shipping it. Use `D.revise-` to +revise one option without re-running the chain. + +For N>6, fire a `D.0` meta-AskUserQuestion first (proceed / narrow / batch). + +question_ids for split chains: `-split-` (kebab-case ASCII, +≤64 chars, `-2`/`-3` suffix on collision). The runtime checker +(`bin/gstack-question-preference`) refuses `never-ask` on any `*-split-*` id, +so split chains are never AUTO_DECIDE-eligible — the user's option set is sacred. + +**Full rule + worked examples + Hold/dependency semantics:** see +`docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. + +**Non-ASCII characters — write directly, never \u-escape.** When any string field (question, option label, option description) contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit the literal UTF-8 characters in the JSON string. **Never escape them @@ -398,6 +427,9 @@ Before calling AskUserQuestion, verify: - [ ] Net line closes the decision - [ ] You are calling the tool, not writing prose - [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped +- [ ] If you had 5+ options, you split (or batched into ≤4-groups) — did NOT drop any +- [ ] If you split, you checked dependencies between options before firing the chain +- [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue) ## Artifacts Sync (skill start) diff --git a/plan-design-review/SKILL.md b/plan-design-review/SKILL.md index b75c08d26e..ce70998cde 100644 --- a/plan-design-review/SKILL.md +++ b/plan-design-review/SKILL.md @@ -337,7 +337,36 @@ Effort both-scales: when an option involves effort, label both human-team and CC Net line closes the tradeoff. Per-skill instructions may add stricter rules. -12. **Non-ASCII characters — write directly, never \u-escape.** When any +### Handling 5+ options — split, never drop + +AskUserQuestion caps every call at **4 options**. With 5+ real options, NEVER +drop, merge, or silently defer one to fit. Pick a compliant shape: + +- **Batch into ≤4-groups** — for coherent alternatives (e.g. version bumps, + layout variants). One call, 5th surfaced only if first 4 don't fit. +- **Split per-option** — for independent scope items (e.g. "ship E1..E6?"). + Fire N sequential calls, one per option. Default to this when unsure. + +Per-option call shape: `D.k` header (e.g. D3.1..D3.5), ELI10 per option, +Recommendation, kind-note (no completeness score — Include/Defer/Cut/Hold are +decision actions), and 4 buckets: +**A) Include**, **B) Defer**, **C) Cut**, **D) Hold** (stop chain, discuss). + +After the chain, fire `D.final` to validate the assembled set (reprompt +dependency conflicts) and confirm shipping it. Use `D.revise-` to +revise one option without re-running the chain. + +For N>6, fire a `D.0` meta-AskUserQuestion first (proceed / narrow / batch). + +question_ids for split chains: `-split-` (kebab-case ASCII, +≤64 chars, `-2`/`-3` suffix on collision). The runtime checker +(`bin/gstack-question-preference`) refuses `never-ask` on any `*-split-*` id, +so split chains are never AUTO_DECIDE-eligible — the user's option set is sacred. + +**Full rule + worked examples + Hold/dependency semantics:** see +`docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. + +**Non-ASCII characters — write directly, never \u-escape.** When any string field (question, option label, option description) contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit the literal UTF-8 characters in the JSON string. **Never escape them @@ -370,6 +399,9 @@ Before calling AskUserQuestion, verify: - [ ] Net line closes the decision - [ ] You are calling the tool, not writing prose - [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped +- [ ] If you had 5+ options, you split (or batched into ≤4-groups) — did NOT drop any +- [ ] If you split, you checked dependencies between options before firing the chain +- [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue) ## Artifacts Sync (skill start) diff --git a/plan-devex-review/SKILL.md b/plan-devex-review/SKILL.md index 72540cac89..2bb031cbf2 100644 --- a/plan-devex-review/SKILL.md +++ b/plan-devex-review/SKILL.md @@ -343,7 +343,36 @@ Effort both-scales: when an option involves effort, label both human-team and CC Net line closes the tradeoff. Per-skill instructions may add stricter rules. -12. **Non-ASCII characters — write directly, never \u-escape.** When any +### Handling 5+ options — split, never drop + +AskUserQuestion caps every call at **4 options**. With 5+ real options, NEVER +drop, merge, or silently defer one to fit. Pick a compliant shape: + +- **Batch into ≤4-groups** — for coherent alternatives (e.g. version bumps, + layout variants). One call, 5th surfaced only if first 4 don't fit. +- **Split per-option** — for independent scope items (e.g. "ship E1..E6?"). + Fire N sequential calls, one per option. Default to this when unsure. + +Per-option call shape: `D.k` header (e.g. D3.1..D3.5), ELI10 per option, +Recommendation, kind-note (no completeness score — Include/Defer/Cut/Hold are +decision actions), and 4 buckets: +**A) Include**, **B) Defer**, **C) Cut**, **D) Hold** (stop chain, discuss). + +After the chain, fire `D.final` to validate the assembled set (reprompt +dependency conflicts) and confirm shipping it. Use `D.revise-` to +revise one option without re-running the chain. + +For N>6, fire a `D.0` meta-AskUserQuestion first (proceed / narrow / batch). + +question_ids for split chains: `-split-` (kebab-case ASCII, +≤64 chars, `-2`/`-3` suffix on collision). The runtime checker +(`bin/gstack-question-preference`) refuses `never-ask` on any `*-split-*` id, +so split chains are never AUTO_DECIDE-eligible — the user's option set is sacred. + +**Full rule + worked examples + Hold/dependency semantics:** see +`docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. + +**Non-ASCII characters — write directly, never \u-escape.** When any string field (question, option label, option description) contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit the literal UTF-8 characters in the JSON string. **Never escape them @@ -376,6 +405,9 @@ Before calling AskUserQuestion, verify: - [ ] Net line closes the decision - [ ] You are calling the tool, not writing prose - [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped +- [ ] If you had 5+ options, you split (or batched into ≤4-groups) — did NOT drop any +- [ ] If you split, you checked dependencies between options before firing the chain +- [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue) ## Artifacts Sync (skill start) diff --git a/plan-eng-review/SKILL.md b/plan-eng-review/SKILL.md index 518fdcff2e..b6cd234410 100644 --- a/plan-eng-review/SKILL.md +++ b/plan-eng-review/SKILL.md @@ -341,7 +341,36 @@ Effort both-scales: when an option involves effort, label both human-team and CC Net line closes the tradeoff. Per-skill instructions may add stricter rules. -12. **Non-ASCII characters — write directly, never \u-escape.** When any +### Handling 5+ options — split, never drop + +AskUserQuestion caps every call at **4 options**. With 5+ real options, NEVER +drop, merge, or silently defer one to fit. Pick a compliant shape: + +- **Batch into ≤4-groups** — for coherent alternatives (e.g. version bumps, + layout variants). One call, 5th surfaced only if first 4 don't fit. +- **Split per-option** — for independent scope items (e.g. "ship E1..E6?"). + Fire N sequential calls, one per option. Default to this when unsure. + +Per-option call shape: `D.k` header (e.g. D3.1..D3.5), ELI10 per option, +Recommendation, kind-note (no completeness score — Include/Defer/Cut/Hold are +decision actions), and 4 buckets: +**A) Include**, **B) Defer**, **C) Cut**, **D) Hold** (stop chain, discuss). + +After the chain, fire `D.final` to validate the assembled set (reprompt +dependency conflicts) and confirm shipping it. Use `D.revise-` to +revise one option without re-running the chain. + +For N>6, fire a `D.0` meta-AskUserQuestion first (proceed / narrow / batch). + +question_ids for split chains: `-split-` (kebab-case ASCII, +≤64 chars, `-2`/`-3` suffix on collision). The runtime checker +(`bin/gstack-question-preference`) refuses `never-ask` on any `*-split-*` id, +so split chains are never AUTO_DECIDE-eligible — the user's option set is sacred. + +**Full rule + worked examples + Hold/dependency semantics:** see +`docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. + +**Non-ASCII characters — write directly, never \u-escape.** When any string field (question, option label, option description) contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit the literal UTF-8 characters in the JSON string. **Never escape them @@ -374,6 +403,9 @@ Before calling AskUserQuestion, verify: - [ ] Net line closes the decision - [ ] You are calling the tool, not writing prose - [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped +- [ ] If you had 5+ options, you split (or batched into ≤4-groups) — did NOT drop any +- [ ] If you split, you checked dependencies between options before firing the chain +- [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue) ## Artifacts Sync (skill start) diff --git a/plan-tune/SKILL.md b/plan-tune/SKILL.md index f4346c0078..6f5875d0d8 100644 --- a/plan-tune/SKILL.md +++ b/plan-tune/SKILL.md @@ -346,7 +346,36 @@ Effort both-scales: when an option involves effort, label both human-team and CC Net line closes the tradeoff. Per-skill instructions may add stricter rules. -12. **Non-ASCII characters — write directly, never \u-escape.** When any +### Handling 5+ options — split, never drop + +AskUserQuestion caps every call at **4 options**. With 5+ real options, NEVER +drop, merge, or silently defer one to fit. Pick a compliant shape: + +- **Batch into ≤4-groups** — for coherent alternatives (e.g. version bumps, + layout variants). One call, 5th surfaced only if first 4 don't fit. +- **Split per-option** — for independent scope items (e.g. "ship E1..E6?"). + Fire N sequential calls, one per option. Default to this when unsure. + +Per-option call shape: `D.k` header (e.g. D3.1..D3.5), ELI10 per option, +Recommendation, kind-note (no completeness score — Include/Defer/Cut/Hold are +decision actions), and 4 buckets: +**A) Include**, **B) Defer**, **C) Cut**, **D) Hold** (stop chain, discuss). + +After the chain, fire `D.final` to validate the assembled set (reprompt +dependency conflicts) and confirm shipping it. Use `D.revise-` to +revise one option without re-running the chain. + +For N>6, fire a `D.0` meta-AskUserQuestion first (proceed / narrow / batch). + +question_ids for split chains: `-split-` (kebab-case ASCII, +≤64 chars, `-2`/`-3` suffix on collision). The runtime checker +(`bin/gstack-question-preference`) refuses `never-ask` on any `*-split-*` id, +so split chains are never AUTO_DECIDE-eligible — the user's option set is sacred. + +**Full rule + worked examples + Hold/dependency semantics:** see +`docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. + +**Non-ASCII characters — write directly, never \u-escape.** When any string field (question, option label, option description) contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit the literal UTF-8 characters in the JSON string. **Never escape them @@ -379,6 +408,9 @@ Before calling AskUserQuestion, verify: - [ ] Net line closes the decision - [ ] You are calling the tool, not writing prose - [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped +- [ ] If you had 5+ options, you split (or batched into ≤4-groups) — did NOT drop any +- [ ] If you split, you checked dependencies between options before firing the chain +- [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue) ## Artifacts Sync (skill start) diff --git a/qa-only/SKILL.md b/qa-only/SKILL.md index 660015ea55..7a58b76ed9 100644 --- a/qa-only/SKILL.md +++ b/qa-only/SKILL.md @@ -336,7 +336,36 @@ Effort both-scales: when an option involves effort, label both human-team and CC Net line closes the tradeoff. Per-skill instructions may add stricter rules. -12. **Non-ASCII characters — write directly, never \u-escape.** When any +### Handling 5+ options — split, never drop + +AskUserQuestion caps every call at **4 options**. With 5+ real options, NEVER +drop, merge, or silently defer one to fit. Pick a compliant shape: + +- **Batch into ≤4-groups** — for coherent alternatives (e.g. version bumps, + layout variants). One call, 5th surfaced only if first 4 don't fit. +- **Split per-option** — for independent scope items (e.g. "ship E1..E6?"). + Fire N sequential calls, one per option. Default to this when unsure. + +Per-option call shape: `D.k` header (e.g. D3.1..D3.5), ELI10 per option, +Recommendation, kind-note (no completeness score — Include/Defer/Cut/Hold are +decision actions), and 4 buckets: +**A) Include**, **B) Defer**, **C) Cut**, **D) Hold** (stop chain, discuss). + +After the chain, fire `D.final` to validate the assembled set (reprompt +dependency conflicts) and confirm shipping it. Use `D.revise-` to +revise one option without re-running the chain. + +For N>6, fire a `D.0` meta-AskUserQuestion first (proceed / narrow / batch). + +question_ids for split chains: `-split-` (kebab-case ASCII, +≤64 chars, `-2`/`-3` suffix on collision). The runtime checker +(`bin/gstack-question-preference`) refuses `never-ask` on any `*-split-*` id, +so split chains are never AUTO_DECIDE-eligible — the user's option set is sacred. + +**Full rule + worked examples + Hold/dependency semantics:** see +`docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. + +**Non-ASCII characters — write directly, never \u-escape.** When any string field (question, option label, option description) contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit the literal UTF-8 characters in the JSON string. **Never escape them @@ -369,6 +398,9 @@ Before calling AskUserQuestion, verify: - [ ] Net line closes the decision - [ ] You are calling the tool, not writing prose - [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped +- [ ] If you had 5+ options, you split (or batched into ≤4-groups) — did NOT drop any +- [ ] If you split, you checked dependencies between options before firing the chain +- [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue) ## Artifacts Sync (skill start) diff --git a/qa/SKILL.md b/qa/SKILL.md index 1d56de7286..6779c47cfc 100644 --- a/qa/SKILL.md +++ b/qa/SKILL.md @@ -342,7 +342,36 @@ Effort both-scales: when an option involves effort, label both human-team and CC Net line closes the tradeoff. Per-skill instructions may add stricter rules. -12. **Non-ASCII characters — write directly, never \u-escape.** When any +### Handling 5+ options — split, never drop + +AskUserQuestion caps every call at **4 options**. With 5+ real options, NEVER +drop, merge, or silently defer one to fit. Pick a compliant shape: + +- **Batch into ≤4-groups** — for coherent alternatives (e.g. version bumps, + layout variants). One call, 5th surfaced only if first 4 don't fit. +- **Split per-option** — for independent scope items (e.g. "ship E1..E6?"). + Fire N sequential calls, one per option. Default to this when unsure. + +Per-option call shape: `D.k` header (e.g. D3.1..D3.5), ELI10 per option, +Recommendation, kind-note (no completeness score — Include/Defer/Cut/Hold are +decision actions), and 4 buckets: +**A) Include**, **B) Defer**, **C) Cut**, **D) Hold** (stop chain, discuss). + +After the chain, fire `D.final` to validate the assembled set (reprompt +dependency conflicts) and confirm shipping it. Use `D.revise-` to +revise one option without re-running the chain. + +For N>6, fire a `D.0` meta-AskUserQuestion first (proceed / narrow / batch). + +question_ids for split chains: `-split-` (kebab-case ASCII, +≤64 chars, `-2`/`-3` suffix on collision). The runtime checker +(`bin/gstack-question-preference`) refuses `never-ask` on any `*-split-*` id, +so split chains are never AUTO_DECIDE-eligible — the user's option set is sacred. + +**Full rule + worked examples + Hold/dependency semantics:** see +`docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. + +**Non-ASCII characters — write directly, never \u-escape.** When any string field (question, option label, option description) contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit the literal UTF-8 characters in the JSON string. **Never escape them @@ -375,6 +404,9 @@ Before calling AskUserQuestion, verify: - [ ] Net line closes the decision - [ ] You are calling the tool, not writing prose - [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped +- [ ] If you had 5+ options, you split (or batched into ≤4-groups) — did NOT drop any +- [ ] If you split, you checked dependencies between options before firing the chain +- [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue) ## Artifacts Sync (skill start) diff --git a/retro/SKILL.md b/retro/SKILL.md index c451cc6159..ddbee15515 100644 --- a/retro/SKILL.md +++ b/retro/SKILL.md @@ -353,7 +353,36 @@ Effort both-scales: when an option involves effort, label both human-team and CC Net line closes the tradeoff. Per-skill instructions may add stricter rules. -12. **Non-ASCII characters — write directly, never \u-escape.** When any +### Handling 5+ options — split, never drop + +AskUserQuestion caps every call at **4 options**. With 5+ real options, NEVER +drop, merge, or silently defer one to fit. Pick a compliant shape: + +- **Batch into ≤4-groups** — for coherent alternatives (e.g. version bumps, + layout variants). One call, 5th surfaced only if first 4 don't fit. +- **Split per-option** — for independent scope items (e.g. "ship E1..E6?"). + Fire N sequential calls, one per option. Default to this when unsure. + +Per-option call shape: `D.k` header (e.g. D3.1..D3.5), ELI10 per option, +Recommendation, kind-note (no completeness score — Include/Defer/Cut/Hold are +decision actions), and 4 buckets: +**A) Include**, **B) Defer**, **C) Cut**, **D) Hold** (stop chain, discuss). + +After the chain, fire `D.final` to validate the assembled set (reprompt +dependency conflicts) and confirm shipping it. Use `D.revise-` to +revise one option without re-running the chain. + +For N>6, fire a `D.0` meta-AskUserQuestion first (proceed / narrow / batch). + +question_ids for split chains: `-split-` (kebab-case ASCII, +≤64 chars, `-2`/`-3` suffix on collision). The runtime checker +(`bin/gstack-question-preference`) refuses `never-ask` on any `*-split-*` id, +so split chains are never AUTO_DECIDE-eligible — the user's option set is sacred. + +**Full rule + worked examples + Hold/dependency semantics:** see +`docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. + +**Non-ASCII characters — write directly, never \u-escape.** When any string field (question, option label, option description) contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit the literal UTF-8 characters in the JSON string. **Never escape them @@ -386,6 +415,9 @@ Before calling AskUserQuestion, verify: - [ ] Net line closes the decision - [ ] You are calling the tool, not writing prose - [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped +- [ ] If you had 5+ options, you split (or batched into ≤4-groups) — did NOT drop any +- [ ] If you split, you checked dependencies between options before firing the chain +- [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue) ## Artifacts Sync (skill start) diff --git a/review/SKILL.md b/review/SKILL.md index 38782efc8b..dd6914a88c 100644 --- a/review/SKILL.md +++ b/review/SKILL.md @@ -338,7 +338,36 @@ Effort both-scales: when an option involves effort, label both human-team and CC Net line closes the tradeoff. Per-skill instructions may add stricter rules. -12. **Non-ASCII characters — write directly, never \u-escape.** When any +### Handling 5+ options — split, never drop + +AskUserQuestion caps every call at **4 options**. With 5+ real options, NEVER +drop, merge, or silently defer one to fit. Pick a compliant shape: + +- **Batch into ≤4-groups** — for coherent alternatives (e.g. version bumps, + layout variants). One call, 5th surfaced only if first 4 don't fit. +- **Split per-option** — for independent scope items (e.g. "ship E1..E6?"). + Fire N sequential calls, one per option. Default to this when unsure. + +Per-option call shape: `D.k` header (e.g. D3.1..D3.5), ELI10 per option, +Recommendation, kind-note (no completeness score — Include/Defer/Cut/Hold are +decision actions), and 4 buckets: +**A) Include**, **B) Defer**, **C) Cut**, **D) Hold** (stop chain, discuss). + +After the chain, fire `D.final` to validate the assembled set (reprompt +dependency conflicts) and confirm shipping it. Use `D.revise-` to +revise one option without re-running the chain. + +For N>6, fire a `D.0` meta-AskUserQuestion first (proceed / narrow / batch). + +question_ids for split chains: `-split-` (kebab-case ASCII, +≤64 chars, `-2`/`-3` suffix on collision). The runtime checker +(`bin/gstack-question-preference`) refuses `never-ask` on any `*-split-*` id, +so split chains are never AUTO_DECIDE-eligible — the user's option set is sacred. + +**Full rule + worked examples + Hold/dependency semantics:** see +`docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. + +**Non-ASCII characters — write directly, never \u-escape.** When any string field (question, option label, option description) contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit the literal UTF-8 characters in the JSON string. **Never escape them @@ -371,6 +400,9 @@ Before calling AskUserQuestion, verify: - [ ] Net line closes the decision - [ ] You are calling the tool, not writing prose - [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped +- [ ] If you had 5+ options, you split (or batched into ≤4-groups) — did NOT drop any +- [ ] If you split, you checked dependencies between options before firing the chain +- [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue) ## Artifacts Sync (skill start) diff --git a/scrape/SKILL.md b/scrape/SKILL.md index 461dc32f46..dccdd0db73 100644 --- a/scrape/SKILL.md +++ b/scrape/SKILL.md @@ -334,7 +334,36 @@ Effort both-scales: when an option involves effort, label both human-team and CC Net line closes the tradeoff. Per-skill instructions may add stricter rules. -12. **Non-ASCII characters — write directly, never \u-escape.** When any +### Handling 5+ options — split, never drop + +AskUserQuestion caps every call at **4 options**. With 5+ real options, NEVER +drop, merge, or silently defer one to fit. Pick a compliant shape: + +- **Batch into ≤4-groups** — for coherent alternatives (e.g. version bumps, + layout variants). One call, 5th surfaced only if first 4 don't fit. +- **Split per-option** — for independent scope items (e.g. "ship E1..E6?"). + Fire N sequential calls, one per option. Default to this when unsure. + +Per-option call shape: `D.k` header (e.g. D3.1..D3.5), ELI10 per option, +Recommendation, kind-note (no completeness score — Include/Defer/Cut/Hold are +decision actions), and 4 buckets: +**A) Include**, **B) Defer**, **C) Cut**, **D) Hold** (stop chain, discuss). + +After the chain, fire `D.final` to validate the assembled set (reprompt +dependency conflicts) and confirm shipping it. Use `D.revise-` to +revise one option without re-running the chain. + +For N>6, fire a `D.0` meta-AskUserQuestion first (proceed / narrow / batch). + +question_ids for split chains: `-split-` (kebab-case ASCII, +≤64 chars, `-2`/`-3` suffix on collision). The runtime checker +(`bin/gstack-question-preference`) refuses `never-ask` on any `*-split-*` id, +so split chains are never AUTO_DECIDE-eligible — the user's option set is sacred. + +**Full rule + worked examples + Hold/dependency semantics:** see +`docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. + +**Non-ASCII characters — write directly, never \u-escape.** When any string field (question, option label, option description) contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit the literal UTF-8 characters in the JSON string. **Never escape them @@ -367,6 +396,9 @@ Before calling AskUserQuestion, verify: - [ ] Net line closes the decision - [ ] You are calling the tool, not writing prose - [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped +- [ ] If you had 5+ options, you split (or batched into ≤4-groups) — did NOT drop any +- [ ] If you split, you checked dependencies between options before firing the chain +- [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue) ## Artifacts Sync (skill start) diff --git a/scripts/resolvers/preamble/generate-ask-user-format.ts b/scripts/resolvers/preamble/generate-ask-user-format.ts index 5a7d174dbc..e71b39e41b 100644 --- a/scripts/resolvers/preamble/generate-ask-user-format.ts +++ b/scripts/resolvers/preamble/generate-ask-user-format.ts @@ -46,7 +46,36 @@ Effort both-scales: when an option involves effort, label both human-team and CC Net line closes the tradeoff. Per-skill instructions may add stricter rules. -12. **Non-ASCII characters — write directly, never \\u-escape.** When any +### Handling 5+ options — split, never drop + +AskUserQuestion caps every call at **4 options**. With 5+ real options, NEVER +drop, merge, or silently defer one to fit. Pick a compliant shape: + +- **Batch into ≤4-groups** — for coherent alternatives (e.g. version bumps, + layout variants). One call, 5th surfaced only if first 4 don't fit. +- **Split per-option** — for independent scope items (e.g. "ship E1..E6?"). + Fire N sequential calls, one per option. Default to this when unsure. + +Per-option call shape: \`D.k\` header (e.g. D3.1..D3.5), ELI10 per option, +Recommendation, kind-note (no completeness score — Include/Defer/Cut/Hold are +decision actions), and 4 buckets: +**A) Include**, **B) Defer**, **C) Cut**, **D) Hold** (stop chain, discuss). + +After the chain, fire \`D.final\` to validate the assembled set (reprompt +dependency conflicts) and confirm shipping it. Use \`D.revise-\` to +revise one option without re-running the chain. + +For N>6, fire a \`D.0\` meta-AskUserQuestion first (proceed / narrow / batch). + +question_ids for split chains: \`-split-\` (kebab-case ASCII, +≤64 chars, \`-2\`/\`-3\` suffix on collision). The runtime checker +(\`bin/gstack-question-preference\`) refuses \`never-ask\` on any \`*-split-*\` id, +so split chains are never AUTO_DECIDE-eligible — the user's option set is sacred. + +**Full rule + worked examples + Hold/dependency semantics:** see +\`docs/askuserquestion-split.md\` in the gstack repo. Read on demand when N>4. + +**Non-ASCII characters — write directly, never \\u-escape.** When any string field (question, option label, option description) contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit the literal UTF-8 characters in the JSON string. **Never escape them @@ -79,5 +108,8 @@ Before calling AskUserQuestion, verify: - [ ] Net line closes the decision - [ ] You are calling the tool, not writing prose - [ ] Non-ASCII characters (CJK / accents) written directly, NOT \\u-escaped +- [ ] If you had 5+ options, you split (or batched into ≤4-groups) — did NOT drop any +- [ ] If you split, you checked dependencies between options before firing the chain +- [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue) `; } diff --git a/setup-deploy/SKILL.md b/setup-deploy/SKILL.md index 0e962404a1..3e69b015d0 100644 --- a/setup-deploy/SKILL.md +++ b/setup-deploy/SKILL.md @@ -337,7 +337,36 @@ Effort both-scales: when an option involves effort, label both human-team and CC Net line closes the tradeoff. Per-skill instructions may add stricter rules. -12. **Non-ASCII characters — write directly, never \u-escape.** When any +### Handling 5+ options — split, never drop + +AskUserQuestion caps every call at **4 options**. With 5+ real options, NEVER +drop, merge, or silently defer one to fit. Pick a compliant shape: + +- **Batch into ≤4-groups** — for coherent alternatives (e.g. version bumps, + layout variants). One call, 5th surfaced only if first 4 don't fit. +- **Split per-option** — for independent scope items (e.g. "ship E1..E6?"). + Fire N sequential calls, one per option. Default to this when unsure. + +Per-option call shape: `D.k` header (e.g. D3.1..D3.5), ELI10 per option, +Recommendation, kind-note (no completeness score — Include/Defer/Cut/Hold are +decision actions), and 4 buckets: +**A) Include**, **B) Defer**, **C) Cut**, **D) Hold** (stop chain, discuss). + +After the chain, fire `D.final` to validate the assembled set (reprompt +dependency conflicts) and confirm shipping it. Use `D.revise-` to +revise one option without re-running the chain. + +For N>6, fire a `D.0` meta-AskUserQuestion first (proceed / narrow / batch). + +question_ids for split chains: `-split-` (kebab-case ASCII, +≤64 chars, `-2`/`-3` suffix on collision). The runtime checker +(`bin/gstack-question-preference`) refuses `never-ask` on any `*-split-*` id, +so split chains are never AUTO_DECIDE-eligible — the user's option set is sacred. + +**Full rule + worked examples + Hold/dependency semantics:** see +`docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. + +**Non-ASCII characters — write directly, never \u-escape.** When any string field (question, option label, option description) contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit the literal UTF-8 characters in the JSON string. **Never escape them @@ -370,6 +399,9 @@ Before calling AskUserQuestion, verify: - [ ] Net line closes the decision - [ ] You are calling the tool, not writing prose - [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped +- [ ] If you had 5+ options, you split (or batched into ≤4-groups) — did NOT drop any +- [ ] If you split, you checked dependencies between options before firing the chain +- [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue) ## Artifacts Sync (skill start) diff --git a/setup-gbrain/SKILL.md b/setup-gbrain/SKILL.md index 408d5380b7..12d8e2ce13 100644 --- a/setup-gbrain/SKILL.md +++ b/setup-gbrain/SKILL.md @@ -336,7 +336,36 @@ Effort both-scales: when an option involves effort, label both human-team and CC Net line closes the tradeoff. Per-skill instructions may add stricter rules. -12. **Non-ASCII characters — write directly, never \u-escape.** When any +### Handling 5+ options — split, never drop + +AskUserQuestion caps every call at **4 options**. With 5+ real options, NEVER +drop, merge, or silently defer one to fit. Pick a compliant shape: + +- **Batch into ≤4-groups** — for coherent alternatives (e.g. version bumps, + layout variants). One call, 5th surfaced only if first 4 don't fit. +- **Split per-option** — for independent scope items (e.g. "ship E1..E6?"). + Fire N sequential calls, one per option. Default to this when unsure. + +Per-option call shape: `D.k` header (e.g. D3.1..D3.5), ELI10 per option, +Recommendation, kind-note (no completeness score — Include/Defer/Cut/Hold are +decision actions), and 4 buckets: +**A) Include**, **B) Defer**, **C) Cut**, **D) Hold** (stop chain, discuss). + +After the chain, fire `D.final` to validate the assembled set (reprompt +dependency conflicts) and confirm shipping it. Use `D.revise-` to +revise one option without re-running the chain. + +For N>6, fire a `D.0` meta-AskUserQuestion first (proceed / narrow / batch). + +question_ids for split chains: `-split-` (kebab-case ASCII, +≤64 chars, `-2`/`-3` suffix on collision). The runtime checker +(`bin/gstack-question-preference`) refuses `never-ask` on any `*-split-*` id, +so split chains are never AUTO_DECIDE-eligible — the user's option set is sacred. + +**Full rule + worked examples + Hold/dependency semantics:** see +`docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. + +**Non-ASCII characters — write directly, never \u-escape.** When any string field (question, option label, option description) contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit the literal UTF-8 characters in the JSON string. **Never escape them @@ -369,6 +398,9 @@ Before calling AskUserQuestion, verify: - [ ] Net line closes the decision - [ ] You are calling the tool, not writing prose - [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped +- [ ] If you had 5+ options, you split (or batched into ≤4-groups) — did NOT drop any +- [ ] If you split, you checked dependencies between options before firing the chain +- [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue) ## Artifacts Sync (skill start) diff --git a/ship/SKILL.md b/ship/SKILL.md index 0658db8dd4..9611072f74 100644 --- a/ship/SKILL.md +++ b/ship/SKILL.md @@ -338,7 +338,36 @@ Effort both-scales: when an option involves effort, label both human-team and CC Net line closes the tradeoff. Per-skill instructions may add stricter rules. -12. **Non-ASCII characters — write directly, never \u-escape.** When any +### Handling 5+ options — split, never drop + +AskUserQuestion caps every call at **4 options**. With 5+ real options, NEVER +drop, merge, or silently defer one to fit. Pick a compliant shape: + +- **Batch into ≤4-groups** — for coherent alternatives (e.g. version bumps, + layout variants). One call, 5th surfaced only if first 4 don't fit. +- **Split per-option** — for independent scope items (e.g. "ship E1..E6?"). + Fire N sequential calls, one per option. Default to this when unsure. + +Per-option call shape: `D.k` header (e.g. D3.1..D3.5), ELI10 per option, +Recommendation, kind-note (no completeness score — Include/Defer/Cut/Hold are +decision actions), and 4 buckets: +**A) Include**, **B) Defer**, **C) Cut**, **D) Hold** (stop chain, discuss). + +After the chain, fire `D.final` to validate the assembled set (reprompt +dependency conflicts) and confirm shipping it. Use `D.revise-` to +revise one option without re-running the chain. + +For N>6, fire a `D.0` meta-AskUserQuestion first (proceed / narrow / batch). + +question_ids for split chains: `-split-` (kebab-case ASCII, +≤64 chars, `-2`/`-3` suffix on collision). The runtime checker +(`bin/gstack-question-preference`) refuses `never-ask` on any `*-split-*` id, +so split chains are never AUTO_DECIDE-eligible — the user's option set is sacred. + +**Full rule + worked examples + Hold/dependency semantics:** see +`docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. + +**Non-ASCII characters — write directly, never \u-escape.** When any string field (question, option label, option description) contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit the literal UTF-8 characters in the JSON string. **Never escape them @@ -371,6 +400,9 @@ Before calling AskUserQuestion, verify: - [ ] Net line closes the decision - [ ] You are calling the tool, not writing prose - [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped +- [ ] If you had 5+ options, you split (or batched into ≤4-groups) — did NOT drop any +- [ ] If you split, you checked dependencies between options before firing the chain +- [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue) ## Artifacts Sync (skill start) diff --git a/skillify/SKILL.md b/skillify/SKILL.md index e56a314b86..8b81f1ce8d 100644 --- a/skillify/SKILL.md +++ b/skillify/SKILL.md @@ -334,7 +334,36 @@ Effort both-scales: when an option involves effort, label both human-team and CC Net line closes the tradeoff. Per-skill instructions may add stricter rules. -12. **Non-ASCII characters — write directly, never \u-escape.** When any +### Handling 5+ options — split, never drop + +AskUserQuestion caps every call at **4 options**. With 5+ real options, NEVER +drop, merge, or silently defer one to fit. Pick a compliant shape: + +- **Batch into ≤4-groups** — for coherent alternatives (e.g. version bumps, + layout variants). One call, 5th surfaced only if first 4 don't fit. +- **Split per-option** — for independent scope items (e.g. "ship E1..E6?"). + Fire N sequential calls, one per option. Default to this when unsure. + +Per-option call shape: `D.k` header (e.g. D3.1..D3.5), ELI10 per option, +Recommendation, kind-note (no completeness score — Include/Defer/Cut/Hold are +decision actions), and 4 buckets: +**A) Include**, **B) Defer**, **C) Cut**, **D) Hold** (stop chain, discuss). + +After the chain, fire `D.final` to validate the assembled set (reprompt +dependency conflicts) and confirm shipping it. Use `D.revise-` to +revise one option without re-running the chain. + +For N>6, fire a `D.0` meta-AskUserQuestion first (proceed / narrow / batch). + +question_ids for split chains: `-split-` (kebab-case ASCII, +≤64 chars, `-2`/`-3` suffix on collision). The runtime checker +(`bin/gstack-question-preference`) refuses `never-ask` on any `*-split-*` id, +so split chains are never AUTO_DECIDE-eligible — the user's option set is sacred. + +**Full rule + worked examples + Hold/dependency semantics:** see +`docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. + +**Non-ASCII characters — write directly, never \u-escape.** When any string field (question, option label, option description) contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit the literal UTF-8 characters in the JSON string. **Never escape them @@ -367,6 +396,9 @@ Before calling AskUserQuestion, verify: - [ ] Net line closes the decision - [ ] You are calling the tool, not writing prose - [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped +- [ ] If you had 5+ options, you split (or batched into ≤4-groups) — did NOT drop any +- [ ] If you split, you checked dependencies between options before firing the chain +- [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue) ## Artifacts Sync (skill start) diff --git a/spec/SKILL.md b/spec/SKILL.md index a125a8792b..3e7187d180 100644 --- a/spec/SKILL.md +++ b/spec/SKILL.md @@ -335,7 +335,36 @@ Effort both-scales: when an option involves effort, label both human-team and CC Net line closes the tradeoff. Per-skill instructions may add stricter rules. -12. **Non-ASCII characters — write directly, never \u-escape.** When any +### Handling 5+ options — split, never drop + +AskUserQuestion caps every call at **4 options**. With 5+ real options, NEVER +drop, merge, or silently defer one to fit. Pick a compliant shape: + +- **Batch into ≤4-groups** — for coherent alternatives (e.g. version bumps, + layout variants). One call, 5th surfaced only if first 4 don't fit. +- **Split per-option** — for independent scope items (e.g. "ship E1..E6?"). + Fire N sequential calls, one per option. Default to this when unsure. + +Per-option call shape: `D.k` header (e.g. D3.1..D3.5), ELI10 per option, +Recommendation, kind-note (no completeness score — Include/Defer/Cut/Hold are +decision actions), and 4 buckets: +**A) Include**, **B) Defer**, **C) Cut**, **D) Hold** (stop chain, discuss). + +After the chain, fire `D.final` to validate the assembled set (reprompt +dependency conflicts) and confirm shipping it. Use `D.revise-` to +revise one option without re-running the chain. + +For N>6, fire a `D.0` meta-AskUserQuestion first (proceed / narrow / batch). + +question_ids for split chains: `-split-` (kebab-case ASCII, +≤64 chars, `-2`/`-3` suffix on collision). The runtime checker +(`bin/gstack-question-preference`) refuses `never-ask` on any `*-split-*` id, +so split chains are never AUTO_DECIDE-eligible — the user's option set is sacred. + +**Full rule + worked examples + Hold/dependency semantics:** see +`docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. + +**Non-ASCII characters — write directly, never \u-escape.** When any string field (question, option label, option description) contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit the literal UTF-8 characters in the JSON string. **Never escape them @@ -368,6 +397,9 @@ Before calling AskUserQuestion, verify: - [ ] Net line closes the decision - [ ] You are calling the tool, not writing prose - [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped +- [ ] If you had 5+ options, you split (or batched into ≤4-groups) — did NOT drop any +- [ ] If you split, you checked dependencies between options before firing the chain +- [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue) ## Artifacts Sync (skill start) @@ -1242,7 +1274,36 @@ Effort both-scales: when an option involves effort, label both human-team and CC Net line closes the tradeoff. Per-skill instructions may add stricter rules. -12. **Non-ASCII characters — write directly, never \u-escape.** When any +### Handling 5+ options — split, never drop + +AskUserQuestion caps every call at **4 options**. With 5+ real options, NEVER +drop, merge, or silently defer one to fit. Pick a compliant shape: + +- **Batch into ≤4-groups** — for coherent alternatives (e.g. version bumps, + layout variants). One call, 5th surfaced only if first 4 don't fit. +- **Split per-option** — for independent scope items (e.g. "ship E1..E6?"). + Fire N sequential calls, one per option. Default to this when unsure. + +Per-option call shape: `D.k` header (e.g. D3.1..D3.5), ELI10 per option, +Recommendation, kind-note (no completeness score — Include/Defer/Cut/Hold are +decision actions), and 4 buckets: +**A) Include**, **B) Defer**, **C) Cut**, **D) Hold** (stop chain, discuss). + +After the chain, fire `D.final` to validate the assembled set (reprompt +dependency conflicts) and confirm shipping it. Use `D.revise-` to +revise one option without re-running the chain. + +For N>6, fire a `D.0` meta-AskUserQuestion first (proceed / narrow / batch). + +question_ids for split chains: `-split-` (kebab-case ASCII, +≤64 chars, `-2`/`-3` suffix on collision). The runtime checker +(`bin/gstack-question-preference`) refuses `never-ask` on any `*-split-*` id, +so split chains are never AUTO_DECIDE-eligible — the user's option set is sacred. + +**Full rule + worked examples + Hold/dependency semantics:** see +`docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. + +**Non-ASCII characters — write directly, never \u-escape.** When any string field (question, option label, option description) contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit the literal UTF-8 characters in the JSON string. **Never escape them @@ -1275,6 +1336,9 @@ Before calling AskUserQuestion, verify: - [ ] Net line closes the decision - [ ] You are calling the tool, not writing prose - [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped +- [ ] If you had 5+ options, you split (or batched into ≤4-groups) — did NOT drop any +- [ ] If you split, you checked dependencies between options before firing the chain +- [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue) ## Artifacts Sync (skill start) diff --git a/sync-gbrain/SKILL.md b/sync-gbrain/SKILL.md index 10877b61ec..96ac9057aa 100644 --- a/sync-gbrain/SKILL.md +++ b/sync-gbrain/SKILL.md @@ -336,7 +336,36 @@ Effort both-scales: when an option involves effort, label both human-team and CC Net line closes the tradeoff. Per-skill instructions may add stricter rules. -12. **Non-ASCII characters — write directly, never \u-escape.** When any +### Handling 5+ options — split, never drop + +AskUserQuestion caps every call at **4 options**. With 5+ real options, NEVER +drop, merge, or silently defer one to fit. Pick a compliant shape: + +- **Batch into ≤4-groups** — for coherent alternatives (e.g. version bumps, + layout variants). One call, 5th surfaced only if first 4 don't fit. +- **Split per-option** — for independent scope items (e.g. "ship E1..E6?"). + Fire N sequential calls, one per option. Default to this when unsure. + +Per-option call shape: `D.k` header (e.g. D3.1..D3.5), ELI10 per option, +Recommendation, kind-note (no completeness score — Include/Defer/Cut/Hold are +decision actions), and 4 buckets: +**A) Include**, **B) Defer**, **C) Cut**, **D) Hold** (stop chain, discuss). + +After the chain, fire `D.final` to validate the assembled set (reprompt +dependency conflicts) and confirm shipping it. Use `D.revise-` to +revise one option without re-running the chain. + +For N>6, fire a `D.0` meta-AskUserQuestion first (proceed / narrow / batch). + +question_ids for split chains: `-split-` (kebab-case ASCII, +≤64 chars, `-2`/`-3` suffix on collision). The runtime checker +(`bin/gstack-question-preference`) refuses `never-ask` on any `*-split-*` id, +so split chains are never AUTO_DECIDE-eligible — the user's option set is sacred. + +**Full rule + worked examples + Hold/dependency semantics:** see +`docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. + +**Non-ASCII characters — write directly, never \u-escape.** When any string field (question, option label, option description) contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit the literal UTF-8 characters in the JSON string. **Never escape them @@ -369,6 +398,9 @@ Before calling AskUserQuestion, verify: - [ ] Net line closes the decision - [ ] You are calling the tool, not writing prose - [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped +- [ ] If you had 5+ options, you split (or batched into ≤4-groups) — did NOT drop any +- [ ] If you split, you checked dependencies between options before firing the chain +- [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue) ## Artifacts Sync (skill start) diff --git a/test/fixtures/forcing-finding-seeds.ts b/test/fixtures/forcing-finding-seeds.ts index 8d52c858d7..14a41f72a0 100644 --- a/test/fixtures/forcing-finding-seeds.ts +++ b/test/fixtures/forcing-finding-seeds.ts @@ -120,3 +120,59 @@ export const FORCING_BATCHING_ENG = [ 'iterate the payload to recompute the dependency graph. Could cache the', 'graph on the first attempt; not planned.', ].join('\n'); + +/** + * Split-overflow regression seed (periodic tier). + * + * Catches the original failure mode the user complained about: when the + * agent has 5+ options for ONE conceptual decision, it must split into N + * sequential AskUserQuestion calls (or batch into compatible ≤4-groups), + * NOT drop an option arbitrarily to fit Conductor's 4-option cap. + * + * Fixture shape: 5 independent platform-integration candidates for ONE + * scope decision. Each is independent (no dependencies between them) so + * the natural compliant shape is a per-option split chain at parent D. + * + * Used by test/skill-e2e-plan-ceo-split-overflow.test.ts to assert the + * agent fires >= 4 review-phase AUQs (floor uses the standard [N-1] + * tolerance band, accounting for one expected scope-reduction-or-merge + * call before the per-option chain begins). + * + * Pre-fix behavior: agent fires 1 AUQ with 4 options, "trims" the 5th + * via prose ("E5 is the largest lift and a natural follow-up; moving to + * TODOs without asking"). That's the bug. Floor of 4 detects it. + */ +export const FORCING_SPLIT_OVERFLOW_CEO = [ + 'Please review this plan and help me decide scope. Write your plan-mode plan to /tmp/gstack-test-plan-ceo-split-overflow.md (use Edit/Write to that exact path).', + '', + '# Plan: Pick which chat-platform integrations to ship this quarter', + '', + 'We have engineering bandwidth for at most 2-3 integrations this quarter.', + 'I need your help deciding which to prioritize. Below are 5 candidates,', + 'each fully independent of the others (no shared infrastructure, no', + 'dependencies between them). For each, the user can independently decide:', + 'include in this scope, defer to next quarter, or cut entirely.', + '', + '## E1) Slack — DM bot for incident alerts', + 'Build cost: ~2 weeks. Existing Slack auth flow we can reuse. High user', + 'demand (top customer request in Q2 survey, ~40% of asks).', + '', + '## E2) Discord — guild bot for community channels', + 'Build cost: ~3 weeks. Greenfield integration, no existing auth. Medium', + 'demand (~15% of asks, but loud community).', + '', + '## E3) Microsoft Teams — webhook + bot framework', + 'Build cost: ~4 weeks. Enterprise customers specifically asked for this.', + 'Highest revenue impact per user but smallest user count (~5% of asks).', + '', + '## E4) Telegram — bot API integration', + 'Build cost: ~1 week. Simplest API surface. Low strategic value but', + 'cheap win (~8% of asks, mostly from international users).', + '', + '## E5) Mattermost — REST plugin', + 'Build cost: ~2 weeks. Self-hosted enterprise users. Niche but locked-in', + 'segment (~3% of asks but all from high-ARR accounts).', + '', + 'Please walk me through each candidate and help me decide include/defer/cut', + 'per option. I want individual decisions per candidate, not a bundled pick.', +].join('\n'); diff --git a/test/fixtures/golden/claude-ship-SKILL.md b/test/fixtures/golden/claude-ship-SKILL.md index 0907989142..9611072f74 100644 --- a/test/fixtures/golden/claude-ship-SKILL.md +++ b/test/fixtures/golden/claude-ship-SKILL.md @@ -107,6 +107,19 @@ _CHECKPOINT_MODE=$(~/.claude/skills/gstack/bin/gstack-config get checkpoint_mode _CHECKPOINT_PUSH=$(~/.claude/skills/gstack/bin/gstack-config get checkpoint_push 2>/dev/null || echo "false") echo "CHECKPOINT_MODE: $_CHECKPOINT_MODE" echo "CHECKPOINT_PUSH: $_CHECKPOINT_PUSH" +# Plan-mode hint for skills like /spec that branch behavior on plan-mode state. +# Claude Code exposes plan mode via system reminders; we detect best-effort +# from CLAUDE_PLAN_FILE (set by the harness when plan mode is active) and +# fall back to "inactive". Codex hosts and Claude execution mode both end up +# inactive, which is the safe default (defaults to file+execute pipeline). +if [ -n "${CLAUDE_PLAN_FILE:-}${GSTACK_PLAN_MODE_FORCE:-}" ]; then + export GSTACK_PLAN_MODE="active" +elif [ "${GSTACK_PLAN_MODE:-}" = "active" ]; then + export GSTACK_PLAN_MODE="active" +else + export GSTACK_PLAN_MODE="inactive" +fi +echo "GSTACK_PLAN_MODE: $GSTACK_PLAN_MODE" [ -n "$OPENCLAW_SESSION" ] && echo "SPAWNED_SESSION: true" || true ``` @@ -238,6 +251,7 @@ Key routing rules: - Ship/deploy/PR → invoke /ship or /land-and-deploy - Save progress → invoke /context-save - Resume context → invoke /context-restore +- Author a backlog-ready spec/issue → invoke /spec ``` Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"` @@ -324,7 +338,36 @@ Effort both-scales: when an option involves effort, label both human-team and CC Net line closes the tradeoff. Per-skill instructions may add stricter rules. -12. **Non-ASCII characters — write directly, never \u-escape.** When any +### Handling 5+ options — split, never drop + +AskUserQuestion caps every call at **4 options**. With 5+ real options, NEVER +drop, merge, or silently defer one to fit. Pick a compliant shape: + +- **Batch into ≤4-groups** — for coherent alternatives (e.g. version bumps, + layout variants). One call, 5th surfaced only if first 4 don't fit. +- **Split per-option** — for independent scope items (e.g. "ship E1..E6?"). + Fire N sequential calls, one per option. Default to this when unsure. + +Per-option call shape: `D.k` header (e.g. D3.1..D3.5), ELI10 per option, +Recommendation, kind-note (no completeness score — Include/Defer/Cut/Hold are +decision actions), and 4 buckets: +**A) Include**, **B) Defer**, **C) Cut**, **D) Hold** (stop chain, discuss). + +After the chain, fire `D.final` to validate the assembled set (reprompt +dependency conflicts) and confirm shipping it. Use `D.revise-` to +revise one option without re-running the chain. + +For N>6, fire a `D.0` meta-AskUserQuestion first (proceed / narrow / batch). + +question_ids for split chains: `-split-` (kebab-case ASCII, +≤64 chars, `-2`/`-3` suffix on collision). The runtime checker +(`bin/gstack-question-preference`) refuses `never-ask` on any `*-split-*` id, +so split chains are never AUTO_DECIDE-eligible — the user's option set is sacred. + +**Full rule + worked examples + Hold/dependency semantics:** see +`docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. + +**Non-ASCII characters — write directly, never \u-escape.** When any string field (question, option label, option description) contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit the literal UTF-8 characters in the JSON string. **Never escape them @@ -357,6 +400,9 @@ Before calling AskUserQuestion, verify: - [ ] Net line closes the decision - [ ] You are calling the tool, not writing prose - [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped +- [ ] If you had 5+ options, you split (or batched into ≤4-groups) — did NOT drop any +- [ ] If you split, you checked dependencies between options before firing the chain +- [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue) ## Artifacts Sync (skill start) @@ -2926,6 +2972,39 @@ you missed it.> +## Linked Spec +-$$, the spawned worktree IS where /ship runs). + SPEC_FILE=$(grep -l "^spec_branch: $CURRENT_BRANCH$" "$SPEC_ARCHIVES"/*.md 2>/dev/null | head -1) + [ -z "$SPEC_FILE" ] && exit # no spec; omit this section entirely + SPEC_ISSUE=$(grep "^spec_issue_number:" "$SPEC_FILE" | cut -d' ' -f2) + [ -z "$SPEC_ISSUE" ] && exit # spec archive exists but no issue number; omit + + # CONDITIONAL Closes #N (codex F4): only add when Plan Completion above is "complete". + # If the plan completion gate from Step 8 reports any deferred or failed items, emit: + # "Linked to #$SPEC_ISSUE (partial delivery — NOT auto-closing; close manually after follow-up)" + # If Plan Completion is fully complete, emit: + # "Closes #$SPEC_ISSUE" + # and include the Closes #N line in the PR body so GitHub auto-closes on merge.> + + + + This PR delivers the spec at . + Spec filed: > + + (partial delivery — not auto-closing). + Deferred items: . + Close # manually after follow-up lands.> + + + ## Verification Results diff --git a/test/fixtures/golden/codex-ship-SKILL.md b/test/fixtures/golden/codex-ship-SKILL.md index 5610f747d5..8eaaee3696 100644 --- a/test/fixtures/golden/codex-ship-SKILL.md +++ b/test/fixtures/golden/codex-ship-SKILL.md @@ -93,6 +93,19 @@ _CHECKPOINT_MODE=$($GSTACK_BIN/gstack-config get checkpoint_mode 2>/dev/null || _CHECKPOINT_PUSH=$($GSTACK_BIN/gstack-config get checkpoint_push 2>/dev/null || echo "false") echo "CHECKPOINT_MODE: $_CHECKPOINT_MODE" echo "CHECKPOINT_PUSH: $_CHECKPOINT_PUSH" +# Plan-mode hint for skills like /spec that branch behavior on plan-mode state. +# Claude Code exposes plan mode via system reminders; we detect best-effort +# from CLAUDE_PLAN_FILE (set by the harness when plan mode is active) and +# fall back to "inactive". Codex hosts and Claude execution mode both end up +# inactive, which is the safe default (defaults to file+execute pipeline). +if [ -n "${CLAUDE_PLAN_FILE:-}${GSTACK_PLAN_MODE_FORCE:-}" ]; then + export GSTACK_PLAN_MODE="active" +elif [ "${GSTACK_PLAN_MODE:-}" = "active" ]; then + export GSTACK_PLAN_MODE="active" +else + export GSTACK_PLAN_MODE="inactive" +fi +echo "GSTACK_PLAN_MODE: $GSTACK_PLAN_MODE" [ -n "$OPENCLAW_SESSION" ] && echo "SPAWNED_SESSION: true" || true ``` @@ -224,6 +237,7 @@ Key routing rules: - Ship/deploy/PR → invoke /ship or /land-and-deploy - Save progress → invoke /context-save - Resume context → invoke /context-restore +- Author a backlog-ready spec/issue → invoke /spec ``` Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"` @@ -310,7 +324,36 @@ Effort both-scales: when an option involves effort, label both human-team and CC Net line closes the tradeoff. Per-skill instructions may add stricter rules. -12. **Non-ASCII characters — write directly, never \u-escape.** When any +### Handling 5+ options — split, never drop + +AskUserQuestion caps every call at **4 options**. With 5+ real options, NEVER +drop, merge, or silently defer one to fit. Pick a compliant shape: + +- **Batch into ≤4-groups** — for coherent alternatives (e.g. version bumps, + layout variants). One call, 5th surfaced only if first 4 don't fit. +- **Split per-option** — for independent scope items (e.g. "ship E1..E6?"). + Fire N sequential calls, one per option. Default to this when unsure. + +Per-option call shape: `D.k` header (e.g. D3.1..D3.5), ELI10 per option, +Recommendation, kind-note (no completeness score — Include/Defer/Cut/Hold are +decision actions), and 4 buckets: +**A) Include**, **B) Defer**, **C) Cut**, **D) Hold** (stop chain, discuss). + +After the chain, fire `D.final` to validate the assembled set (reprompt +dependency conflicts) and confirm shipping it. Use `D.revise-` to +revise one option without re-running the chain. + +For N>6, fire a `D.0` meta-AskUserQuestion first (proceed / narrow / batch). + +question_ids for split chains: `-split-` (kebab-case ASCII, +≤64 chars, `-2`/`-3` suffix on collision). The runtime checker +(`bin/gstack-question-preference`) refuses `never-ask` on any `*-split-*` id, +so split chains are never AUTO_DECIDE-eligible — the user's option set is sacred. + +**Full rule + worked examples + Hold/dependency semantics:** see +`docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. + +**Non-ASCII characters — write directly, never \u-escape.** When any string field (question, option label, option description) contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit the literal UTF-8 characters in the JSON string. **Never escape them @@ -343,6 +386,9 @@ Before calling AskUserQuestion, verify: - [ ] Net line closes the decision - [ ] You are calling the tool, not writing prose - [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped +- [ ] If you had 5+ options, you split (or batched into ≤4-groups) — did NOT drop any +- [ ] If you split, you checked dependencies between options before firing the chain +- [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue) ## Artifacts Sync (skill start) @@ -2536,6 +2582,39 @@ you missed it.> +## Linked Spec +-$$, the spawned worktree IS where /ship runs). + SPEC_FILE=$(grep -l "^spec_branch: $CURRENT_BRANCH$" "$SPEC_ARCHIVES"/*.md 2>/dev/null | head -1) + [ -z "$SPEC_FILE" ] && exit # no spec; omit this section entirely + SPEC_ISSUE=$(grep "^spec_issue_number:" "$SPEC_FILE" | cut -d' ' -f2) + [ -z "$SPEC_ISSUE" ] && exit # spec archive exists but no issue number; omit + + # CONDITIONAL Closes #N (codex F4): only add when Plan Completion above is "complete". + # If the plan completion gate from Step 8 reports any deferred or failed items, emit: + # "Linked to #$SPEC_ISSUE (partial delivery — NOT auto-closing; close manually after follow-up)" + # If Plan Completion is fully complete, emit: + # "Closes #$SPEC_ISSUE" + # and include the Closes #N line in the PR body so GitHub auto-closes on merge.> + + + + This PR delivers the spec at . + Spec filed: > + + (partial delivery — not auto-closing). + Deferred items: . + Close # manually after follow-up lands.> + + + ## Verification Results diff --git a/test/fixtures/golden/factory-ship-SKILL.md b/test/fixtures/golden/factory-ship-SKILL.md index a7426c9cad..343768d894 100644 --- a/test/fixtures/golden/factory-ship-SKILL.md +++ b/test/fixtures/golden/factory-ship-SKILL.md @@ -95,6 +95,19 @@ _CHECKPOINT_MODE=$($GSTACK_BIN/gstack-config get checkpoint_mode 2>/dev/null || _CHECKPOINT_PUSH=$($GSTACK_BIN/gstack-config get checkpoint_push 2>/dev/null || echo "false") echo "CHECKPOINT_MODE: $_CHECKPOINT_MODE" echo "CHECKPOINT_PUSH: $_CHECKPOINT_PUSH" +# Plan-mode hint for skills like /spec that branch behavior on plan-mode state. +# Claude Code exposes plan mode via system reminders; we detect best-effort +# from CLAUDE_PLAN_FILE (set by the harness when plan mode is active) and +# fall back to "inactive". Codex hosts and Claude execution mode both end up +# inactive, which is the safe default (defaults to file+execute pipeline). +if [ -n "${CLAUDE_PLAN_FILE:-}${GSTACK_PLAN_MODE_FORCE:-}" ]; then + export GSTACK_PLAN_MODE="active" +elif [ "${GSTACK_PLAN_MODE:-}" = "active" ]; then + export GSTACK_PLAN_MODE="active" +else + export GSTACK_PLAN_MODE="inactive" +fi +echo "GSTACK_PLAN_MODE: $GSTACK_PLAN_MODE" [ -n "$OPENCLAW_SESSION" ] && echo "SPAWNED_SESSION: true" || true ``` @@ -226,6 +239,7 @@ Key routing rules: - Ship/deploy/PR → invoke /ship or /land-and-deploy - Save progress → invoke /context-save - Resume context → invoke /context-restore +- Author a backlog-ready spec/issue → invoke /spec ``` Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"` @@ -312,7 +326,36 @@ Effort both-scales: when an option involves effort, label both human-team and CC Net line closes the tradeoff. Per-skill instructions may add stricter rules. -12. **Non-ASCII characters — write directly, never \u-escape.** When any +### Handling 5+ options — split, never drop + +AskUserQuestion caps every call at **4 options**. With 5+ real options, NEVER +drop, merge, or silently defer one to fit. Pick a compliant shape: + +- **Batch into ≤4-groups** — for coherent alternatives (e.g. version bumps, + layout variants). One call, 5th surfaced only if first 4 don't fit. +- **Split per-option** — for independent scope items (e.g. "ship E1..E6?"). + Fire N sequential calls, one per option. Default to this when unsure. + +Per-option call shape: `D.k` header (e.g. D3.1..D3.5), ELI10 per option, +Recommendation, kind-note (no completeness score — Include/Defer/Cut/Hold are +decision actions), and 4 buckets: +**A) Include**, **B) Defer**, **C) Cut**, **D) Hold** (stop chain, discuss). + +After the chain, fire `D.final` to validate the assembled set (reprompt +dependency conflicts) and confirm shipping it. Use `D.revise-` to +revise one option without re-running the chain. + +For N>6, fire a `D.0` meta-AskUserQuestion first (proceed / narrow / batch). + +question_ids for split chains: `-split-` (kebab-case ASCII, +≤64 chars, `-2`/`-3` suffix on collision). The runtime checker +(`bin/gstack-question-preference`) refuses `never-ask` on any `*-split-*` id, +so split chains are never AUTO_DECIDE-eligible — the user's option set is sacred. + +**Full rule + worked examples + Hold/dependency semantics:** see +`docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. + +**Non-ASCII characters — write directly, never \u-escape.** When any string field (question, option label, option description) contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit the literal UTF-8 characters in the JSON string. **Never escape them @@ -345,6 +388,9 @@ Before calling AskUserQuestion, verify: - [ ] Net line closes the decision - [ ] You are calling the tool, not writing prose - [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped +- [ ] If you had 5+ options, you split (or batched into ≤4-groups) — did NOT drop any +- [ ] If you split, you checked dependencies between options before firing the chain +- [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue) ## Artifacts Sync (skill start) @@ -2914,6 +2960,39 @@ you missed it.> +## Linked Spec +-$$, the spawned worktree IS where /ship runs). + SPEC_FILE=$(grep -l "^spec_branch: $CURRENT_BRANCH$" "$SPEC_ARCHIVES"/*.md 2>/dev/null | head -1) + [ -z "$SPEC_FILE" ] && exit # no spec; omit this section entirely + SPEC_ISSUE=$(grep "^spec_issue_number:" "$SPEC_FILE" | cut -d' ' -f2) + [ -z "$SPEC_ISSUE" ] && exit # spec archive exists but no issue number; omit + + # CONDITIONAL Closes #N (codex F4): only add when Plan Completion above is "complete". + # If the plan completion gate from Step 8 reports any deferred or failed items, emit: + # "Linked to #$SPEC_ISSUE (partial delivery — NOT auto-closing; close manually after follow-up)" + # If Plan Completion is fully complete, emit: + # "Closes #$SPEC_ISSUE" + # and include the Closes #N line in the PR body so GitHub auto-closes on merge.> + + + + This PR delivers the spec at . + Spec filed: > + + (partial delivery — not auto-closing). + Deferred items: . + Close # manually after follow-up lands.> + + + ## Verification Results diff --git a/test/fixtures/parity-baseline-v1.47.0.0.json b/test/fixtures/parity-baseline-v1.47.0.0.json new file mode 100644 index 0000000000..aad9c538e3 --- /dev/null +++ b/test/fixtures/parity-baseline-v1.47.0.0.json @@ -0,0 +1,633 @@ +{ + "tag": "v1.47.0.0", + "capturedAt": "2026-05-27T05:50:57.656Z", + "capturedFromCommit": "e08e5fa8", + "capturedFromBranch": "garrytan/askuserquestion-split-on-overflow", + "totalSkills": 52, + "totalCorpusBytes": 3090887, + "estTotalCatalogTokens": 4116, + "topHeaviest": [ + { + "skill": "ship", + "skillMdBytes": 166782, + "skillMdLines": 3099, + "estTokens": 41696, + "tmplBytes": 50495, + "descriptionLen": 291, + "hasGateEval": true, + "hasPeriodicEval": true + }, + { + "skill": "plan-ceo-review", + "skillMdBytes": 132488, + "skillMdLines": 2197, + "estTokens": 33122, + "tmplBytes": 63393, + "descriptionLen": 794, + "hasGateEval": true, + "hasPeriodicEval": true + }, + { + "skill": "office-hours", + "skillMdBytes": 112842, + "skillMdLines": 2066, + "estTokens": 28211, + "tmplBytes": 55466, + "descriptionLen": 860, + "hasGateEval": true, + "hasPeriodicEval": false + }, + { + "skill": "plan-design-review", + "skillMdBytes": 107855, + "skillMdLines": 1928, + "estTokens": 26964, + "tmplBytes": 28624, + "descriptionLen": 218, + "hasGateEval": true, + "hasPeriodicEval": true + }, + { + "skill": "plan-devex-review", + "skillMdBytes": 106167, + "skillMdLines": 2119, + "estTokens": 26542, + "tmplBytes": 35680, + "descriptionLen": 250, + "hasGateEval": true, + "hasPeriodicEval": true + }, + { + "skill": "plan-eng-review", + "skillMdBytes": 103009, + "skillMdLines": 1762, + "estTokens": 25752, + "tmplBytes": 26234, + "descriptionLen": 231, + "hasGateEval": true, + "hasPeriodicEval": true + }, + { + "skill": "spec", + "skillMdBytes": 102629, + "skillMdLines": 2141, + "estTokens": 25657, + "tmplBytes": 28429, + "descriptionLen": 282, + "hasGateEval": true, + "hasPeriodicEval": false + }, + { + "skill": "design-review", + "skillMdBytes": 95654, + "skillMdLines": 1932, + "estTokens": 23914, + "tmplBytes": 11674, + "descriptionLen": 304, + "hasGateEval": true, + "hasPeriodicEval": false + }, + { + "skill": "review", + "skillMdBytes": 94048, + "skillMdLines": 1762, + "estTokens": 23512, + "tmplBytes": 14099, + "descriptionLen": 205, + "hasGateEval": true, + "hasPeriodicEval": false + }, + { + "skill": "land-and-deploy", + "skillMdBytes": 91886, + "skillMdLines": 1856, + "estTokens": 22972, + "tmplBytes": 48624, + "descriptionLen": 160, + "hasGateEval": true, + "hasPeriodicEval": false + } + ], + "skills": { + "autoplan": { + "skill": "autoplan", + "skillMdBytes": 90870, + "skillMdLines": 1784, + "estTokens": 22718, + "tmplBytes": 45271, + "descriptionLen": 366, + "hasGateEval": true, + "hasPeriodicEval": true + }, + "benchmark": { + "skill": "benchmark", + "skillMdBytes": 33266, + "skillMdLines": 747, + "estTokens": 8317, + "tmplBytes": 9378, + "descriptionLen": 213, + "hasGateEval": true, + "hasPeriodicEval": false + }, + "benchmark-models": { + "skill": "benchmark-models", + "skillMdBytes": 29333, + "skillMdLines": 622, + "estTokens": 7333, + "tmplBytes": 6631, + "descriptionLen": 217, + "hasGateEval": false, + "hasPeriodicEval": false + }, + "browse": { + "skill": "browse", + "skillMdBytes": 48018, + "skillMdLines": 929, + "estTokens": 12005, + "tmplBytes": 10805, + "descriptionLen": 181, + "hasGateEval": true, + "hasPeriodicEval": false + }, + "canary": { + "skill": "canary", + "skillMdBytes": 47105, + "skillMdLines": 990, + "estTokens": 11776, + "tmplBytes": 8033, + "descriptionLen": 180, + "hasGateEval": true, + "hasPeriodicEval": false + }, + "careful": { + "skill": "careful", + "skillMdBytes": 2551, + "skillMdLines": 68, + "estTokens": 638, + "tmplBytes": 2435, + "descriptionLen": 315, + "hasGateEval": false, + "hasPeriodicEval": false + }, + "codex": { + "skill": "codex", + "skillMdBytes": 79620, + "skillMdLines": 1519, + "estTokens": 19905, + "tmplBytes": 34143, + "descriptionLen": 187, + "hasGateEval": true, + "hasPeriodicEval": false + }, + "context-restore": { + "skill": "context-restore", + "skillMdBytes": 41493, + "skillMdLines": 848, + "estTokens": 10373, + "tmplBytes": 5255, + "descriptionLen": 238, + "hasGateEval": true, + "hasPeriodicEval": false + }, + "context-save": { + "skill": "context-save", + "skillMdBytes": 45690, + "skillMdLines": 966, + "estTokens": 11423, + "tmplBytes": 9293, + "descriptionLen": 168, + "hasGateEval": true, + "hasPeriodicEval": false + }, + "cso": { + "skill": "cso", + "skillMdBytes": 77397, + "skillMdLines": 1451, + "estTokens": 19349, + "tmplBytes": 35158, + "descriptionLen": 196, + "hasGateEval": true, + "hasPeriodicEval": false + }, + "design-consultation": { + "skill": "design-consultation", + "skillMdBytes": 79222, + "skillMdLines": 1561, + "estTokens": 19806, + "tmplBytes": 25899, + "descriptionLen": 888, + "hasGateEval": true, + "hasPeriodicEval": false + }, + "design-html": { + "skill": "design-html", + "skillMdBytes": 66547, + "skillMdLines": 1449, + "estTokens": 16637, + "tmplBytes": 22567, + "descriptionLen": 233, + "hasGateEval": false, + "hasPeriodicEval": false + }, + "design-review": { + "skill": "design-review", + "skillMdBytes": 95654, + "skillMdLines": 1932, + "estTokens": 23914, + "tmplBytes": 11674, + "descriptionLen": 304, + "hasGateEval": true, + "hasPeriodicEval": false + }, + "design-shotgun": { + "skill": "design-shotgun", + "skillMdBytes": 62836, + "skillMdLines": 1311, + "estTokens": 15709, + "tmplBytes": 13331, + "descriptionLen": 786, + "hasGateEval": false, + "hasPeriodicEval": false + }, + "devex-review": { + "skill": "devex-review", + "skillMdBytes": 64413, + "skillMdLines": 1233, + "estTokens": 16103, + "tmplBytes": 7984, + "descriptionLen": 201, + "hasGateEval": false, + "hasPeriodicEval": false + }, + "document-generate": { + "skill": "document-generate", + "skillMdBytes": 52987, + "skillMdLines": 1176, + "estTokens": 13247, + "tmplBytes": 15093, + "descriptionLen": 334, + "hasGateEval": false, + "hasPeriodicEval": false + }, + "document-release": { + "skill": "document-release", + "skillMdBytes": 58251, + "skillMdLines": 1235, + "estTokens": 14563, + "tmplBytes": 20362, + "descriptionLen": 192, + "hasGateEval": true, + "hasPeriodicEval": false + }, + "freeze": { + "skill": "freeze", + "skillMdBytes": 3154, + "skillMdLines": 92, + "estTokens": 789, + "tmplBytes": 3038, + "descriptionLen": 503, + "hasGateEval": false, + "hasPeriodicEval": false + }, + "gstack-upgrade": { + "skill": "gstack-upgrade", + "skillMdBytes": 10817, + "skillMdLines": 285, + "estTokens": 2704, + "tmplBytes": 10667, + "descriptionLen": 163, + "hasGateEval": true, + "hasPeriodicEval": false + }, + "guard": { + "skill": "guard", + "skillMdBytes": 3297, + "skillMdLines": 91, + "estTokens": 824, + "tmplBytes": 3181, + "descriptionLen": 686, + "hasGateEval": false, + "hasPeriodicEval": false + }, + "health": { + "skill": "health", + "skillMdBytes": 47916, + "skillMdLines": 1014, + "estTokens": 11979, + "tmplBytes": 11617, + "descriptionLen": 184, + "hasGateEval": true, + "hasPeriodicEval": false + }, + "investigate": { + "skill": "investigate", + "skillMdBytes": 50409, + "skillMdLines": 1012, + "estTokens": 12602, + "tmplBytes": 11561, + "descriptionLen": 1379, + "hasGateEval": true, + "hasPeriodicEval": false + }, + "ios-clean": { + "skill": "ios-clean", + "skillMdBytes": 41045, + "skillMdLines": 813, + "estTokens": 10261, + "tmplBytes": 3851, + "descriptionLen": 252, + "hasGateEval": false, + "hasPeriodicEval": false + }, + "ios-design-review": { + "skill": "ios-design-review", + "skillMdBytes": 41631, + "skillMdLines": 815, + "estTokens": 10408, + "tmplBytes": 4417, + "descriptionLen": 209, + "hasGateEval": false, + "hasPeriodicEval": false + }, + "ios-fix": { + "skill": "ios-fix", + "skillMdBytes": 40760, + "skillMdLines": 811, + "estTokens": 10190, + "tmplBytes": 3574, + "descriptionLen": 187, + "hasGateEval": false, + "hasPeriodicEval": false + }, + "ios-qa": { + "skill": "ios-qa", + "skillMdBytes": 47271, + "skillMdLines": 931, + "estTokens": 11818, + "tmplBytes": 10090, + "descriptionLen": 223, + "hasGateEval": true, + "hasPeriodicEval": false + }, + "ios-sync": { + "skill": "ios-sync", + "skillMdBytes": 40737, + "skillMdLines": 804, + "estTokens": 10184, + "tmplBytes": 3544, + "descriptionLen": 269, + "hasGateEval": false, + "hasPeriodicEval": false + }, + "land-and-deploy": { + "skill": "land-and-deploy", + "skillMdBytes": 91886, + "skillMdLines": 1856, + "estTokens": 22972, + "tmplBytes": 48624, + "descriptionLen": 160, + "hasGateEval": true, + "hasPeriodicEval": false + }, + "landing-report": { + "skill": "landing-report", + "skillMdBytes": 43985, + "skillMdLines": 874, + "estTokens": 10996, + "tmplBytes": 6806, + "descriptionLen": 195, + "hasGateEval": false, + "hasPeriodicEval": false + }, + "learn": { + "skill": "learn", + "skillMdBytes": 41722, + "skillMdLines": 891, + "estTokens": 10431, + "tmplBytes": 5594, + "descriptionLen": 178, + "hasGateEval": true, + "hasPeriodicEval": false + }, + "make-pdf": { + "skill": "make-pdf", + "skillMdBytes": 29450, + "skillMdLines": 663, + "estTokens": 7363, + "tmplBytes": 5106, + "descriptionLen": 177, + "hasGateEval": false, + "hasPeriodicEval": false + }, + "office-hours": { + "skill": "office-hours", + "skillMdBytes": 112842, + "skillMdLines": 2066, + "estTokens": 28211, + "tmplBytes": 55466, + "descriptionLen": 860, + "hasGateEval": true, + "hasPeriodicEval": false + }, + "open-gstack-browser": { + "skill": "open-gstack-browser", + "skillMdBytes": 46131, + "skillMdLines": 954, + "estTokens": 11533, + "tmplBytes": 7702, + "descriptionLen": 204, + "hasGateEval": false, + "hasPeriodicEval": false + }, + "pair-agent": { + "skill": "pair-agent", + "skillMdBytes": 46939, + "skillMdLines": 1010, + "estTokens": 11735, + "tmplBytes": 8548, + "descriptionLen": 167, + "hasGateEval": false, + "hasPeriodicEval": false + }, + "plan-ceo-review": { + "skill": "plan-ceo-review", + "skillMdBytes": 132488, + "skillMdLines": 2197, + "estTokens": 33122, + "tmplBytes": 63393, + "descriptionLen": 794, + "hasGateEval": true, + "hasPeriodicEval": true + }, + "plan-design-review": { + "skill": "plan-design-review", + "skillMdBytes": 107855, + "skillMdLines": 1928, + "estTokens": 26964, + "tmplBytes": 28624, + "descriptionLen": 218, + "hasGateEval": true, + "hasPeriodicEval": true + }, + "plan-devex-review": { + "skill": "plan-devex-review", + "skillMdBytes": 106167, + "skillMdLines": 2119, + "estTokens": 26542, + "tmplBytes": 35680, + "descriptionLen": 250, + "hasGateEval": true, + "hasPeriodicEval": true + }, + "plan-eng-review": { + "skill": "plan-eng-review", + "skillMdBytes": 103009, + "skillMdLines": 1762, + "estTokens": 25752, + "tmplBytes": 26234, + "descriptionLen": 231, + "hasGateEval": true, + "hasPeriodicEval": true + }, + "plan-tune": { + "skill": "plan-tune", + "skillMdBytes": 51717, + "skillMdLines": 1077, + "estTokens": 12929, + "tmplBytes": 15586, + "descriptionLen": 325, + "hasGateEval": true, + "hasPeriodicEval": false + }, + "qa": { + "skill": "qa", + "skillMdBytes": 73863, + "skillMdLines": 1622, + "estTokens": 18466, + "tmplBytes": 12701, + "descriptionLen": 218, + "hasGateEval": true, + "hasPeriodicEval": false + }, + "qa-only": { + "skill": "qa-only", + "skillMdBytes": 56421, + "skillMdLines": 1194, + "estTokens": 14105, + "tmplBytes": 3851, + "descriptionLen": 165, + "hasGateEval": true, + "hasPeriodicEval": false + }, + "retro": { + "skill": "retro", + "skillMdBytes": 82889, + "skillMdLines": 1750, + "estTokens": 20722, + "tmplBytes": 42427, + "descriptionLen": 648, + "hasGateEval": true, + "hasPeriodicEval": false + }, + "review": { + "skill": "review", + "skillMdBytes": 94048, + "skillMdLines": 1762, + "estTokens": 23512, + "tmplBytes": 14099, + "descriptionLen": 205, + "hasGateEval": true, + "hasPeriodicEval": false + }, + "scrape": { + "skill": "scrape", + "skillMdBytes": 43641, + "skillMdLines": 887, + "estTokens": 10910, + "tmplBytes": 5220, + "descriptionLen": 167, + "hasGateEval": true, + "hasPeriodicEval": false + }, + "setup-browser-cookies": { + "skill": "setup-browser-cookies", + "skillMdBytes": 26618, + "skillMdLines": 594, + "estTokens": 6655, + "tmplBytes": 2724, + "descriptionLen": 222, + "hasGateEval": false, + "hasPeriodicEval": false + }, + "setup-deploy": { + "skill": "setup-deploy", + "skillMdBytes": 43927, + "skillMdLines": 919, + "estTokens": 10982, + "tmplBytes": 7780, + "descriptionLen": 197, + "hasGateEval": true, + "hasPeriodicEval": false + }, + "setup-gbrain": { + "skill": "setup-gbrain", + "skillMdBytes": 78394, + "skillMdLines": 1704, + "estTokens": 19599, + "tmplBytes": 42245, + "descriptionLen": 323, + "hasGateEval": true, + "hasPeriodicEval": false + }, + "ship": { + "skill": "ship", + "skillMdBytes": 166782, + "skillMdLines": 3099, + "estTokens": 41696, + "tmplBytes": 50495, + "descriptionLen": 291, + "hasGateEval": true, + "hasPeriodicEval": true + }, + "skillify": { + "skill": "skillify", + "skillMdBytes": 53534, + "skillMdLines": 1168, + "estTokens": 13384, + "tmplBytes": 15107, + "descriptionLen": 233, + "hasGateEval": true, + "hasPeriodicEval": false + }, + "spec": { + "skill": "spec", + "skillMdBytes": 102629, + "skillMdLines": 2141, + "estTokens": 25657, + "tmplBytes": 28429, + "descriptionLen": 282, + "hasGateEval": true, + "hasPeriodicEval": false + }, + "sync-gbrain": { + "skill": "sync-gbrain", + "skillMdBytes": 50156, + "skillMdLines": 1028, + "estTokens": 12539, + "tmplBytes": 13996, + "descriptionLen": 299, + "hasGateEval": false, + "hasPeriodicEval": false + }, + "unfreeze": { + "skill": "unfreeze", + "skillMdBytes": 1504, + "skillMdLines": 49, + "estTokens": 376, + "tmplBytes": 1386, + "descriptionLen": 199, + "hasGateEval": false, + "hasPeriodicEval": false + } + } +} diff --git a/test/gstack-question-preference.test.ts b/test/gstack-question-preference.test.ts index 629319aefe..863cd9e741 100644 --- a/test/gstack-question-preference.test.ts +++ b/test/gstack-question-preference.test.ts @@ -103,6 +103,65 @@ describe('--check with preferences set', () => { }); }); +// Split-chain carve-out: question_ids matching -split- +// must always ASK_NORMALLY regardless of stored preferences. +// See scripts/resolvers/preamble/generate-ask-user-format.ts +// "Handling 5+ options — split, never drop" for the surrounding mechanism. +describe('--check split-chain carve-out (*-split-* always ASK_NORMALLY)', () => { + function setPref(id: string, pref: string) { + return run('--write', JSON.stringify({ question_id: id, preference: pref, source: 'plan-tune' })); + } + + test('split-id without preference → ASK_NORMALLY', () => { + const r = run('--check', 'plan-ceo-review-split-e4-detect-mappings'); + expect(r.stdout.trim()).toContain('ASK_NORMALLY'); + }); + + test('split-id + never-ask → ASK_NORMALLY (carve-out overrides preference)', () => { + setPref('plan-ceo-review-split-e4-detect-mappings', 'never-ask'); + const r = run('--check', 'plan-ceo-review-split-e4-detect-mappings'); + expect(r.stdout).toContain('ASK_NORMALLY'); + expect(r.stdout).not.toContain('AUTO_DECIDE'); + }); + + test('split-id + never-ask → emits explanatory note', () => { + setPref('plan-ceo-review-split-e4-detect-mappings', 'never-ask'); + const r = run('--check', 'plan-ceo-review-split-e4-detect-mappings'); + expect(r.stdout).toContain('split-chain per-option calls always ASK_NORMALLY'); + expect(r.stdout).toContain('never-ask'); + }); + + test('split-id + ask-only-for-one-way → ASK_NORMALLY (carve-out overrides preference)', () => { + setPref('ship-split-version-bump', 'ask-only-for-one-way'); + const r = run('--check', 'ship-split-version-bump'); + expect(r.stdout).toContain('ASK_NORMALLY'); + expect(r.stdout).not.toContain('AUTO_DECIDE'); + }); + + test('split-id + always-ask → ASK_NORMALLY (no note since preference agrees)', () => { + setPref('plan-eng-review-split-add-test', 'always-ask'); + const r = run('--check', 'plan-eng-review-split-add-test'); + expect(r.stdout.trim()).toContain('ASK_NORMALLY'); + expect(r.stdout).not.toContain('does not apply'); + }); + + test('non-split id that just happens to contain "split" word is NOT carved out', () => { + // The carve-out matches `-split-` (kebab-cased), not the substring "split". + // A question id like `qa-splitscreen-test` (hypothetical) would not match. + // Verify by using a never-ask pref that should fire AUTO_DECIDE. + setPref('qa-splitscreen-test', 'never-ask'); + const r = run('--check', 'qa-splitscreen-test'); + expect(r.stdout.trim()).toContain('AUTO_DECIDE'); + }); + + test('multiple split-id formats: skill-split-anything matches', () => { + setPref('autoplan-split-ceo-finding-7', 'never-ask'); + const r = run('--check', 'autoplan-split-ceo-finding-7'); + expect(r.stdout).toContain('ASK_NORMALLY'); + expect(r.stdout).not.toContain('AUTO_DECIDE'); + }); +}); + // ----------------------------------------------------------------------- // --write // ----------------------------------------------------------------------- diff --git a/test/helpers/touchfiles.ts b/test/helpers/touchfiles.ts index 3aa85fa208..359da2b6f4 100644 --- a/test/helpers/touchfiles.ts +++ b/test/helpers/touchfiles.ts @@ -149,6 +149,7 @@ export const E2E_TOUCHFILES: Record = { // confirm" plan write. runPlanSkillFloorCheck cannot detect that shape // (it exits on first AUQ); runPlanSkillCounting can. 'plan-eng-multi-finding-batching': ['plan-eng-review/**', 'scripts/resolvers/preamble.ts', 'scripts/resolvers/preamble/generate-ask-user-format.ts', 'scripts/resolvers/preamble/generate-completion-status.ts', 'scripts/resolvers/review.ts', 'test/helpers/claude-pty-runner.ts', 'test/fixtures/forcing-finding-seeds.ts', 'test/skill-e2e-plan-eng-multi-finding-batching.test.ts'], + 'plan-ceo-split-overflow': ['plan-ceo-review/**', 'scripts/resolvers/preamble.ts', 'scripts/resolvers/preamble/generate-ask-user-format.ts', 'bin/gstack-question-preference', 'test/helpers/claude-pty-runner.ts', 'test/fixtures/forcing-finding-seeds.ts', 'test/skill-e2e-plan-ceo-split-overflow.test.ts'], 'brain-privacy-gate': ['scripts/resolvers/preamble/generate-brain-sync-block.ts', 'scripts/resolvers/preamble.ts', 'bin/gstack-brain-sync', 'bin/gstack-artifacts-init', 'bin/gstack-config', 'test/helpers/agent-sdk-runner.ts'], // /setup-gbrain Path 4 (Remote MCP) — happy + bad-token end-to-end via @@ -479,6 +480,7 @@ export const E2E_TIERS: Record = { 'plan-design-finding-floor': 'gate', 'plan-devex-finding-floor': 'gate', 'plan-eng-multi-finding-batching': 'periodic', + 'plan-ceo-split-overflow': 'periodic', // Privacy gate for gstack-brain-sync — periodic (non-deterministic LLM call, // costs ~$0.30-$0.50 per run, not needed on every commit) diff --git a/test/resolver-ask-user-format.test.ts b/test/resolver-ask-user-format.test.ts index 37744f2bc9..629780c4f8 100644 --- a/test/resolver-ask-user-format.test.ts +++ b/test/resolver-ask-user-format.test.ts @@ -119,3 +119,45 @@ describe('generateAskUserFormat — v1.7.0.0 Pros/Cons format', () => { expect(out).toMatch(/Per-skill instructions may add/); }); }); + +describe('generateAskUserFormat — 5+ option split rule (slim inline + docs pointer)', () => { + const out = generateAskUserFormat(makeCtx()); + + // 5 highest-signal pins. The full rule lives in + // docs/askuserquestion-split.md; this contract only checks what the + // inline subsection MUST surface so the agent can act without + // reading the docs file for routine 5-option splits. + + test('forbids dropping options to fit the 4-option cap', () => { + expect(out).toMatch(/caps every call at \*\*4 options\*\*/); + expect(out).toMatch(/NEVER\s+drop, merge, or silently defer/); + }); + + test('names the Include / Defer / Cut / Hold buckets', () => { + expect(out).toMatch(/A\) Include/); + expect(out).toMatch(/B\) Defer/); + expect(out).toMatch(/C\) Cut/); + expect(out).toMatch(/D\) Hold/); + }); + + test('specifies D.k child numbering and D.final summary', () => { + expect(out).toContain('D.k'); + expect(out).toContain('D.final'); + }); + + test('AUTO_DECIDE is gated at runtime, not just collision-resistance', () => { + expect(out).toContain('bin/gstack-question-preference'); + expect(out).toContain('*-split-*'); + expect(out).toContain('never AUTO_DECIDE-eligible'); + }); + + test('points to docs/askuserquestion-split.md for the full rule', () => { + expect(out).toContain('docs/askuserquestion-split.md'); + expect(out).toMatch(/Read on demand when N>4/); + }); + + test('regression: orphan "12." prefix removed from CJK rule', () => { + expect(out).not.toContain('12. **Non-ASCII'); + expect(out).toContain('**Non-ASCII characters'); + }); +}); diff --git a/test/skill-e2e-plan-ceo-split-overflow.test.ts b/test/skill-e2e-plan-ceo-split-overflow.test.ts new file mode 100644 index 0000000000..05f3c51f29 --- /dev/null +++ b/test/skill-e2e-plan-ceo-split-overflow.test.ts @@ -0,0 +1,108 @@ +/** + * /plan-ceo-review split-overflow regression (periodic, paid, real-PTY). + * + * Catches the original failure mode the user complained about: when the + * agent has 5+ options for ONE conceptual decision, it must split into N + * sequential AskUserQuestion calls (or batch into compatible ≤4-groups), + * NOT drop an option arbitrarily to fit Conductor's 4-option cap. + * + * Pre-fix reasoning trace from the user transcript that motivated this: + * "I'm hitting Conductor's limit of 4 options in the AUQ, so I need + * to cut one. E4 is the largest lift and probably beyond scope... + * Trimming: E4. Moving to TODOs without asking. Re-firing with 4." + * + * The fixture seeds 5 independent scope candidates (chat-platform + * integrations) — each carries an independent include/defer/cut decision. + * With the split rule active, the natural compliant shape is a per-option + * chain at parent D; the test asserts the agent fires at least + * [N-1] review-phase AUQs (standard tolerance band from the existing + * finding-count tests, which accounts for one expected scope-reduction + * call before the per-option chain begins). + * + * Why a separate test from skill-e2e-plan-ceo-finding-count and + * skill-e2e-plan-eng-multi-finding-batching: + * - finding-count tests fire one AUQ per finding (Architecture, Code + * Quality, etc) — they exercise the "one issue per call" rule, not + * the "5+ options for ONE decision" split rule. + * - This test fixtures ONE scope decision with 5 options inside it, + * which is exactly the shape that hits Conductor's 4-option cap and + * triggers the new split-vs-drop guidance. + * + * Tier: periodic (~25 min, ~$0.30-$5.00/run depending on agent path). + * Sequential by default. + */ + +import { describe, test } from 'bun:test'; +import * as fs from 'node:fs'; +import { + runPlanSkillCounting, + ceoStep0Boundary, +} from './helpers/claude-pty-runner'; +import { FORCING_SPLIT_OVERFLOW_CEO } from './fixtures/forcing-finding-seeds'; + +const shouldRun = !!process.env.EVALS && process.env.EVALS_TIER === 'periodic'; +const describeE2E = shouldRun ? describe : describe.skip; + +const N = 5; +const FLOOR = N - 1; // 4 — must fire at least one AUQ per non-dropped option + +const PLAN_PATH = '/tmp/gstack-test-plan-ceo-split-overflow.md'; + +describeE2E('/plan-ceo-review split-overflow regression (periodic)', () => { + test( + `5-option scope decision emits >= ${FLOOR} review-phase AskUserQuestions (no dropping)`, + async () => { + try { + fs.rmSync(PLAN_PATH, { force: true }); + } catch { + /* best-effort */ + } + + const obs = await runPlanSkillCounting({ + skillName: 'plan-ceo-review', + slashCommand: '/plan-ceo-review', + followUpPrompt: FORCING_SPLIT_OVERFLOW_CEO, + isLastStep0AUQ: ceoStep0Boundary, + reviewCountCeiling: N + 3, // hard cap above floor + tolerance + cwd: process.cwd(), + timeoutMs: 1_500_000, // 25 min + env: { QUESTION_TUNING: 'false', EXPLAIN_LEVEL: 'default' }, + }); + + try { + if (!['plan_ready', 'completion_summary', 'ceiling_reached'].includes(obs.outcome)) { + throw new Error( + `split-overflow test FAILED: outcome=${obs.outcome}\n` + + `step0=${obs.step0Count} review=${obs.reviewCount} elapsed=${obs.elapsedMs}ms\n` + + `--- evidence (last 3KB) ---\n${obs.evidence}`, + ); + } + if (obs.reviewCount < FLOOR) { + throw new Error( + `SPLIT-OVERFLOW REGRESSION: reviewCount=${obs.reviewCount} < FLOOR=${FLOOR}.\n` + + `Agent surfaced fewer review-phase AUQs than independent scope options.\n` + + `This is the original drop-to-fit-4-options failure mode:\n` + + ` expected: ${N} per-option calls (or compliant ≤4-group batching with follow-up)\n` + + ` got: ${obs.reviewCount} call(s)\n` + + `Most likely the agent dropped one option to fit Conductor's 4-option\n` + + `cap, the exact bug scripts/resolvers/preamble/generate-ask-user-format.ts\n` + + `"Handling 5+ options — split, never drop" exists to prevent.\n` + + `Review-phase fingerprints:\n` + + obs.fingerprints + .filter((f) => !f.preReview) + .map((f) => ` - "${f.promptSnippet.slice(0, 80)}"`) + .join('\n') + + `\n--- evidence (last 3KB) ---\n${obs.evidence}`, + ); + } + } finally { + try { + fs.rmSync(PLAN_PATH, { force: true }); + } catch { + /* best-effort */ + } + } + }, + 1_700_000, + ); +}); diff --git a/test/skill-size-budget.test.ts b/test/skill-size-budget.test.ts index a22550d3f6..f86f8c5f4f 100644 --- a/test/skill-size-budget.test.ts +++ b/test/skill-size-budget.test.ts @@ -1,15 +1,20 @@ /** * Per-skill SKILL.md size budget regression (v1.46.0.0 T5). * - * Asserts that no skill's generated SKILL.md grew beyond the v1.44.1 + * Asserts that no skill's generated SKILL.md grew beyond the v1.47.0.0 * baseline. Catches preamble/resolver changes that bloat skills back to * the pre-compression size. Free — pure file IO + JSON diff. * + * Baseline rebased v1.44.1 → v1.47.0.0 in the AskUserQuestion split-rule + * PR after main merged GSTACK_PLAN_MODE + /spec, pushing the v1.44.1 + * anchor past the 5% ratchet. Historical v1.44.1.json and v1.46.0.0.json + * are retained in test/fixtures/ for reference. + * * Why a separate test from skill-budget-regression.test.ts: that one * compares LIVE eval runs (tool calls, turns, cost); this one compares * static SKILL.md sizes. Both gate-tier. * - * The baseline lives at test/fixtures/parity-baseline-v1.44.1.json, + * The baseline lives at test/fixtures/parity-baseline-v1.47.0.0.json, * captured by scripts/capture-baseline.ts before any Phase A work landed. * * Override: @@ -30,7 +35,7 @@ import { captureBaseline, type ParityBaseline } from './helpers/capture-parity-b import { logBudgetOverride } from './helpers/budget-override'; const REPO_ROOT = path.resolve(import.meta.dir, '..'); -const BASELINE_PATH = path.join(REPO_ROOT, 'test', 'fixtures', 'parity-baseline-v1.44.1.json'); +const BASELINE_PATH = path.join(REPO_ROOT, 'test', 'fixtures', 'parity-baseline-v1.47.0.0.json'); // Default per-skill ratio is 1.05 (5% growth tolerance). T4 catalog trim // MOVES text from frontmatter (always-loaded catalog) to a body section @@ -49,11 +54,11 @@ interface Regression { } describe('SKILL.md size budget regression (gate, free)', () => { - test('parity-baseline-v1.44.1.json exists', () => { + test('parity-baseline-v1.47.0.0.json exists', () => { expect(fs.existsSync(BASELINE_PATH)).toBe(true); }); - test('no skill exceeds v1.44.1 baseline size × ratio', () => { + test('no skill exceeds v1.47.0.0 baseline size × ratio', () => { const baseline: ParityBaseline = JSON.parse(fs.readFileSync(BASELINE_PATH, 'utf-8')); const current = captureBaseline({ repoRoot: REPO_ROOT }); @@ -94,7 +99,7 @@ describe('SKILL.md size budget regression (gate, free)', () => { ` ${r.skill}: ${r.beforeBytes} → ${r.afterBytes} bytes (×${r.growth.toFixed(2)})`, ).join('\n'); throw new Error( - `${regressions.length} skill(s) regressed past v1.44.1 baseline × ${RATIO}:\n${msg}\n` + + `${regressions.length} skill(s) regressed past v1.47.0.0 baseline × ${RATIO}:\n${msg}\n` + `Override: set GSTACK_SIZE_BUDGET_OVERRIDE_REASON="why this is OK" to allow and audit-log.`, ); }); @@ -120,7 +125,7 @@ describe('SKILL.md size budget regression (gate, free)', () => { return; } throw new Error( - `Total corpus regressed past v1.44.1 baseline × ${RATIO}: ` + + `Total corpus regressed past v1.47.0.0 baseline × ${RATIO}: ` + `${baseline.totalCorpusBytes} → ${current.totalCorpusBytes} bytes (×${ratio.toFixed(3)}). ` + `Override: set GSTACK_SIZE_BUDGET_OVERRIDE_REASON to allow.`, ); @@ -130,13 +135,13 @@ describe('SKILL.md size budget regression (gate, free)', () => { * Gap E (v1.46.0.0): per-skill min-size floor. * * The existing skill-coverage-floor enforces body ≥ 200 bytes, which is - * a tiny noise floor. A skill that was 100 KB at v1.44.1 and shrinks to + * a tiny noise floor. A skill that was 100 KB at v1.47.0.0 and shrinks to * 250 bytes passes that check despite losing 99.75% of content. The * parity-suite content invariants cover this for 10 hand-picked skills * (cso, ship, plan-ceo, etc.); the remaining 41 skills had no per-skill * shrinkage floor. * - * Floor: 80% of the v1.44.1 baseline. v1.46 actual shrinkage is <1% per + * Floor: 80% of the v1.47.0.0 baseline. v1.46 actual shrinkage is <1% per * skill, so this is a comfortable ceiling that still catches accidental * mass deletion (e.g., a refactor that strips the body of a skill). * @@ -146,7 +151,7 @@ describe('SKILL.md size budget regression (gate, free)', () => { * skeletons. When that lands, add them to SECTIONS_EXTRACTED so the floor * relaxes for them. */ - test('no skill shrinks past 80% of v1.44.1 baseline (catches accidental body strip)', () => { + test('no skill shrinks past 80% of v1.47.0.0 baseline (catches accidental body strip)', () => { const baseline: ParityBaseline = JSON.parse(fs.readFileSync(BASELINE_PATH, 'utf-8')); const current = captureBaseline({ repoRoot: REPO_ROOT }); const MIN_RATIO = 0.80; // a skill at <80% of its v1.44 size signals mass-deletion @@ -187,7 +192,7 @@ describe('SKILL.md size budget regression (gate, free)', () => { ` ${u.skill}: ${u.beforeBytes} → ${u.afterBytes} bytes (×${u.ratio.toFixed(2)} — below ${MIN_RATIO} floor)`, ).join('\n'); throw new Error( - `${undershoots.length} skill(s) shrunk past v1.44.1 × ${MIN_RATIO} floor:\n${msg}\n` + + `${undershoots.length} skill(s) shrunk past v1.47.0.0 × ${MIN_RATIO} floor:\n${msg}\n` + `This usually signals accidental body strip (e.g., a resolver returning empty, a ` + `template losing a section). If the shrinkage is intentional (e.g., the skill moved ` + `to the sections/ pattern), add it to SECTIONS_EXTRACTED in this test. Override: ` + diff --git a/test/touchfiles.test.ts b/test/touchfiles.test.ts index 416ef05763..5e3a8c3e0a 100644 --- a/test/touchfiles.test.ts +++ b/test/touchfiles.test.ts @@ -105,8 +105,12 @@ describe('selectTests', () => { expect(result.selected).toContain('auto-decide-preserved'); // v1.27+ gate-tier reviewCount-floor regression for transcript bug expect(result.selected).toContain('plan-ceo-finding-floor'); - expect(result.selected.length).toBe(21); - expect(result.skipped.length).toBe(Object.keys(E2E_TOUCHFILES).length - 21); + // garrytan/askuserquestion-split-on-overflow: split-overflow periodic + // E2E test also depends on plan-ceo-review/** (5-option scope decision + // regression for the "drop to fit 4 options" failure mode). + expect(result.selected).toContain('plan-ceo-split-overflow'); + expect(result.selected.length).toBe(22); + expect(result.skipped.length).toBe(Object.keys(E2E_TOUCHFILES).length - 22); }); test('global touchfile triggers ALL tests', () => {