Skip to content

Commit f8bb590

Browse files
garrytanjayzalowitzclaude
authored
v1.47.0.0 feat: /spec — author backlog-ready spec in 5 phases + optional agent spawn (garrytan#1698) (garrytan#1733)
* feat(issue): add /issue skill for backlog-ready GitHub issue authoring Interrogates an ambiguous request through five strict phases (why, scope, technical, draft, final) and produces a GitHub issue precise enough that an unfamiliar engineer or AI agent can execute it without follow-up. Slots in after /office-hours (when the idea has passed the "worth building" bar) and before /plan-eng-review (which assumes a plan already exists). - issue/SKILL.md.tmpl + generated SKILL.md - routing entry in root SKILL.md.tmpl - llms.txt regenerated to include the new skill * chore(spec): rename /issue → /spec + fix duplicate analytics block Foundation commit for the /spec skill (extends PR garrytan#1698 by @jayzalowitz). - Renames issue/ → spec/ (template + generated) - Removes the hand-rolled analytics block in spec/SKILL.md.tmpl (lines 46-49 of the original); {{PREAMBLE}} already emits the analytics write with the telemetry opt-out guard, so the duplicate would have bypassed gstack-config set telemetry off - Updates frontmatter (name: spec, expanded description with magical-moment preview, triggers reordered to lead with "spec this out") - Updates root SKILL.md.tmpl routing entry → /spec - Regenerates spec/SKILL.md and gstack/llms.txt via bun run gen:skill-docs Co-Authored-By: Jay Zalowitz <jayzalowitz@gmail.com> Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(spec): expansions — flags, archive, quality gate, plan-mode-aware Phase 5, /ship integration, tests Builds on the @jayzalowitz foundation (commit a4e6ee3) with the full expansion set from CEO + Eng + DX review (24 user decisions + 23 of 28 codex adversarial findings). spec/SKILL.md.tmpl additions: - Flag reference table (--dedupe / --no-gate / --audit / --execute / --no-execute / --file-only / --plan-file / --sync-archive). - Phase 1b --dedupe (default ON): gh issue list --search with graceful skip on gh-not-installed / unauthed / rate-limited / other errors. AskUserQuestion when matches found (merge / file-new / cancel). - Phase 3 HARD requirement: agent MUST grep/read at least one piece of evidence before asking. Project-level fallback prose for prompts with no concrete file mapping. Greenfield escape clause. - Phase 4.5 quality gate (default ON): codex adversarial dispatch with fail-closed redaction (AWS/GitHub/Anthropic/OpenAI/private-key regex), hard <<<USER_SPEC>>> delimiters + instruction boundary (prompt-injection defense), score 0-10 with <7 block, up to 3 iterations, AskUserQuestion escape on persistent <7 (ship anyway / save draft / one more try). - Phase 5 plan-mode-aware dispatch: reads GSTACK_PLAN_MODE env. Active → file-only + load into plan file. Inactive → file + --execute spawn by default. CLI overrides for explicit control. - Archive block via eval $(gstack-paths) → $GSTACK_STATE_ROOT/projects/ $SLUG/specs/<datetime>-<pid>-<slug>.md. Atomic .tmp/mv write. Sync excluded by default; --sync-archive to opt in. - --execute path: dirty-worktree gate (porcelain check + 3-option AUQ continue/stash/cancel), TOCTOU re-check after AUQ answer, SHA pin via git rev-parse HEAD, unique branch spec/<slug>-$$ + PID-suffixed worktree, mandatory final-confirm gate, stash policy with restore safety (preserve ref, never auto-drop). - TTHW timestamps captured at Phase 1 / first citation / file-or-spawn, emitted as ttfc_ms + tthw_ms in preamble telemetry envelope. Cross-system plumbing: - scripts/resolvers/preamble/generate-preamble-bash.ts: emit GSTACK_PLAN_MODE=active|inactive based on CLAUDE_PLAN_FILE presence. - scripts/resolvers/preamble/generate-routing-injection.ts: add /spec to the routing block injected into project CLAUDE.md. - ship/SKILL.md.tmpl: new "Linked Spec" PR-body section. Reads archive frontmatter spec_issue_number and adds Closes #N when full delivery confirmed by existing plan-completion gate (codex F4 — conditional). Branch-name inference NOT used (codex F3 — fragile under rebase). Tests (W7): - test/spec-template-invariants.test.ts: 35 deterministic assertions covering Phase 1 hard gate, Phase 3 hard-grep mandate, --dedupe graceful-skip paths, --execute race + security hardening (TOCTOU, SHA pin, unique branch), quality-gate redaction + BLOCKED path, archive atomic write + sync exclusion, plan-mode-aware Phase 5. - test/spec-template-sync.test.ts: regen + byte-identical check. - test/skill-e2e-spec-execute.test.ts (periodic-tier scaffold). - test/skill-llm-eval-spec.test.ts (periodic-tier scaffold). - test/helpers/touchfiles.ts: register both periodics in E2E_TIERS + LLM_JUDGE_TOUCHFILES. 37/37 /spec tests pass. Full bun test exit 0 (pre-existing url-validation timeout unrelated to /spec). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore: v1.45.0.0 — regen all SKILL.md, bump VERSION, CHANGELOG entry Mechanical regen pulling in two template-side changes: - /spec expansion (spec/SKILL.md picks up ~1100 new lines) - {{PREAMBLE}} now echoes GSTACK_PLAN_MODE env (every skill picks up the new echo line in the preamble bash block) VERSION 1.44.0.0 → 1.45.0.0 (MINOR per scale-aware rules: substantial new capability — /spec skill with 5 CLI flags + race/security hardening + plan-mode-aware Phase 5 + /ship integration). CHANGELOG entry frames /spec as agent feedstock with the two-line headline, "numbers that matter" table, and "what this means for builders" close. Credits @jayzalowitz for the foundation contribution. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore(spec): register /spec in scripts/proactive-suggestions.json Auto-generated by bun run gen:skill-docs after the v1.46 catalog-trim contract picked up /spec's frontmatter. lead + routing extracted from spec/SKILL.md.tmpl description: block. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore(spec): TODOS deferrals + package.json sync for v1.47.0.0 - TODOS.md: add P2 entry for /spec --epic mode (deferred from CEO SCOPE EXPANSION review), P3 entry for --dedupe semantic matching upgrade. Both have full context blocks so future picker can resume cold. - package.json: bump 1.46.0.0 → 1.47.0.0 to match VERSION (was stale from the main merge; /ship Step 12 idempotency caught it). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs: register /spec skill in README, AGENTS, CLAUDE.md project tree Adds /spec to the three discoverability surfaces it was missing: - README.md sprint skills table (between /autoplan and /learn) - AGENTS.md plan-mode reviews table - CLAUDE.md project structure tree (between /investigate and /retro) /spec shipped in v1.47.0.0 with CHANGELOG coverage but the entry-point docs hadn't been updated; a user landing on README or AGENTS would not discover the skill exists without reading CHANGELOG. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Jay Zalowitz <jayzalowitz@gmail.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 22f8c7f commit f8bb590

68 files changed

Lines changed: 4018 additions & 2 deletions

File tree

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

AGENTS.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -21,6 +21,7 @@ Invoke them by name (e.g., `/office-hours`).
2121
| `/plan-tune` | Self-tune AskUserQuestion sensitivity per question. |
2222
| `/autoplan` | One command runs CEO → design → eng → DX review. |
2323
| `/design-consultation` | Build a complete design system from scratch. |
24+
| `/spec` | Turn vague intent into a precise, executable spec in five phases. Files a GitHub issue, optionally spawns a Claude Code agent in a fresh worktree, and lets `/ship` close the source issue on merge. |
2425

2526
### Implementation + review
2627

CHANGELOG.md

Lines changed: 56 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,61 @@
11
# Changelog
22

3+
## [1.47.0.0] - 2026-05-26
4+
5+
## **`/spec` ships: turn vague intent into a precise, executable spec in five phases.** Pipe the spec into a spawned Claude Code agent, dedupe against existing issues, archive locally for the team corpus, and let `/ship` close the source issue on merge.
6+
7+
A precise spec collapses an agent's clarification roundtrips from N to zero. `/spec` is the verb that turns thoughts into commits: five strict phases (why, scope, technical with mandatory code-reading, draft, file), a codex quality gate before file, archive to `$GSTACK_STATE_ROOT/projects/$SLUG/specs/`, and optional pipeline-mode spawn into a fresh worktree. Plan-mode aware: in plan mode `/spec` files the issue and loads the spec into your active plan file; in execution mode it files the issue and spawns `claude -p` in a fresh worktree by default. `/ship` reads the archive frontmatter and auto-closes the source issue on full delivery. Adapted from a community-contributed `/issue` skill (PR #1698 by @jayzalowitz) with rename, race+security hardening, and DX polish.
8+
9+
`/spec` is the first skill registered against the v1.46 eval-first floor (`test/skill-coverage-matrix.ts`), passing all six structural floor checks plus 37 deterministic invariant assertions specific to `/spec`'s contract. Skill catalog count: 51 → 52.
10+
11+
### The numbers that matter
12+
13+
Source: 1 contributor commit + 8 follow-on bundled fixes/expansions on this branch (`git log v1.46.0.0..HEAD --oneline`). Template at `spec/SKILL.md.tmpl` (404 → ~750 lines after expansions), 4 new test files (37 deterministic scenarios + 2 periodic-tier stubs).
14+
15+
| Capability | Without `/spec` | With `/spec` |
16+
|---|---|---|
17+
| Author backlog-ready issue | freehand prose, sloppy AC, no file refs, 10-15 min per issue | 5-phase interrogation with hard-grep Phase 3, file-refs at `path:line`, quantified impact, ~4 min |
18+
| Spec → agent execution | copy-paste into new `claude -p` session, ~30s context-switch friction | `--execute` spawns automatically in fresh worktree `spec/<slug>-$$`, zero hand-off |
19+
| Catch ambiguity before file | none (you find out when the implementer asks) | codex quality gate scores 0-10, blocks below 7, lists ambiguities, up to 3 iterations |
20+
| Secret leakage to second-AI judge | possible if spec contains pasted secret | fail-closed redaction blocks dispatch on AWS/GitHub/Anthropic key patterns + private key blocks |
21+
| Concurrent `/spec` runs | branch/archive collisions on same second | unique `spec/<slug>-$$` branches, atomic `.tmp/mv` archive write, PID-suffixed filenames |
22+
| Linked issue closure | manual `Closes #N` in PR body | `/ship` auto-adds when archive present AND full spec delivered |
23+
24+
### What this means for builders
25+
26+
Type `/spec` on a vague bug; four minutes later you have a filed GitHub issue with file refs and a Claude Code agent already executing it in a fresh worktree. When `/ship` lands the PR, the source issue auto-closes. The corpus of past specs in `$GSTACK_STATE_ROOT/projects/$SLUG/specs/` is mineable by `gbrain` for cross-session pattern recall. `--no-gate`, `--no-execute`, `--file-only`, and `--plan-file <path>` are escape hatches when the defaults don't fit; `--audit` routes to the Audit/Cleanup template structure.
27+
28+
### Itemized changes
29+
30+
#### Added
31+
- `/spec` skill (renamed from contributor's `/issue`): five-phase interrogation producing backlog-ready specs. Lives at `spec/SKILL.md.tmpl`.
32+
- `--dedupe` flag (default ON): `gh issue list --search` before drafting, surfaces near-duplicates via AskUserQuestion; graceful skip on `gh` missing/unauthed/rate-limited.
33+
- `--execute` flag (default ON in execution mode): spawns `claude -p` in a fresh Conductor worktree on branch `spec/<slug>-$$`, with dirty-worktree gate, TOCTOU re-check after AskUserQuestion answer, SHA pin via `git rev-parse HEAD`, and mandatory final-confirm gate.
34+
- Quality-score gate (default ON): codex adversarial dispatch with hard `<<<USER_SPEC>>>` delimiter + instruction boundary, score 0-10, blocks at <7, up to 3 iterations, AskUserQuestion escape on persistent <7 (ship anyway / save draft / one more try).
35+
- Fail-closed redaction in quality gate: regex match against AWS access keys (`AKIA...`), GitHub tokens (`ghp_/gho_/ghs_`), Anthropic keys (`sk-ant-...`), OpenAI keys, `.env`-style `KEY=value`, and `-----BEGIN ... PRIVATE KEY-----` blocks → block dispatch entirely. Raw spec never persisted to archive or transcript on block.
36+
- `--audit` flag routes Phase 5 to the Audit/Cleanup template structure.
37+
- `--file-only` / `--no-execute` / `--plan-file <path>` overrides for plan-mode-aware Phase 5 default.
38+
- `--sync-archive` opt-in for cross-machine spec sync (archives stay local by default; `/specs/` excluded from artifacts-sync allowlist).
39+
- Spec archive: writes to `$GSTACK_STATE_ROOT/projects/$SLUG/specs/<datetime>-<pid>-<slug>.md` via existing `gstack-paths` resolver (handles `GSTACK_HOME`, `CLAUDE_PLUGIN_DATA`, Windows fallback). Atomic `.tmp/mv` write prevents collision on concurrent runs.
40+
- `GSTACK_PLAN_MODE` env var: emitted by `{{PREAMBLE}}` based on `CLAUDE_PLAN_FILE` presence. Skills can branch behavior on plan-mode state without parsing system reminders.
41+
- `/spec` entry in the gstack routing block injected into project CLAUDE.md.
42+
- `/ship` PR body integration: reads `spec_issue_number` from archive frontmatter and adds `Closes #N` when the spec is fully delivered per the existing plan-completion gate. Partial delivery emits a "Linked to #N (not auto-closing)" notice instead.
43+
- `/spec` entry in `test/skill-coverage-matrix.ts` (52nd skill, eval-first floor compliance per v1.46 contract).
44+
45+
#### Tests
46+
- `test/spec-template-invariants.test.ts`: 35 deterministic invariants covering Phase 1 hard gate, Phase 3 hard-grep mandate, `--dedupe` graceful-skip paths, `--execute` race + security hardening (TOCTOU re-check, SHA pin, unique branch), quality-gate redaction patterns and BLOCKED path, archive atomic write + sync exclusion, plan-mode-aware Phase 5 dispatch.
47+
- `test/spec-template-sync.test.ts`: regenerates `spec/SKILL.md` and asserts byte-identical output (prevents template-vs-generated drift).
48+
- `test/skill-e2e-spec-execute.test.ts` (periodic-tier): full `/spec --execute` pipeline scaffold registered in `E2E_TIERS`.
49+
- `test/skill-llm-eval-spec.test.ts` (periodic-tier): authored-spec quality eval against the 14-Quality-Standards rubric.
50+
51+
#### Fixed
52+
- Duplicate analytics block in `spec/SKILL.md.tmpl` (was bypassing the `_TEL != "off"` opt-out gate; `{{PREAMBLE}}` already emits the analytics write with the guard).
53+
54+
#### For contributors
55+
- Community contribution: PR #1698 by @jayzalowitz (Jay Zalowitz) lands as the foundation commit with original authorship preserved. Contributor's 5 phases, 14 Quality Standards, and Standard/Epic/Audit templates carried forward intact; expansions are additive.
56+
- Plan reviewed across `/plan-ceo-review` (SCOPE EXPANSION, 5 of 6 expansions accepted), `/plan-eng-review` (race + security hardening), and `/plan-devex-review` (persona, magical moment, error-message Tier 1, plan-mode-aware Phase 5).
57+
- 28 codex adversarial findings across 3 review rounds, 23 accepted.
58+
359
## [1.46.0.0] - 2026-05-26
460

561
## **gstack v2 foundation lands. Catalog tokens drop 56%, eval-first floor covers all 51 skills, hard token + dollar caps gate every PR.**

CLAUDE.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -111,6 +111,7 @@ gstack/
111111
├── land-and-deploy/ # /land-and-deploy skill (merge → deploy → canary verify)
112112
├── office-hours/ # /office-hours skill (YC Office Hours — startup diagnostic + builder brainstorm)
113113
├── investigate/ # /investigate skill (systematic root-cause debugging)
114+
├── spec/ # /spec skill (five-phase spec → GitHub issue, optional agent spawn, /ship auto-closes)
114115
├── retro/ # Retrospective skill (includes /retro global cross-project mode)
115116
├── bin/ # CLI utilities (gstack-repo-mode, gstack-slug, gstack-config, etc.)
116117
├── document-release/ # /document-release skill (post-ship doc updates + Diataxis coverage map)

README.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -204,6 +204,7 @@ Each skill feeds into the next. `/office-hours` writes a design doc that `/plan-
204204
| `/browse` | **QA Engineer** | Give the agent eyes. Real Chromium browser, real clicks, real screenshots. ~100ms per command. `/open-gstack-browser` launches GStack Browser with sidebar, anti-bot stealth, and auto model routing. |
205205
| `/setup-browser-cookies` | **Session Manager** | Import cookies from your real browser (Chrome, Arc, Brave, Edge) into the headless session. Test authenticated pages. |
206206
| `/autoplan` | **Review Pipeline** | One command, fully reviewed plan. Runs CEO → design → eng review automatically with encoded decision principles. Surfaces only taste decisions for your approval. |
207+
| `/spec` | **Spec Author** | Turn vague intent into a precise, executable spec in five phases (why, scope, technical with mandatory code-reading, draft, file). Codex quality gate before file (blocks below 7/10), fail-closed secret redaction, dedupe against existing issues, archive to `$GSTACK_STATE_ROOT/projects/$SLUG/specs/` for team-corpus recall. `--execute` spawns `claude -p` in a fresh worktree; `/ship` auto-closes the source issue on merge. Plan-mode aware. |
207208
| `/learn` | **Memory** | Manage what gstack learned across sessions. Review, search, prune, and export project-specific patterns, pitfalls, and preferences. Learnings compound across sessions so gstack gets smarter on your codebase over time. |
208209

209210
### Which review should I use?

SKILL.md

Lines changed: 15 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -102,6 +102,19 @@ _CHECKPOINT_MODE=$(~/.claude/skills/gstack/bin/gstack-config get checkpoint_mode
102102
_CHECKPOINT_PUSH=$(~/.claude/skills/gstack/bin/gstack-config get checkpoint_push 2>/dev/null || echo "false")
103103
echo "CHECKPOINT_MODE: $_CHECKPOINT_MODE"
104104
echo "CHECKPOINT_PUSH: $_CHECKPOINT_PUSH"
105+
# Plan-mode hint for skills like /spec that branch behavior on plan-mode state.
106+
# Claude Code exposes plan mode via system reminders; we detect best-effort
107+
# from CLAUDE_PLAN_FILE (set by the harness when plan mode is active) and
108+
# fall back to "inactive". Codex hosts and Claude execution mode both end up
109+
# inactive, which is the safe default (defaults to file+execute pipeline).
110+
if [ -n "${CLAUDE_PLAN_FILE:-}${GSTACK_PLAN_MODE_FORCE:-}" ]; then
111+
export GSTACK_PLAN_MODE="active"
112+
elif [ "${GSTACK_PLAN_MODE:-}" = "active" ]; then
113+
export GSTACK_PLAN_MODE="active"
114+
else
115+
export GSTACK_PLAN_MODE="inactive"
116+
fi
117+
echo "GSTACK_PLAN_MODE: $GSTACK_PLAN_MODE"
105118
[ -n "$OPENCLAW_SESSION" ] && echo "SPAWNED_SESSION: true" || true
106119
```
107120

@@ -233,6 +246,7 @@ Key routing rules:
233246
- Ship/deploy/PR → invoke /ship or /land-and-deploy
234247
- Save progress → invoke /context-save
235248
- Resume context → invoke /context-restore
249+
- Author a backlog-ready spec/issue → invoke /spec
236250
```
237251

238252
Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"`
@@ -490,6 +504,7 @@ quality gates that produce better results than answering inline.
490504

491505
**Routing rules — when you see these patterns, INVOKE the skill via the Skill tool:**
492506
- User describes a new idea, asks "is this worth building", brainstorms, pitches a concept → invoke `/office-hours`
507+
- User asks to spec something out, file an issue, write up a ticket, "turn this into a GitHub issue", "backlog item" → invoke `/spec`
493508
- User asks about strategy, scope, ambition, "think bigger", "what should we build" → invoke `/plan-ceo-review`
494509
- User asks to review architecture, lock in the plan, "does this design make sense" → invoke `/plan-eng-review`
495510
- User asks about design system, brand, visual identity, "how should this look" → invoke `/design-consultation`

SKILL.md.tmpl

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -32,6 +32,7 @@ quality gates that produce better results than answering inline.
3232

3333
**Routing rules — when you see these patterns, INVOKE the skill via the Skill tool:**
3434
- User describes a new idea, asks "is this worth building", brainstorms, pitches a concept → invoke `/office-hours`
35+
- User asks to spec something out, file an issue, write up a ticket, "turn this into a GitHub issue", "backlog item" → invoke `/spec`
3536
- User asks about strategy, scope, ambition, "think bigger", "what should we build" → invoke `/plan-ceo-review`
3637
- User asks to review architecture, lock in the plan, "does this design make sense" → invoke `/plan-eng-review`
3738
- User asks about design system, brand, visual identity, "how should this look" → invoke `/design-consultation`

TODOS.md

Lines changed: 43 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1768,6 +1768,49 @@ Shipped in v0.6.5. TemplateContext in gen-skill-docs.ts bakes skill name into pr
17681768
**Priority:** P2
17691769
**Depends on:** CDP patches proving the value of anti-bot stealth first
17701770

1771+
## /spec follow-ups (deferred from v1.47.0.0 via /plan-ceo-review SCOPE EXPANSION)
1772+
1773+
### P2: `/spec --epic` mode (parent issue + child issues + dependency graph)
1774+
1775+
**Priority:** P2
1776+
1777+
**What:** Add `--epic` flag that produces an Epic issue (parent) plus N child issues with explicit dependency graph and topological order. Emits multiple `gh issue create` calls with parent linkage in child bodies.
1778+
1779+
**Why:** Multi-week initiatives often span 3-5 specs that share context but ship sequentially. Today `/spec --epic` would let users author the full initiative in one session and file all linked issues atomically. The Epic template already exists in `spec/SKILL.md.tmpl` (carried over from PR #1698); only the flag routing + multi-issue `gh` orchestration is missing.
1780+
1781+
**Pros:**
1782+
- Closes the multi-issue workflow gap that `/spec` v1 doesn't cover.
1783+
- Parent + child linkage means project boards show the full initiative at-a-glance.
1784+
- Composes cleanly with existing `--execute` (spawn an agent on the parent epic; agent files children as it works).
1785+
1786+
**Cons:**
1787+
- More gh API surface (one create per child, parent-link edit pass).
1788+
- Dependency-graph rendering in markdown is fiddly across GitHub vs GitLab renderers.
1789+
1790+
**Context:** Considered in `/plan-ceo-review` SCOPE EXPANSION (D5), deferred 2026-05-25 in favor of shipping the 5 critical-path expansions (--execute, --dedupe, archive, quality gate, --audit). Re-evaluate once v1.47 ships and we see how often users hit "this should be 3 issues" in real /spec sessions.
1791+
1792+
**Depends on:** v1.47.0.0 `/spec` lands first; need real usage data to calibrate the multi-issue surface.
1793+
1794+
### P3: `/spec --dedupe` semantic matching (LLM-based) for v1.1
1795+
1796+
**Priority:** P3
1797+
1798+
**What:** Upgrade `--dedupe`'s string match against `gh issue list --search` to LLM-based semantic similarity. Today's v1 picks string overlap on title keywords; semantic match would catch "the sidebar terminal flakes on reload" matching an existing issue titled "PTY reconnect fails after extension restart" where keyword overlap is zero.
1799+
1800+
**Why:** String match has high precision but low recall — it misses near-duplicates with different vocabulary. LLM semantic match catches more dupes but costs ~$0.01-0.05 per spec dispatch and adds 5-10s latency.
1801+
1802+
**Pros:**
1803+
- Catches dupes string match misses.
1804+
- One more reason `/spec` is more useful than freehand authoring.
1805+
1806+
**Cons:**
1807+
- Paid + slower. Most v1 users probably don't hit enough false-negatives to justify the cost.
1808+
- Adds another LLM-judged decision to a skill that already has the quality gate.
1809+
1810+
**Context:** Considered in `/plan-ceo-review` build-time decisions; chose string match for v1 to keep the dedupe path free + fast. Revisit if v1 produces a meaningful false-negative rate in real use.
1811+
1812+
**Depends on:** v1.47.0.0 ships; gather real false-negative data from the v1 string matcher.
1813+
17711814
## Completed
17721815

17731816
### Slim preamble + real-PTY plan-mode E2E harness (v1.13.1.0)

VERSION

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1 +1 @@
1-
1.46.0.0
1+
1.47.0.0

autoplan/SKILL.md

Lines changed: 14 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -111,6 +111,19 @@ _CHECKPOINT_MODE=$(~/.claude/skills/gstack/bin/gstack-config get checkpoint_mode
111111
_CHECKPOINT_PUSH=$(~/.claude/skills/gstack/bin/gstack-config get checkpoint_push 2>/dev/null || echo "false")
112112
echo "CHECKPOINT_MODE: $_CHECKPOINT_MODE"
113113
echo "CHECKPOINT_PUSH: $_CHECKPOINT_PUSH"
114+
# Plan-mode hint for skills like /spec that branch behavior on plan-mode state.
115+
# Claude Code exposes plan mode via system reminders; we detect best-effort
116+
# from CLAUDE_PLAN_FILE (set by the harness when plan mode is active) and
117+
# fall back to "inactive". Codex hosts and Claude execution mode both end up
118+
# inactive, which is the safe default (defaults to file+execute pipeline).
119+
if [ -n "${CLAUDE_PLAN_FILE:-}${GSTACK_PLAN_MODE_FORCE:-}" ]; then
120+
export GSTACK_PLAN_MODE="active"
121+
elif [ "${GSTACK_PLAN_MODE:-}" = "active" ]; then
122+
export GSTACK_PLAN_MODE="active"
123+
else
124+
export GSTACK_PLAN_MODE="inactive"
125+
fi
126+
echo "GSTACK_PLAN_MODE: $GSTACK_PLAN_MODE"
114127
[ -n "$OPENCLAW_SESSION" ] && echo "SPAWNED_SESSION: true" || true
115128
```
116129

@@ -242,6 +255,7 @@ Key routing rules:
242255
- Ship/deploy/PR → invoke /ship or /land-and-deploy
243256
- Save progress → invoke /context-save
244257
- Resume context → invoke /context-restore
258+
- Author a backlog-ready spec/issue → invoke /spec
245259
```
246260

247261
Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"`

benchmark-models/SKILL.md

Lines changed: 14 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -105,6 +105,19 @@ _CHECKPOINT_MODE=$(~/.claude/skills/gstack/bin/gstack-config get checkpoint_mode
105105
_CHECKPOINT_PUSH=$(~/.claude/skills/gstack/bin/gstack-config get checkpoint_push 2>/dev/null || echo "false")
106106
echo "CHECKPOINT_MODE: $_CHECKPOINT_MODE"
107107
echo "CHECKPOINT_PUSH: $_CHECKPOINT_PUSH"
108+
# Plan-mode hint for skills like /spec that branch behavior on plan-mode state.
109+
# Claude Code exposes plan mode via system reminders; we detect best-effort
110+
# from CLAUDE_PLAN_FILE (set by the harness when plan mode is active) and
111+
# fall back to "inactive". Codex hosts and Claude execution mode both end up
112+
# inactive, which is the safe default (defaults to file+execute pipeline).
113+
if [ -n "${CLAUDE_PLAN_FILE:-}${GSTACK_PLAN_MODE_FORCE:-}" ]; then
114+
export GSTACK_PLAN_MODE="active"
115+
elif [ "${GSTACK_PLAN_MODE:-}" = "active" ]; then
116+
export GSTACK_PLAN_MODE="active"
117+
else
118+
export GSTACK_PLAN_MODE="inactive"
119+
fi
120+
echo "GSTACK_PLAN_MODE: $GSTACK_PLAN_MODE"
108121
[ -n "$OPENCLAW_SESSION" ] && echo "SPAWNED_SESSION: true" || true
109122
```
110123

@@ -236,6 +249,7 @@ Key routing rules:
236249
- Ship/deploy/PR → invoke /ship or /land-and-deploy
237250
- Save progress → invoke /context-save
238251
- Resume context → invoke /context-restore
252+
- Author a backlog-ready spec/issue → invoke /spec
239253
```
240254

241255
Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"`

0 commit comments

Comments
 (0)