You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
* fix: check for bun before running setup (garrytan#147)
Users without bun installed got a cryptic "command not found" error.
Now prints a clear message with install instructions.
Closesgarrytan#147
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: block SSRF via URL validation in browse commands (garrytan#17)
Adds validateNavigationUrl() that blocks non-HTTP(S) schemes (file://,
javascript:, data:) and cloud metadata endpoints (169.254.169.254,
metadata.google.internal). Applied to goto, diff, and newTab commands.
Localhost and private IPs remain allowed for local dev QA.
Closesgarrytan#17
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: replace eval $(gstack-slug) with source <(...) (garrytan#133)
Eliminates unnecessary use of eval across all skill templates and
generated files. source <(...) has identical behavior without the
shell injection surface. Also hardens gstack-diff-scope usage.
Closesgarrytan#133
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: rename /debug to /investigate to avoid Claude Code conflict (garrytan#190)
Claude Code has a built-in /debug command that shadows the gstack skill.
Renaming to /investigate which better reflects the systematic root-cause
investigation methodology.
Closesgarrytan#190
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test: add unit tests for path validation helpers
validateOutputPath() and validateReadPath() are security-critical
functions with zero test coverage. Adds 14 tests covering safe paths,
traversal attacks, and prefix collision edge cases.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: bump version and changelog (v0.8.3)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: update /debug → /investigate references in docs
CLAUDE.md, README.md, and docs/skills.md still referenced the old
/debug skill name after the rename.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: harden URL validation against hostname bypasses (Codex P1)
Codex review found that metadata IPs could be reached via hex
(0xA9FEA9FE), decimal (2852039166), octal, trailing dot, and IPv6
bracket forms. Now normalizes hostnames before checking the blocklist
and probes numeric IP representations via URL constructor.
Also moves URL validation before page allocation in newTab() to
prevent zombie tabs on rejection (Codex P3).
5 new test cases for bypass variants.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy file name to clipboardExpand all lines: CHANGELOG.md
+10-3Lines changed: 10 additions & 3 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -18,6 +18,14 @@
18
18
-**`skip_eng_review` respected everywhere.** If you've opted out of eng review globally, the chaining recommendations won't nag you about it.
19
19
-**Design review lite now tracks commits too.** The lightweight design check that runs inside `/review` and `/ship` gets the same staleness tracking as full reviews.
20
20
21
+
### Fixed
22
+
23
+
-**Browse no longer navigates to dangerous URLs.**`goto`, `diff`, and `newtab` now block `file://`, `javascript:`, `data:` schemes and cloud metadata endpoints (`169.254.169.254`, `metadata.google.internal`). Localhost and private IPs are still allowed for local QA testing. (Closes #17)
24
+
-**Setup script tells you what's missing.** Running `./setup` without `bun` installed now shows a clear error with install instructions instead of a cryptic "command not found." (Closes #147)
25
+
-**`/debug` renamed to `/investigate`.** Claude Code has a built-in `/debug` command that shadowed the gstack skill. The systematic root-cause debugging workflow now lives at `/investigate`. (Closes #190)
26
+
-**Shell injection surface removed.** All skill templates now use `source <(gstack-slug)` instead of `eval $(gstack-slug)`. Same behavior, no `eval`. (Closes #133)
27
+
-**25 new security tests.** URL validation (16 tests) and path traversal validation (14 tests) now have dedicated unit test suites covering scheme blocking, metadata IP blocking, directory escapes, and prefix collision edge cases.
28
+
21
29
## [0.8.2] - 2026-03-19
22
30
23
31
### Added
@@ -86,7 +94,6 @@ When both `/review` (Claude) and `/codex review` have run, you get a cross-model
86
94
### Fixed
87
95
88
96
-`/debug` and `/office-hours` were completely invisible to natural language — no trigger phrases at all. Now both have full reactive + proactive triggers.
89
-
>>>>>>> origin/main
90
97
91
98
## [0.7.0] - 2026-03-18 — YC Office Hours
92
99
@@ -124,7 +131,7 @@ When something is broken and you don't know why, `/debug` is your systematic deb
124
131
### Added
125
132
126
133
-**Every PR touching frontend code now gets a design review automatically.**`/review` and `/ship` apply a 20-item design checklist against changed CSS, HTML, JSX, and view files. Catches AI slop patterns (purple gradients, 3-column icon grids, generic hero copy), typography issues (body text < 16px, blacklisted fonts), accessibility gaps (`outline: none`), and `!important` abuse. Mechanical CSS fixes are auto-applied; design judgment calls ask you first.
127
-
-**`gstack-diff-scope` categorizes what changed in your branch.** Run `eval $(gstack-diff-scope main)` and get `SCOPE_FRONTEND=true/false`, `SCOPE_BACKEND`, `SCOPE_PROMPTS`, `SCOPE_TESTS`, `SCOPE_DOCS`, `SCOPE_CONFIG`. Design review uses it to skip silently on backend-only PRs. Ship pre-flight uses it to recommend design review when frontend files are touched.
134
+
-**`gstack-diff-scope` categorizes what changed in your branch.** Run `source <(gstack-diff-scope main)` and get `SCOPE_FRONTEND=true/false`, `SCOPE_BACKEND`, `SCOPE_PROMPTS`, `SCOPE_TESTS`, `SCOPE_DOCS`, `SCOPE_CONFIG`. Design review uses it to skip silently on backend-only PRs. Ship pre-flight uses it to recommend design review when frontend files are touched.
128
135
-**Design review shows up in the Review Readiness Dashboard.** The dashboard now distinguishes between "LITE" (code-level, runs automatically in /review and /ship) and "FULL" (visual audit via /plan-design-review with browse binary). Both show up as Design Review entries.
129
136
-**E2E eval for design review detection.** Planted CSS/HTML fixtures with 7 known anti-patterns (Papyrus font, 14px body text, `outline: none`, `!important`, purple gradient, generic hero copy, 3-column feature grid). The eval verifies `/review` catches at least 4 of 7.
130
137
@@ -240,7 +247,7 @@ Read the philosophy: https://garryslist.org/posts/boil-the-ocean
240
247
## 0.5.1 — 2026-03-17
241
248
-**Know where you stand before you ship.** Every `/plan-ceo-review`, `/plan-eng-review`, and `/plan-design-review` now logs its result to a review tracker. At the end of each review, you see a **Review Readiness Dashboard** showing which reviews are done, when they ran, and whether they're clean — with a clear CLEARED TO SHIP or NOT READY verdict.
242
249
-**`/ship` checks your reviews before creating the PR.** Pre-flight now reads the dashboard and asks if you want to continue when reviews are missing. Informational only — it won't block you, but you'll know what you skipped.
243
-
-**One less thing to copy-paste.** The SLUG computation (that opaque sed pipeline for computing `owner-repo` from git remote) is now a shared `bin/gstack-slug` helper. All 14 inline copies across templates replaced with `eval $(gstack-slug)`. If the format ever changes, fix it once.
250
+
-**One less thing to copy-paste.** The SLUG computation (that opaque sed pipeline for computing `owner-repo` from git remote) is now a shared `bin/gstack-slug` helper. All 14 inline copies across templates replaced with `source <(gstack-slug)`. If the format ever changes, fix it once.
244
251
-**Screenshots are now visible during QA and browse sessions.** When gstack takes screenshots, they now show up as clickable image elements in your output — no more invisible `/tmp/browse-screenshot.png` paths you can't see. Works in `/qa`, `/qa-only`, `/plan-design-review`, `/qa-design-review`, `/browse`, and `/gstack`.
Copy file name to clipboardExpand all lines: README.md
+5-5Lines changed: 5 additions & 5 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -48,11 +48,11 @@ Expect first useful run in under 5 minutes on any repo with tests already set up
48
48
49
49
Open Claude Code and paste this. Claude does the rest.
50
50
51
-
> Install gstack: run **`git clone https://github.com/garrytan/gstack.git ~/.claude/skills/gstack && cd ~/.claude/skills/gstack && ./setup`** then add a "gstack" section to CLAUDE.md that says to use the /browse skill from gstack for all web browsing, never use mcp\_\_claude-in-chrome\_\_\* tools, and lists the available skills: /office-hours, /plan-ceo-review, /plan-eng-review, /plan-design-review, /design-consultation, /review, /ship, /browse, /qa, /qa-only, /design-review, /setup-browser-cookies, /retro, /debug, /document-release, /codex, /careful, /freeze, /guard, /unfreeze, /gstack-upgrade. Then ask the user if they also want to add gstack to the current project so teammates get it.
51
+
> Install gstack: run **`git clone https://github.com/garrytan/gstack.git ~/.claude/skills/gstack && cd ~/.claude/skills/gstack && ./setup`** then add a "gstack" section to CLAUDE.md that says to use the /browse skill from gstack for all web browsing, never use mcp\_\_claude-in-chrome\_\_\* tools, and lists the available skills: /office-hours, /plan-ceo-review, /plan-eng-review, /plan-design-review, /design-consultation, /review, /ship, /browse, /qa, /qa-only, /design-review, /setup-browser-cookies, /retro, /investigate, /document-release, /codex, /careful, /freeze, /guard, /unfreeze, /gstack-upgrade. Then ask the user if they also want to add gstack to the current project so teammates get it.
52
52
53
53
### Step 2: Add to your repo so teammates get it (optional)
54
54
55
-
> Add gstack to this project: run **`cp -Rf ~/.claude/skills/gstack .claude/skills/gstack && rm -rf .claude/skills/gstack/.git && cd .claude/skills/gstack && ./setup`** then add a "gstack" section to this project's CLAUDE.md that says to use the /browse skill from gstack for all web browsing, never use mcp\_\_claude-in-chrome\_\_\* tools, lists the available skills: /office-hours, /plan-ceo-review, /plan-eng-review, /plan-design-review, /design-consultation, /review, /ship, /browse, /qa, /qa-only, /design-review, /setup-browser-cookies, /retro, /debug, /document-release, /codex, /careful, /freeze, /guard, /unfreeze, /gstack-upgrade, and tells Claude that if gstack skills aren't working, run `cd .claude/skills/gstack && ./setup` to build the binary and register skills.
55
+
> Add gstack to this project: run **`cp -Rf ~/.claude/skills/gstack .claude/skills/gstack && rm -rf .claude/skills/gstack/.git && cd .claude/skills/gstack && ./setup`** then add a "gstack" section to this project's CLAUDE.md that says to use the /browse skill from gstack for all web browsing, never use mcp\_\_claude-in-chrome\_\_\* tools, lists the available skills: /office-hours, /plan-ceo-review, /plan-eng-review, /plan-design-review, /design-consultation, /review, /ship, /browse, /qa, /qa-only, /design-review, /setup-browser-cookies, /retro, /investigate, /document-release, /codex, /careful, /freeze, /guard, /unfreeze, /gstack-upgrade, and tells Claude that if gstack skills aren't working, run `cd .claude/skills/gstack && ./setup` to build the binary and register skills.
56
56
57
57
Real files get committed to your repo (not a submodule), so `git clone` just works. Everything lives inside `.claude/`. Nothing touches your PATH or runs in the background.
58
58
@@ -117,7 +117,7 @@ One sprint, one person, one feature — that takes about 30 minutes with gstack.
117
117
|`/plan-design-review`|**Senior Designer**| Rates each design dimension 0-10, explains what a 10 looks like, then edits the plan to get there. AI Slop detection. Interactive — one AskUserQuestion per design choice. |
118
118
|`/design-consultation`|**Design Partner**| Build a complete design system from scratch. Knows the landscape, proposes creative risks, generates realistic product mockups. Design at the heart of all other phases. |
119
119
|`/review`|**Staff Engineer**| Find the bugs that pass CI but blow up in production. Auto-fixes the obvious ones. Flags completeness gaps. |
120
-
|`/debug`|**Debugger**| Systematic root-cause debugging. Iron Law: no fixes without investigation. Traces data flow, tests hypotheses, stops after 3 failed fixes. |
120
+
|`/investigate`|**Debugger**| Systematic root-cause debugging. Iron Law: no fixes without investigation. Traces data flow, tests hypotheses, stops after 3 failed fixes. |
121
121
|`/design-review`|**Designer Who Codes**| Same audit as /plan-design-review, then fixes what it finds. Atomic commits, before/after screenshots. |
122
122
|`/qa`|**QA Lead**| Test your app, find bugs, fix them with atomic commits, re-verify. Auto-generates regression tests for every fix. |
123
123
|`/qa-only`|**QA Reporter**| Same methodology as /qa but report only. Use when you want a pure bug report without code changes. |
@@ -158,7 +158,7 @@ One sprint, one person, one feature — that takes about 30 minutes with gstack.
158
158
159
159
**Multi-AI second opinion.**`/codex` gets an independent review from OpenAI's Codex CLI — a completely different AI looking at the same diff. Three modes: code review with a pass/fail gate, adversarial challenge that actively tries to break your code, and open consultation with session continuity. When both `/review` (Claude) and `/codex` (OpenAI) have reviewed the same branch, you get a cross-model analysis showing which findings overlap and which are unique to each.
160
160
161
-
**Safety guardrails on demand.** Say "be careful" and `/careful` warns before any destructive command — rm -rf, DROP TABLE, force-push, git reset --hard. `/freeze` locks edits to one directory while debugging so Claude can't accidentally "fix" unrelated code. `/guard` activates both. `/debug` auto-freezes to the module being investigated.
161
+
**Safety guardrails on demand.** Say "be careful" and `/careful` warns before any destructive command — rm -rf, DROP TABLE, force-push, git reset --hard. `/freeze` locks edits to one directory while debugging so Claude can't accidentally "fix" unrelated code. `/guard` activates both. `/investigate` auto-freezes to the module being investigated.
162
162
163
163
**Proactive skill suggestions.** gstack notices what stage you're in — brainstorming, reviewing, debugging, testing — and suggests the right skill. Don't like it? Say "stop suggesting" and it remembers across sessions.
164
164
@@ -213,7 +213,7 @@ Fifteen specialists and six power tools. All slash commands. All Markdown. All f
213
213
Use /browse from gstack for all web browsing. Never use mcp__claude-in-chrome__* tools.
214
214
Available skills: /office-hours, /plan-ceo-review, /plan-eng-review, /plan-design-review,
Copy file name to clipboardExpand all lines: TODOS.md
+6-6Lines changed: 6 additions & 6 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -512,25 +512,25 @@ Shipped as `/careful`, `/freeze`, `/guard`, and `/unfreeze` in v0.6.5. Includes
512
512
513
513
Shipped in v0.6.5. TemplateContext in gen-skill-docs.ts bakes skill name into preamble telemetry line. Analytics CLI (`bun run analytics`) for querying. /retro integration shows skills-used-this-week.
514
514
515
-
### /debug scoped debugging enhancements (gated on telemetry)
515
+
### /investigate scoped debugging enhancements (gated on telemetry)
516
516
517
-
**What:** Six enhancements to /debug auto-freeze, contingent on telemetry showing the freeze hook actually fires in real debugging sessions.
517
+
**What:** Six enhancements to /investigate auto-freeze, contingent on telemetry showing the freeze hook actually fires in real debugging sessions.
518
518
519
-
**Why:** /debug v0.7.1 auto-freezes edits to the module being debugged. If telemetry shows the hook fires often, these enhancements make the experience smarter. If it never fires, the problem wasn't real and these aren't worth building.
519
+
**Why:** /investigate v0.7.1 auto-freezes edits to the module being debugged. If telemetry shows the hook fires often, these enhancements make the experience smarter. If it never fires, the problem wasn't real and these aren't worth building.
520
520
521
-
**Context:** All items are prose additions to `debug/SKILL.md.tmpl`. No new scripts.
521
+
**Context:** All items are prose additions to `investigate/SKILL.md.tmpl`. No new scripts.
0 commit comments