Merged
59 changes: 59 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,64 @@
# Changelog

## [1.45.0.0] - 2026-05-25

## **Design boards now live 24 hours, not 10 minutes. One daemon hosts every board, one tab survives the whole day.**

Run `$D compare --serve` and you get a persistent design daemon at `.gstack/design.json` instead of a fresh process per call. Open three design sessions across an afternoon and they all land at `/boards/<id>/` on the same port. The browser tab you opened first still works for the board you published an hour later. The idle timeout went from 10 minutes (the old per-process server) to 24 hours of inactivity (the daemon's lifetime). Submit a board, the URL stays accessible until the daemon idles out, so you can scroll back through the day's design history at `http://127.0.0.1:N/`.

Skill invocations (`/design-shotgun`, `/design-consultation`, `/plan-design-review`, `/design-review`, `/office-hours`) keep calling `$D compare --serve` exactly the same way. The CLI shape is unchanged. What's different is the binary now self-execs into daemon mode under the hood, attaches to a running daemon if one is there, spawns a fresh one if not, and prints `BOARD_PUBLISHED: http://127.0.0.1:N/boards/<id>/` to stderr so the skill can echo the URL. The legacy `--no-daemon` flag preserves the old single-process behavior for tests and debugging.

### The numbers that matter

Source: `bun test design/test/` and `git diff origin/main...HEAD --stat`.

| Metric | Before | After | Δ |
|-----------------------------------------|---------------|---------------|----------------|
| Idle timeout per board | 10 minutes | 24 hours | 144× |
| Server processes for N boards | N | 1 | N× |
| Browser tabs to keep open | one per board | one total | N× |
| Design tests in repo | 16 | 77 | +61 |
| Test paths covered (failure modes) | not enumerated| 38 / 100% | full coverage |
| Plan-review findings absorbed pre-impl | 2 | 19 | +17 (from Codex) |

| Component | New lines | Tests |
|----------------------------|-----------|------------|
| design/src/daemon.ts | ~580 | 34 tests |
| design/src/daemon-client.ts| ~340 | 23 tests |
| design/src/daemon-state.ts | ~180 | (via client + daemon tests; direct stale-lock reclaim coverage) |
| Browser round-trip via HTTP| (existed) | 4 tests |

In short: 61 new tests cover every endpoint, lifecycle path, LRU eviction, real idle-shutdown behavior (spawn-based, daemon process observed exiting after `IDLE_MS`), the bare-GET-doesn't-reset-idle invariant (poll loop in background, daemon still idles out), the idle-with-active-boards extension path with its `MAX_EXTENSIONS` hard ceiling, the concurrent-CLIs lock race (two parallel `ensureDaemon` calls converge on one daemon), identity-verified spawn, version mismatch with and without active boards, PID-reuse safety, path traversal rejection, malformed-body negatives on every POST, and cross-board feedback isolation.

The plan-review pass caught 2 architectural issues in-house; an outside Codex pass caught 17 more, all absorbed into the plan before any code was written. The /ship review army then caught 1 backwards-compat break in skill resolvers (fixed) and 5 deferred test gaps (filled). The version-mismatch path now refuses to silently kill a daemon with active boards (it prints a warning and exits 1), so upgrading gstack mid-design-session doesn't drop your in-memory board history.
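
The idle rules exercised by those tests can be sketched as a small pure function. This is an illustration under stated assumptions, not the code in `design/src/daemon.ts`: only `IDLE_MS` and `MAX_EXTENSIONS` are names from this changelog, the value `4` is inferred from the 24h timeout, 1h extensions, and 28h ceiling listed under "Added", and which requests count as activity is a guess.

```typescript
// Hypothetical sketch: every name except IDLE_MS and MAX_EXTENSIONS is made up here.
const IDLE_MS = 24 * 60 * 60 * 1000; // 24h idle timeout
const EXTENSION_MS = 60 * 60 * 1000; // 1h per extension
const MAX_EXTENSIONS = 4;            // assumed: 24h + 4 x 1h = 28h ceiling

interface IdleState {
  lastActivityAt: number; // ms timestamp of the last request that counts as activity
  extensionsUsed: number; // extensions granted since that activity
}

type IdleDecision = "keep-running" | "extend" | "shutdown";

// Assumption: only state-changing requests reset the idle clock, so a
// background GET poll loop cannot keep the daemon alive forever.
function countsAsActivity(method: string): boolean {
  return method !== "GET" && method !== "HEAD";
}

// Called periodically. On "extend" the caller increments extensionsUsed.
function decideOnIdleCheck(
  state: IdleState,
  now: number,
  activeBoards: number,
): IdleDecision {
  const deadline =
    state.lastActivityAt + IDLE_MS + state.extensionsUsed * EXTENSION_MS;
  if (now < deadline) return "keep-running";
  if (activeBoards > 0 && state.extensionsUsed < MAX_EXTENSIONS) return "extend";
  return "shutdown";
}
```

Keeping the decision free of timers and I/O is one way to make the ceiling testable with plain timestamps; the spawn-based tests described above would still be needed to prove the process actually exits.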

### What this means for the builder

Open `/design-shotgun` Monday morning, work through three rounds of variants, walk away for lunch, come back, click Submit. The board is still there. Open a second `/design-shotgun` for a different feature in the afternoon, get a new URL at `/boards/<another-id>/`, no port churn, your morning board still works. The whole day's worth of design exploration accumulates as a browsable history at the daemon's root. Stop worrying about the 10-minute death clock.

### Itemized changes

#### Added
- **Persistent design daemon** (`design/src/daemon.ts`). Bun HTTP server on `127.0.0.1` hosting many boards under `/boards/<id>/`. Per-board state machine (`serving | regenerating | done`), LRU cap of 50 boards (evicts `done` first, returns 503 when 50 non-done coexist), 24h idle timeout with 1h extensions up to a 28h ceiling when boards are still active, per-board async mutex serializing feedback POST vs reload POST. Index page at `/` lists recent boards newest first.
- **`$D daemon status`** and **`$D daemon stop [--force]`**. The stop sub-command refuses without `--force` when active boards exist, so a casual stop doesn't drop in-flight history.
- **Daemon client** (`design/src/daemon-client.ts`). `ensureDaemon()` handles spawn-or-attach with file-lock-protected spawn (re-reads state inside the lock to close the two-CLIs-race window) and identity-verified SIGTERM (reads `/proc/PID/cmdline` on Linux, `ps -p PID -o command=` on macOS, only signals if `gstack-design-daemon` is in the cmdline). PID-reuse safety: if the state file points at a PID belonging to an unrelated process, no signal is sent and a fresh daemon spawns. Version-mismatch refusal: if a CLI from a newer gstack version arrives while boards are still open in an older daemon, the CLI prints a user-actionable warning and exits 1 instead of silently restarting and losing history.
- **Shared daemon state utilities** (`design/src/daemon-state.ts`). Atomic state-file write (`<tmp>` + `renameSync` at mode `0o600`), `fs.openSync('wx')` exclusive lock, cross-platform cmdline reader, version lookup that falls back through `DESIGN_DAEMON_VERSION` env → `design/dist/.version` baked at build time → source-tree `VERSION` → `"unknown"`.
- **End-to-end round-trip tests against a real spawned daemon** (`design/test/feedback-roundtrip-daemon.test.ts`). HTTP fetch drives publish → submit → regenerate → reload → round-2 submit, asserting `feedback.json` lands at the daemon-derived `sourceDir` with `boardId` and `publishedAt` augmented fields.
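
The LRU rule described for the daemon (cap of 50, evict `done` boards first, 503 when every slot holds a non-`done` board) can be sketched as follows. The three state names come from the changelog; `Board`, `lastTouchedAt`, and `admitBoard` are hypothetical names for illustration, and the real eviction order may differ.

```typescript
// Hypothetical sketch of the admission rule, not the daemon's actual code.
type BoardState = "serving" | "regenerating" | "done";

interface Board {
  id: string;
  state: BoardState;
  lastTouchedAt: number; // ms timestamp; smaller means less recently used
}

type Admission =
  | { action: "admit" }             // room left, nothing to evict
  | { action: "evict"; id: string } // evict this done board, then admit
  | { action: "reject-503" };       // cap reached and no board is done

const MAX_BOARDS = 50;

function admitBoard(boards: Board[], cap: number = MAX_BOARDS): Admission {
  if (boards.length < cap) return { action: "admit" };
  const done = boards.filter((b) => b.state === "done");
  if (done.length === 0) return { action: "reject-503" };
  // Least recently touched done board goes first.
  const oldest = done.reduce((a, b) => (b.lastTouchedAt < a.lastTouchedAt ? b : a));
  return { action: "evict", id: oldest.id };
}
```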

#### Changed
- **Board JS uses relative URLs** instead of an injected `__GSTACK_SERVER_URL` global. The same generated HTML works at `/` (legacy `--no-daemon`) and `/boards/<id>/` (daemon). `location.protocol` feature-detect keeps the `file://` DOM-only fallback path working.
- **Bare `GET /boards/<id>` returns 301** to `/boards/<id>/`. The trailing slash is load-bearing for relative-URL resolution in the board JS; without it, `fetch('./api/feedback')` would resolve to the wrong scope.
- **Reload guard rejects directory paths**. `design/src/serve.ts:200-212` previously let `resolvedReload === allowedDir` through, which then crashed `readFileSync` with `EISDIR`. Now requires `statSync(resolvedReload).isFile()` with a clear 400 instead.
- **Feedback files carry `boardId` and `publishedAt`** so agents polling `feedback.json` / `feedback-pending.json` in a multi-board world can verify which board produced what.
- **`sourceDir` is derived from `realpath(html)` server-side**, never trusted from the publish POST body.
- **Skill resolvers and templates** (`scripts/resolvers/design.ts`, `design-shotgun/SKILL.md`, `design-consultation/SKILL.md`, `plan-design-review/SKILL.md`, `office-hours/SKILL.md`) updated to parse `BOARD_URL:` from stderr and POST reloads to `${BOARD_URL}api/reload` instead of the legacy port-only `/api/reload`. Legacy `SERVE_STARTED: port=N html=...` line still emitted for back-compat.
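
Why the trailing slash matters can be checked with the standard `URL` constructor, which resolves relative references the same way the browser does for `fetch`. The port `4000` and board id `abc123` below are placeholders, and `resolveFromPage` is a hypothetical helper name.

```typescript
// Resolve a relative request path against the page URL, as fetch() would.
function resolveFromPage(relative: string, pageUrl: string): string {
  return new URL(relative, pageUrl).pathname;
}

// With the trailing slash the board id stays in the path.
console.log(resolveFromPage("./api/feedback", "http://127.0.0.1:4000/boards/abc123/"));
// /boards/abc123/api/feedback

// Without it, the last segment ("abc123") is treated as a file name and dropped.
console.log(resolveFromPage("./api/feedback", "http://127.0.0.1:4000/boards/abc123"));
// /boards/api/feedback

// Legacy --no-daemon mode serves the board at "/", so the same HTML still works.
console.log(resolveFromPage("./api/feedback", "http://127.0.0.1:4000/"));
// /api/feedback
```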

#### Fixed
- **Compiled design binary self-execs as the daemon** via a `--daemon-mode` flag, so the daemon lifecycle works for users installing from `design/dist/design` (not just `bun run` against the source tree).
- **Version lookup** is consistent between client and daemon. Both go through `readVersionString()`, so the version-mismatch refusal path works on the compiled binary instead of always reading `"unknown"` and matching itself.
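
The fallback chain for the version string (env variable, then the baked `.version` file, then the source-tree `VERSION`, then `"unknown"`) amounts to "first non-empty source wins". A minimal sketch, assuming each source is wrapped in a function that may return nothing or throw; `firstVersion` and `VersionSource` are hypothetical names, not the signature of `readVersionString()`.

```typescript
// Hypothetical sketch of a first-non-empty fallback chain.
type VersionSource = () => string | undefined;

function firstVersion(sources: VersionSource[]): string {
  for (const source of sources) {
    let value: string | undefined;
    try {
      value = source()?.trim();
    } catch {
      value = undefined; // e.g. a missing file: fall through to the next source
    }
    if (value) return value;
  }
  return "unknown";
}
```

In the real chain the sources would presumably read `DESIGN_DAEMON_VERSION`, `design/dist/.version`, and `VERSION` in that order; sharing one function between client and daemon is what keeps the two sides from disagreeing.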

#### For contributors
- **Test infrastructure split**: `design/test/daemon.test.ts` (34 in-process tests against the exported `fetchHandler`, ~70ms) for fast iteration; `design/test/daemon-discovery.test.ts` (23 real-spawn tests, ~8s) for lifecycle + lock + identity guarantees. Shared helpers in `design/test/daemon-tests-fixtures.ts`.
- **Plan-review process**: this branch ran `/plan-eng-review` twice. Round 1 caught 2 architecture findings. An outside-voice Codex pass after round 1 found 17 more (URL contract self-contradiction, false test-green claim, lock semantics, identity verification, version-mismatch silent data loss, several others). Round 2 absorbed all 17 before implementation started. The full review trail is preserved in the plan file's `## GSTACK REVIEW REPORT` section.

## [1.44.1.0] - 2026-05-24

## **Nine community fixes ship in one bundle.** Office-hours session counter works again, iOS QA tunnels survive macOS 26.x, Windows brain-sync stops dropping artifacts, browse server tells you whether the bind failure was a port collision or a sandbox block.
48 changes: 48 additions & 0 deletions TODOS.md
@@ -1,5 +1,53 @@
# TODOS

## design daemon: follow-ups (filed v1.45.0.0 via /ship review army)

### ✅ DONE (v1.45.0.0): Tighten daemon test coverage

**Resolved in commit `6b037c55` (same PR):** All 5 test gaps filled before
landing. Per-file totals after: serve 16, daemon 34, daemon-discovery 23,
feedback-roundtrip-daemon 4 = 77 (+10 from initial ship). Specifically:
- Idle-shutdown actually fires (spawn-based, daemon process observed exiting,
state file removed).
- Bare GET polling doesn't reset idle (hammers `/api/progress` in background,
daemon still idles out).
- Idle-with-active-boards extends, then force-shuts after MAX_EXTENSIONS
(with `DESIGN_DAEMON_EXTENSION_MS=1500` + `MAX_EXTENSIONS=2`).
- Concurrent `ensureDaemon()` race converges on one daemon (lock wins).
- Stale-lock reclaim (dead PID succeeds, alive unrelated PID refuses).
- Malformed-JSON + non-object + array-body + missing-html negatives for
`POST /api/boards` and `POST /boards/<id>/api/reload`.

### P3: Minor maintainability nits from /ship review

- `design/src/cli.ts` and `design/src/serve.ts` both have a small `openBrowser`
helper with identical darwin/linux/else branches. Extract a shared
`design/src/open-browser.ts`.
- `design/src/daemon-client.ts:320` (`AbortSignal.timeout(2000)`) and `:357`
(`delay(50)`) use bare numeric literals while sibling timeouts are named
constants. Promote to `SHUTDOWN_POST_TIMEOUT_MS` and `ALIVE_POLL_INTERVAL_MS`.
- `design/src/daemon-state.ts:21` `serverPath` field is written
(`daemon.ts:541`) but never read by production code. Either remove or
document the forensic intent.

### P3: Daemon scope deferred from v1.45.0.0 plan

Originally listed in the plan's "TODOs surfaced for later" section:

- Per-daemon scoped auth tokens (only relevant once a tunnel/share use case appears).
- Optional persistent board history on disk in
`~/.gstack/projects/$SLUG/designs/history/` so submitted boards survive
daemon restarts.
- Windows spawn branch lifted from browse (V1 daemon is macOS + Linux;
Windows users fall back to legacy `--no-daemon` per-process server).
- `$D board list` / `$D board stop <id>` per-board ops CLI (V1 has only
`$D daemon status` / `stop`).
- Cross-worktree daemon attach (conductor sibling worktrees of the same
repo currently each spawn their own daemon — matches browse; revisit
if it causes friction).

---

## browse server: terminal-agent teardown follow-ups (filed v1.41 via /plan-eng-review)

### ✅ DONE (v1.44.0.0): Identity-based terminal-agent kill (replace pkill regex with PID)
2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
-1.44.1.0
+1.45.0.0
22 changes: 17 additions & 5 deletions design-consultation/SKILL.md
@@ -1307,20 +1307,27 @@ This command generates the board HTML, starts an HTTP server on a random port,
and opens it in the user's default browser. **Run it in the background** with `&`
because the server needs to stay running while the user interacts with the board.

-Parse the port from stderr output: `SERVE_STARTED: port=XXXXX`. You need this
-for the board URL and for reloading during regeneration cycles.
+Parse the board URL from stderr output. Default daemon path:
+`BOARD_URL: http://127.0.0.1:N/boards/<id>/` (already includes the per-board
+path; use this for the AskUserQuestion URL AND as the base for the reload
+endpoint). Legacy `--no-daemon` path emits `SERVE_STARTED: port=XXXXX` and
+serves a single board at `/`, with reload at `/api/reload` — only relevant
+when an external caller explicitly passes `--no-daemon`.
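
A caller that captures stderr programmatically could extract the board URL along these lines. This is a sketch that assumes the two line formats quoted above; `parseServeOutput` and `ServeInfo` are hypothetical names, and the `BOARD_URL:` line is checked first because the legacy `SERVE_STARTED:` line is still emitted for back-compat.

```typescript
// Hypothetical helper: pull the board URL out of captured stderr text.
interface ServeInfo {
  boardUrl: string;           // always ends with "/"
  mode: "daemon" | "legacy";
}

function parseServeOutput(stderr: string): ServeInfo | null {
  // Daemon path: the line already carries the full per-board URL.
  const daemon = stderr.match(/^BOARD_URL:\s*(\S+)/m);
  if (daemon) {
    const raw = daemon[1];
    return { boardUrl: raw.endsWith("/") ? raw : raw + "/", mode: "daemon" };
  }
  // Legacy --no-daemon path: only a port is reported; the board lives at "/".
  const legacy = stderr.match(/^SERVE_STARTED:\s*port=(\d+)/m);
  if (legacy) {
    return { boardUrl: `http://127.0.0.1:${legacy[1]}/`, mode: "legacy" };
  }
  return null; // neither marker seen yet
}
```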

**PRIMARY WAIT: AskUserQuestion with board URL**

After the board is serving, use AskUserQuestion to wait for the user. Include the
board URL so they can click it if they lost the browser tab:

"I've opened a comparison board with the design variants:
-http://127.0.0.1:<PORT>/ — Rate them, leave comments, remix
+<BOARD_URL> — Rate them, leave comments, remix
elements you like, and click Submit when you're done. Let me know when you've
submitted your feedback (or paste your preferences here). If you clicked
Regenerate or Remix on the board, tell me and I'll generate new variants."

+Substitute `<BOARD_URL>` with the URL parsed from stderr (the daemon path
+emits `BOARD_URL: http://127.0.0.1:N/boards/<id>/`).

**Do NOT use AskUserQuestion to ask which variant the user prefers.** The comparison
board IS the chooser. AskUserQuestion is just the blocking wait mechanism.

@@ -1364,8 +1371,13 @@ the approved variant.
2. If `regenerateAction` is `"remix"`, read `remixSpec` (e.g. `{"layout":"A","colors":"B"}`)
3. Generate new variants with `$D iterate` or `$D variants` using updated brief
4. Create new board: `$D compare --images "..." --output "$_DESIGN_DIR/design-board.html"`
-5. Reload the board in the user's browser (same tab):
-`curl -s -X POST http://127.0.0.1:PORT/api/reload -H 'Content-Type: application/json' -d '{"html":"$_DESIGN_DIR/design-board.html"}'`
+5. Reload the board in the user's browser (same tab) — the URL is per-board
+under daemon mode, so use `<BOARD_URL>` (from the `BOARD_URL:` stderr
+line) as the base:
+`curl -s -X POST "${BOARD_URL}api/reload" -H 'Content-Type: application/json' -d '{"html":"$_DESIGN_DIR/design-board.html"}'`
+Under `--no-daemon` the reload endpoint is `/api/reload` at the legacy
+port; this path only matters if the caller explicitly opted out of the
+daemon.
6. The board auto-refreshes. **AskUserQuestion again** with the same board URL to
wait for the next round of feedback. Repeat until `feedback.json` appears.
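
For callers that issue the reload from code instead of `curl`, the request in step 5 can be sketched as below. `buildReloadRequest` is a hypothetical helper; it assumes the board URL ends with `/` (as the `BOARD_URL:` line does) and that the endpoint accepts the same `{"html": ...}` body shown in the `curl` command.

```typescript
// Hypothetical helper: build the reload POST for either daemon or legacy mode.
function buildReloadRequest(boardUrl: string, htmlPath: string) {
  return {
    // Relative resolution keeps the /boards/<id>/ prefix when one is present.
    url: new URL("api/reload", boardUrl).toString(),
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      // JSON.stringify escapes the path, so no shell quoting is involved.
      body: JSON.stringify({ html: htmlPath }),
    },
  };
}

// Usage (not executed here): const req = buildReloadRequest(boardUrl, path);
// await fetch(req.url, req.init);
```

The same helper covers the legacy case: with a base of `http://127.0.0.1:PORT/` the URL resolves to `/api/reload` at that port.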

22 changes: 17 additions & 5 deletions design-shotgun/SKILL.md
@@ -1193,20 +1193,27 @@ This command generates the board HTML, starts an HTTP server on a random port,
and opens it in the user's default browser. **Run it in the background** with `&`
because the server needs to stay running while the user interacts with the board.

-Parse the port from stderr output: `SERVE_STARTED: port=XXXXX`. You need this
-for the board URL and for reloading during regeneration cycles.
+Parse the board URL from stderr output. Default daemon path:
+`BOARD_URL: http://127.0.0.1:N/boards/<id>/` (already includes the per-board
+path; use this for the AskUserQuestion URL AND as the base for the reload
+endpoint). Legacy `--no-daemon` path emits `SERVE_STARTED: port=XXXXX` and
+serves a single board at `/`, with reload at `/api/reload` — only relevant
+when an external caller explicitly passes `--no-daemon`.

**PRIMARY WAIT: AskUserQuestion with board URL**

After the board is serving, use AskUserQuestion to wait for the user. Include the
board URL so they can click it if they lost the browser tab:

"I've opened a comparison board with the design variants:
-http://127.0.0.1:<PORT>/ — Rate them, leave comments, remix
+<BOARD_URL> — Rate them, leave comments, remix
elements you like, and click Submit when you're done. Let me know when you've
submitted your feedback (or paste your preferences here). If you clicked
Regenerate or Remix on the board, tell me and I'll generate new variants."

+Substitute `<BOARD_URL>` with the URL parsed from stderr (the daemon path
+emits `BOARD_URL: http://127.0.0.1:N/boards/<id>/`).

**Do NOT use AskUserQuestion to ask which variant the user prefers.** The comparison
board IS the chooser. AskUserQuestion is just the blocking wait mechanism.

@@ -1250,8 +1257,13 @@ the approved variant.
2. If `regenerateAction` is `"remix"`, read `remixSpec` (e.g. `{"layout":"A","colors":"B"}`)
3. Generate new variants with `$D iterate` or `$D variants` using updated brief
4. Create new board: `$D compare --images "..." --output "$_DESIGN_DIR/design-board.html"`
-5. Reload the board in the user's browser (same tab):
-`curl -s -X POST http://127.0.0.1:PORT/api/reload -H 'Content-Type: application/json' -d '{"html":"$_DESIGN_DIR/design-board.html"}'`
+5. Reload the board in the user's browser (same tab) — the URL is per-board
+under daemon mode, so use `<BOARD_URL>` (from the `BOARD_URL:` stderr
+line) as the base:
+`curl -s -X POST "${BOARD_URL}api/reload" -H 'Content-Type: application/json' -d '{"html":"$_DESIGN_DIR/design-board.html"}'`
+Under `--no-daemon` the reload endpoint is `/api/reload` at the legacy
+port; this path only matters if the caller explicitly opted out of the
+daemon.
6. The board auto-refreshes. **AskUserQuestion again** with the same board URL to
wait for the next round of feedback. Repeat until `feedback.json` appears.
