Commit ed1e4be

garrytan and claude authored
feat: gstack browser sidebar = interactive Claude Code REPL with live tab awareness (v1.14.0.0) (#1216)
* build: vendor xterm@5 for the Terminal sidebar tab

Adds xterm@5 + xterm-addon-fit as devDependencies and a `vendor:xterm` build step that copies the assets into `extension/lib/` at build time. The vendored files are .gitignored so the npm version stays the source of truth.

xterm@5 is eval-free, so no MV3 CSP changes needed. No runtime callers yet — this just stages the assets.

* feat(server): add pty-session-cookie module for the Terminal tab

Mirrors `sse-session-cookie.ts` exactly. Mints short-lived 30-min HttpOnly cookies for authenticating the Terminal-tab WebSocket upgrade against the terminal-agent. Same TTL, same opportunistic-pruning shape, same "scoped tokens never valid as root" invariant.

Two registries instead of one because the cookie names are different (`gstack_sse` vs `gstack_pty`) and the token spaces must not overlap.

No callers yet — wired up in the next commit.

* feat(server): add terminal-agent.ts (PTY for the Terminal sidebar tab)

Translates phoenix gbrowser's Go PTY (cmd/gbd/terminal.go) into a Bun non-compiled process. Lives separately from `sidebar-agent.ts` so a WS-framing or PTY-cleanup bug can't take down the chat path (codex outside-voice review caught the coupling risk).

Architecture:
- Bun.serve on 127.0.0.1:0 (never tunneled).
- POST /internal/grant accepts cookie tokens from the parent server over loopback, authenticated with a per-boot internal token.
- GET /ws upgrades require BOTH (a) Origin: chrome-extension://<id> and (b) the gstack_pty cookie minted by /pty-session. Either gate alone is insufficient (CSWSH defense + auth defense).
- Lazy spawn: claude PTY is not started until the WS receives its first data frame. Idle sidebar opens cost nothing.
- Bun PTY API: `terminal: { rows, cols, data(t, chunk) }` — verified at impl time on Bun 1.3.10. proc.terminal.write() for input, proc.terminal.resize() for resize, proc.kill() + 3s SIGKILL fallback on close.
- process.on('uncaughtException'|'unhandledRejection') handlers so a framing bug logs but doesn't kill the listener loop.

Test-only `BROWSE_TERMINAL_BINARY` env override lets the integration tests spawn /bin/bash instead of requiring claude on every CI runner.

Not yet spawned by anything — wired in the next commit.

* feat(server): wire /pty-session route + spawn terminal-agent

Server-side glue connecting the Terminal sidebar tab to the new terminal-agent process.

server.ts:
- New POST /pty-session route. Validates AUTH_TOKEN, mints a gstack_pty HttpOnly cookie via pty-session-cookie.ts, posts the cookie value to the agent's loopback /internal/grant. Returns the terminalPort + Set-Cookie to the extension.
- /health response gains `terminalPort` (just the port number — never a shell token). Tokens flow via the cookie path, never /health, because /health already surfaces AUTH_TOKEN to localhost callers in headed mode (that's a separate v1.1+ TODO).
- /pty-session and /terminal/* are deliberately NOT added to TUNNEL_PATHS, so the dual-listener tunnel surface 404s by default-deny.
- Shutdown path now also pkills terminal-agent and unlinks its state files (terminal-port + terminal-internal-token) so a reconnect doesn't try to hit a dead port.

cli.ts:
- After spawning sidebar-agent.ts, also spawn terminal-agent.ts. Same pattern: pkill old instances, Bun.spawn(['bun', 'run', script]) with BROWSE_STATE_FILE + BROWSE_SERVER_PORT env. Non-fatal if the spawn fails — chat still works without the terminal agent.

* feat(extension): Terminal as default sidebar tab

Adds a primary tab bar (Terminal | Chat) above the existing tab-content panes. Terminal is the default-active tab; clicking Chat returns to the existing claude -p one-shot flow which is preserved verbatim.

manifest.json: adds ws://127.0.0.1:*/ to host_permissions so MV3 doesn't block the WebSocket upgrade.

sidepanel.html: new primary-tabs nav, new #tab-terminal pane with a "Press any key to start Claude Code" bootstrap card, claude-not-found install card, xterm mount point, and "session ended" restart UI. Loads xterm.js + xterm-addon-fit + sidepanel-terminal.js. tab-chat is no longer the .active default.

sidepanel.js: new activePrimaryPaneId() helper that reads which primary tab is selected. Debug-close paths now route back to whichever primary pane is active (was hardcoded to tab-chat). Primary-tab click handler toggles .active classes and aria-selected. window.gstackServerPort and window.gstackAuthToken exposed so sidepanel-terminal.js can build the /pty-session POST and the WS URL.

sidepanel-terminal.js (new): xterm.js lifecycle. Lazy-spawn — first keystroke fires POST /pty-session, then opens ws://127.0.0.1:<terminalPort>/ws. Origin + cookie are set automatically by the browser. Resize observer sends {type:"resize"} text frames. ResizeObserver, tab-switch hooks, restart button, install-card retry. On WS close shows "Session ended, click to restart" — no auto-reconnect (codex outside-voice flagged that as session-burning).

sidepanel.css: primary-tabs bar + Terminal pane styling (full-height xterm container, install card, ended state).

* test: terminal-agent + cookie module + sidebar default-tab regression

Three new test files:

terminal-agent.test.ts (16 tests): pty-session-cookie mint/validate/revoke, Set-Cookie shape (HttpOnly + SameSite=Strict + Path=/, NO Secure since 127.0.0.1 over HTTP), source-level guards that /pty-session and /terminal/* are NOT in TUNNEL_PATHS, /health does NOT surface ptyToken or gstack_pty, terminal-agent binds 127.0.0.1, /ws upgrade enforces chrome-extension:// Origin AND gstack_pty cookie, lazy-spawn invariant (spawnClaude is called from message handler, not upgrade), uncaughtException/unhandledRejection handlers exist, SIGINT-then-SIGKILL cleanup.

terminal-agent-integration.test.ts (7 tests): spawns the agent as a real subprocess in a tmp state dir. Verifies /internal/grant accepts/rejects the loopback token, /ws gates (no Origin → 403, bad Origin → 403, no cookie → 401), real WebSocket round-trip with /bin/bash via the BROWSE_TERMINAL_BINARY override (write 'echo hello-pty-world\n', read it back), and resize message acceptance.

sidebar-tabs.test.ts (13 tests): structural regression suite locking the load-bearing invariants of the default-tab change — Terminal is .active, Chat is not, xterm assets are loaded, debug-close path no longer hardcodes tab-chat (uses activePrimaryPaneId), primary-tab click handler exists, chat surface is not accidentally deleted, terminal JS does NOT auto-reconnect on close, manifest declares ws:// + http:// localhost host permissions, no unsafe-eval.

Plan called for Playwright + extension regression; the codebase doesn't ship Playwright extension launcher infra, so we follow the existing extension-test pattern (source-level structural assertions). Same load-bearing intent — locks the invariants before they regress.

* docs: Terminal flow + threat model + v1.1 follow-ups

SIDEBAR_MESSAGE_FLOW.md: new "Terminal flow" section. Documents the WS upgrade path (/pty-session cookie mint → /ws Origin + cookie gate → lazy claude spawn), the dual-token model (AUTH_TOKEN for /pty-session, gstack_pty cookie for /ws, INTERNAL_TOKEN for server↔agent loopback), and the threat-model boundary — the Terminal tab bypasses the entire prompt-injection security stack on purpose; user keystrokes are the trust source. That trust assumption is load-bearing on three transport guarantees: local-only listener, Origin gate, cookie auth. Drop any one of those three and the tab becomes unsafe.

CLAUDE.md: extends the "Sidebar architecture" note to include terminal-agent.ts in the read-this-first list. Adds a "Terminal tab is its own process" note so a future contributor doesn't bolt PTY logic onto sidebar-agent.ts.

TODOS.md: three new follow-ups under a new "Sidebar Terminal" section:
- v1.1: PTY session survives sidebar reload (Issue 1C deferred).
- v1.1+: audit /health AUTH_TOKEN distribution (codex finding #2 — a pre-existing soft leak that cc-pty-import sidesteps but doesn't fix).
- v1.1+: apply terminal-agent's process.on exception handlers to sidebar-agent.ts (codex finding #4 — chat path has no fatal handlers).

* feat(extension): Terminal-only sidebar — auth fix, UX polish, chat rip

The chat queue path is gone. The Chrome side panel is now just an interactive claude PTY in xterm.js. Activity / Refs / Inspector still exist behind the `debug` toggle in the footer.

Three threads of change, all from dogfood iteration on top of cc-pty-import:

1. fix(server): cross-port WS auth via Sec-WebSocket-Protocol
   - Browsers can't set Authorization on a WebSocket upgrade. We had been minting an HttpOnly gstack_pty cookie via /pty-session, but SameSite=Strict cookies don't survive the cross-port jump from server.ts:34567 to the agent's random port from a chrome-extension origin. The WS opened then immediately closed → "Session ended."
   - /pty-session now also returns ptySessionToken in the JSON body.
   - Extension calls `new WebSocket(url, [`gstack-pty.<token>`])`. Browser sends Sec-WebSocket-Protocol on the upgrade.
   - Agent reads the protocol header, validates against validTokens, and MUST echo the protocol back (Chromium closes the connection immediately if a server doesn't pick one of the offered protocols).
   - Cookie path is kept as a fallback for non-browser callers (curl, integration tests).
   - New integration test exercises the full protocol-auth round-trip via raw fetch+Upgrade so a future regression of this exact class fails in CI.

2. fix(extension): UX polish on the Terminal pane
   - Eager auto-connect when the sidebar opens — no "Press any key to start" friction every reload.
   - Always-visible ↻ Restart button in the terminal toolbar (not gated on the ENDED state) so the user can force a fresh claude mid-session.
   - MutationObserver on #tab-terminal's class attribute drives a fitAddon.fit() + term.refresh() when the pane becomes visible again — xterm doesn't auto-redraw after display:none → display:flex.

3. feat(extension): rip the chat tab + sidebar-agent.ts
   - Sidebar is Terminal-only. No more Terminal | Chat primary nav.
   - sidebar-agent.ts deleted. /sidebar-command, /sidebar-chat, /sidebar-agent/event, /sidebar-tabs* and friends all deleted.
   - The pickSidebarModel router (sonnet vs opus) is gone — the live PTY uses whatever model the user's `claude` CLI is configured with.
   - Quick-actions (🧹 Cleanup / 📸 Screenshot / 🍪 Cookies) survive in the Terminal toolbar. Cleanup now injects its prompt into the live PTY via window.gstackInjectToTerminal — no more /sidebar-command POST. The Inspector "Send to Code" action uses the same injection path.
   - clear-chat button removed from the footer.
   - sidepanel.js shed ~900 lines of chat polling, optimistic UI, stop-agent, etc.

Net diff: -3.4k lines across 16 files. CLAUDE.md, TODOS.md, and docs/designs/SIDEBAR_MESSAGE_FLOW.md rewritten to match. The sidebar regression test (browse/test/sidebar-tabs.test.ts) is rewritten as 27 structural assertions locking the new layout — Terminal sole pane, no chat input, quick-actions in toolbar, eager-connect, MutationObserver repaint, restart helper.

* feat: live tab awareness for the Terminal pane

claude in the PTY now has continuous tab-aware context. Three pieces:

1. Live state files. background.js listens to chrome.tabs.onActivated / onCreated / onRemoved / onUpdated (throttled to URL/title/status==complete so loading spinners don't spam) and pushes a snapshot. The sidepanel relays it as a custom event; sidepanel-terminal.js sends {type:"tabState"} text frames over the live PTY WebSocket. terminal-agent.ts writes:

   <stateDir>/tabs.json        all open tabs (id, url, title, active, pinned, audible, windowId)
   <stateDir>/active-tab.json  current active tab (skips chrome:// and chrome-extension:// internal pages)

   Atomic write via tmp + rename so claude never reads a half-written document. A fresh snapshot is pushed on WS open so the files exist by the time claude finishes booting.

2. New $B tab-each <command> [args...] meta-command. Fans out a single command across every open tab, returns {command, args, total, results: [{tabId, url, title, status, output}]}. Skips chrome:// pages; restores the originally active tab in a finally block (so a mid-batch error doesn't leave the user looking at a different tab); uses bringToFront: false so the OS window doesn't jump on every fanout. Scope-checks the inner command BEFORE the loop.

3. --append-system-prompt hint at spawn time. Claude is told about both the state files and the $B tab-each command up front, so it doesn't have to discover the surface by trial. Passed via the --append-system-prompt CLI flag, NOT as a leading PTY write — the hint stays out of the visible transcript.

Tests:
- browse/test/tab-each.test.ts (new) — registration + source-level invariants (scope check before loop, finally-restore, bringToFront:false, chrome:// skip) + behavior tests with a mock BrowserManager that verify iteration order, JSON shape, error handling, and active-tab restore.
- browse/test/terminal-agent.test.ts — three new assertions for tabState handler shape, atomic-write pattern, and the --append-system-prompt wiring at spawn.

Verified live: opened 5 tabs, ran $B tab-each url against the live server, got per-tab JSON results back, original active tab restored without OS focus stealing.

* chore: drop sidebar-agent test refs after chat rip

Five test files / describe blocks targeted the deleted chat path:
- browse/test/security-e2e-fullstack.test.ts (full-stack chat-pipeline E2E with mock claude — whole file gone)
- browse/test/security-review-fullstack.test.ts (review-flow E2E with real classifier — whole file gone)
- browse/test/security-review-sidepanel-e2e.test.ts (Playwright E2E for the security event banner that was ripped from sidepanel.html)
- browse/test/security-audit-r2.test.ts (5 describe blocks: agent queue permissions, isValidQueueEntry stateFile traversal, loadSession session-ID validation, switchChatTab DocumentFragment, pollChat reentrancy guard, /sidebar-tabs URL sanitization, sidebar-agent SIGTERM→SIGKILL escalation, AGENT_SRC top-level read converted to graceful fallback)
- browse/test/security-adversarial-fixes.test.ts (canary stream-chunk split detection on detectCanaryLeak; one tool-output test on sidebar-agent)
- test/skill-validation.test.ts (sidebar agent #584 describe block)

These all assumed sidebar-agent.ts existed and tested chat-queue plumbing, chat-tab DOM round-trip, chat-polling reentrancy, or per-message classifier canary detection. With the live PTY there is no chat queue, no chat tab, no LLM stream to canary-scan, and no per-message subprocess. The Terminal pane's invariants are covered by the new browse/test/sidebar-tabs.test.ts (27 structural assertions), browse/test/terminal-agent.test.ts, and browse/test/terminal-agent-integration.test.ts.

bun test → exit 0, 0 failures.

* chore: bump version and changelog (v1.14.0.0)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(extension): xterm fills the full Terminal panel height

The Terminal pane only rendered into the top portion of the panel — most of the panel below the prompt was an empty black gap. Three layered issues, all about xterm.js measuring dimensions during a layout state that wasn't ready yet:

1. order-of-operations in connect(): ensureXterm() ran BEFORE setState(LIVE), so term.open() measured els.mount while it was still display:none. xterm caches a 0-size viewport synchronously inside open() and never auto-recovers when the container goes visible. Flipped: setState(LIVE) → ensureXterm.

2. first fit() ran synchronously before the browser had applied the .active class transition. Wrapped in requestAnimationFrame so layout has settled before fit() reads clientHeight.

3. CSS flex-overflow trap: .terminal-mount has flex:1 inside the flex-column #tab-terminal, but .tab-content's `overflow-y: auto` and the lack of `min-height: 0` on .terminal-mount meant the item couldn't shrink below content size. flex:1 then refused to expand into available space and xterm rendered into whatever its initial 2x2 measurement happened to be.

Fixes:
- extension/sidepanel-terminal.js: reorder + RAF fit
- extension/sidepanel.css: .terminal-mount gets `flex: 1 1 0` + `min-height: 0` + `position: relative`. #tab-terminal overrides .tab-content's `overflow-y: auto` to `overflow: hidden` (xterm has its own viewport scroll; the parent shouldn't compete) and explicitly re-declares `display: flex; flex-direction: column` for #tab-terminal.active.

bun test browse/test/sidebar-tabs.test.ts → 27/27 pass. Manually verified: side panel opens → Terminal fills full panel height, xterm scrollback works, debug-tab toggle still repaints correctly.

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 23c4d7b commit ed1e4be
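
Reviewer sketch (not part of the commit): the commit message describes `pty-session-cookie.ts` as minting 30-minute tokens with opportunistic pruning, kept in a registry separate from the SSE one. The block below is a minimal illustration of that shape under stated assumptions. The class name, method names, and the injected `makeToken` / `now` parameters are hypothetical and are not taken from the actual module.

```typescript
// Illustrative only: names here are assumptions, not the exports of
// pty-session-cookie.ts. The 30-minute TTL comes from the commit message.
const TTL_MS = 30 * 60 * 1000;

class SessionTokenRegistry {
  private readonly expiries = new Map<string, number>();
  private readonly makeToken: () => string;
  private readonly now: () => number;

  // makeToken should wrap a CSPRNG in real code; now is injectable for tests.
  constructor(makeToken: () => string, now: () => number = Date.now) {
    this.makeToken = makeToken;
    this.now = now;
  }

  mint(): string {
    this.prune(); // opportunistic pruning: no timer, cleanup rides on mint()
    const token = this.makeToken();
    this.expiries.set(token, this.now() + TTL_MS);
    return token;
  }

  validate(token: string): boolean {
    const expiresAt = this.expiries.get(token);
    if (expiresAt === undefined) return false;
    if (expiresAt <= this.now()) {
      this.expiries.delete(token); // expired tokens are dropped on sight
      return false;
    }
    return true;
  }

  revoke(token: string): void {
    this.expiries.delete(token);
  }

  get size(): number {
    return this.expiries.size;
  }

  private prune(): void {
    const now = this.now();
    this.expiries.forEach((expiresAt, token) => {
      if (expiresAt <= now) this.expiries.delete(token);
    });
  }
}
```

Keeping one registry instance per cookie name is one way to get the "token spaces must not overlap" property: a token minted by the PTY registry is simply unknown to the SSE registry.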

35 files changed

Lines changed: 3025 additions & 5139 deletions

.gitignore

Lines changed: 4 additions & 0 deletions
@@ -20,6 +20,10 @@ bin/gstack-global-discover
 .gbrain/
 .context/
 extension/.auth.json
+# xterm assets are vendored from npm at build time; not source-of-truth.
+extension/lib/xterm.js
+extension/lib/xterm.css
+extension/lib/xterm-addon-fit.js
 .gstack-worktrees/
 /tmp/
 *.log

CHANGELOG.md

Lines changed: 53 additions & 0 deletions
@@ -1,5 +1,58 @@
 # Changelog
 
+## [1.14.0.0] - 2026-04-25
+
+## **The gstack browser sidebar is now an interactive Claude Code REPL with live tab awareness.**
+
+Open the side panel and Claude Code is right there in a real terminal. Type, watch the agent work, switch browser tabs and Claude sees the change. The old one-shot chat queue is gone. Two-way conversation, slash commands, `/resume`, ANSI colors, all of it. Plus a `$B tab-each` command that fans out a single browse command across every open tab and returns per-tab JSON results.
+
+### The numbers that matter
+
+| Metric | Before | After | Δ |
+|---|---|---|---|
+| Sidebar surfaces | Chat (one-shot `claude -p`) + 3 debug | Terminal (live PTY) + 3 debug | -1 surface, +interactive |
+| Subprocesses spawned per session | Many (one per chat message) | One (PTY claude, lazy-spawned) | -N |
+| Lines in `extension/sidepanel.js` | 1969 | 1042 | -47% |
+| Total diff || 27 files, +2875 / -3885 | -1010 net |
+| New unit + integration + regression tests | 0 | 56+ | +56 |
+| Live `tabs.json` push latency | n/a (no live state) | <50ms after `chrome.tabs` event | new capability |
+
+### What this means for builders
+
+Open the sidebar, type. Real PTY means slash commands, `/resume`, real ANSI rendering, real claude process lifecycle. Switch browser tabs while Claude is running and `<stateDir>/tabs.json` + `active-tab.json` update in place — Claude reads them, no need to ask `$B tabs`. Need to do the same thing on every tab? `$B tab-each <command>` returns a JSON array, original active tab restored when done, no OS focus stealing.
+
+The old chat queue is gone. `sidebar-agent.ts`, `/sidebar-command`, `/sidebar-chat`, `/sidebar-agent/event` all deleted. The Cleanup / Screenshot / Cookies toolbar buttons survive in the Terminal pane — Cleanup pipes its prompt straight into the live PTY via `window.gstackInjectToTerminal()` instead of spawning yet another `claude -p`.
+
+### Itemized changes
+
+#### Added
+- **Interactive Terminal sidebar tab.** xterm.js + a non-compiled `terminal-agent.ts` Bun process that spawns claude with `Bun.spawn({terminal: {rows, cols, data}})`. Auto-connects when the side panel opens, no keypress needed.
+- **`$B tab-each <command>`** — fan-out helper for multi-tab work. Returns `{command, args, total, results: [{tabId, url, title, status, output}]}`. Skips chrome:// pages, scope-checks the inner command before iterating, restores the original active tab in a `finally` block, never pulls focus away from the user's foreground app.
+- **Live tab state files.** `<stateDir>/tabs.json` (full list with id, url, title, active, pinned, audible, windowId) and `<stateDir>/active-tab.json` (current active). Updated atomically on every `chrome.tabs` event (activated, created, removed, URL/title change). Claude reads on demand instead of running `$B tabs`.
+- **Tab-awareness system prompt** injected via `claude --append-system-prompt` at spawn so the model knows about the state files and the `$B tab-each` command without being told.
+- **Always-visible Restart button** in the Terminal toolbar. Force-restart claude any time, not just from the "session ended" state.
+
+#### Changed
+- **Sidebar is Terminal-only.** No more `Terminal | Chat` primary tab nav. Activity / Refs / Inspector still live behind the `debug` toggle in the footer. Quick-actions (🧹 Cleanup / 📸 Screenshot / 🍪 Cookies) moved into the Terminal toolbar.
+- **WebSocket auth uses `Sec-WebSocket-Protocol`** instead of cookies. Browsers can't set `Authorization` on WS upgrades, and `SameSite=Strict` cookies don't survive the cross-port jump from server.ts:34567 to the agent's random port from a chrome-extension origin. The token rides on `new WebSocket(url, [`gstack-pty.<token>`])` and the agent echoes the protocol back (Chromium closes connections that don't pick a protocol).
+- **Cleanup button now drives the live PTY.** Clicking "🧹 Cleanup" injects the cleanup prompt straight into claude via `window.gstackInjectToTerminal()`. The Inspector "Send to Code" action uses the same path. No more `/sidebar-command` POSTs.
+- **Repaint after debug-tab close.** xterm.js doesn't auto-redraw when its container flips from `display: none` back to `display: flex`. A MutationObserver on `#tab-terminal`'s class attribute now forces a `fitAddon.fit() + term.refresh() + resize` push when the pane becomes visible.
+
+#### Removed
+- **`browse/src/sidebar-agent.ts`** — the one-shot `claude -p` queue worker. ~900 lines.
+- **Server endpoints**: `/sidebar-command`, `/sidebar-chat[/clear]`, `/sidebar-agent/{event,kill,stop}`, `/sidebar-tabs[/switch]`, `/sidebar-session{,/new,/list}`, `/sidebar-queue/dismiss`. ~600 lines.
+- **Chat-related state** in server.ts: `ChatEntry`, `SidebarSession`, `TabAgentState`, `pickSidebarModel`, `addChatEntry`, `processAgentEvent`, `killAgent`, the agent-health watchdog, `chatBuffer`, the per-tab agent map.
+- **Chat UI in sidepanel.html**: primary-tab nav, `<main id="tab-chat">`, the chat input bar, the experimental "Browser co-pilot" banner, the security event banner, the `clear-chat` footer button.
+- **Five obsolete test files**: `sidebar-agent.test.ts`, `sidebar-agent-roundtrip.test.ts`, `security-e2e-fullstack.test.ts`, `security-review-fullstack.test.ts`, `security-review-sidepanel-e2e.test.ts`. Plus 5 chat-only describe blocks inside surviving security tests (loadSession session-ID validation, switchChatTab DocumentFragment, pollChat reentrancy, sidebar-tabs URL sanitization, agent queue security).
+
+#### For contributors
+- **`browse/src/pty-session-cookie.ts`** mirrors `sse-session-cookie.ts`. Same TTL, same opportunistic pruning, separate registry (PTY tokens must never be valid as SSE tokens or vice versa).
+- **`docs/designs/SIDEBAR_MESSAGE_FLOW.md`** rewritten around the Terminal flow: WebSocket upgrade, dual-token model (`AUTH_TOKEN` for `/pty-session`, `gstack-pty.<token>` for `/ws`, `INTERNAL_TOKEN` for server↔agent loopback), threat-model boundary (Terminal tab bypasses the prompt-injection stack on purpose; user keystrokes are the trust source).
+- **`browse/test/terminal-agent.test.ts`** (16 tests) + `terminal-agent-integration.test.ts` (real `/bin/bash` PTY round-trip, raw `Sec-WebSocket-Protocol` upgrade verification) + `tab-each.test.ts` (10 tests with mock `BrowserManager`) + `sidebar-tabs.test.ts` (27 structural assertions locking the chat-rip invariants).
+- **CLAUDE.md** updated with the dual-token model, the cookie-vs-protocol rationale, and the cross-pane injection pattern.
+- **`vendor:xterm`** build step copies `xterm@5.x` and `xterm-addon-fit` from `node_modules/` into `extension/lib/` at build time. xterm files are gitignored.
+- **TODOS.md** carries three v1.1+ follow-ups: PTY session survival across sidebar reload (Issue 1C deferred), `/health` `AUTH_TOKEN` distribution audit (codex finding, pre-existing soft leak), and dropping the now-dead `security-classifier.ts` ML pipeline.
+
 ## [1.13.0.0] - 2026-04-25
 
 ## **`/gstack-claude` gives non-Claude hosts a read-only outside voice.**
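
Reviewer sketch (not part of the commit): the changelog says the tab state files are "updated atomically", and the commit message names the technique as tmp + rename. The block below illustrates that pattern with Node-compatible `fs` calls. The helper name `writeJsonAtomic`, the temp-file suffix, and the throwaway directory standing in for `<stateDir>` are assumptions made for the example; the real `terminal-agent.ts` may differ.

```typescript
import { mkdtempSync, readdirSync, readFileSync, renameSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Field list taken from the changelog entry for tabs.json.
interface TabSnapshot {
  id: number;
  url: string;
  title: string;
  active: boolean;
  pinned: boolean;
  audible: boolean;
  windowId: number;
}

let tmpCounter = 0;

// Hypothetical helper: write the full document to a sibling temp file, then
// rename it over the target. A reader sees either the old file or the new
// one, never a partially written document.
function writeJsonAtomic(filePath: string, value: unknown): void {
  const tmpPath = `${filePath}.tmp-${tmpCounter++}`;
  writeFileSync(tmpPath, JSON.stringify(value, null, 2));
  renameSync(tmpPath, filePath);
}

// Demo in a throwaway directory standing in for <stateDir>.
const stateDir = mkdtempSync(join(tmpdir(), "gstack-state-"));
const snapshot: TabSnapshot[] = [
  { id: 7, url: "https://example.com/", title: "Example", active: true, pinned: false, audible: false, windowId: 1 },
];
writeJsonAtomic(join(stateDir, "tabs.json"), snapshot);

const roundTripped: TabSnapshot[] = JSON.parse(readFileSync(join(stateDir, "tabs.json"), "utf8"));
const leftovers = readdirSync(stateDir);
```

The temp file lives next to the target so the rename stays on one filesystem, which is what makes the replace a single step on POSIX systems.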

CLAUDE.md

Lines changed: 29 additions & 6 deletions
@@ -225,12 +225,35 @@ When you need to interact with a browser (QA, dogfooding, cookie setup), use the
 project uses.
 
 **Sidebar architecture:** Before modifying `sidepanel.js`, `background.js`,
-`content.js`, `sidebar-agent.ts`, or sidebar-related server endpoints, read
-`docs/designs/SIDEBAR_MESSAGE_FLOW.md`. It documents the full initialization
-timeline, message flow, auth token chain, tab concurrency model, and known
-failure modes. The sidebar spans 5 files across 2 codebases (extension + server)
-with non-obvious ordering dependencies. The doc exists to prevent the kind of
-silent failures that come from not understanding the cross-component flow.
+`content.js`, `terminal-agent.ts`, or sidebar-related server endpoints,
+read `docs/designs/SIDEBAR_MESSAGE_FLOW.md`. The sidebar has one primary
+surface — the **Terminal** pane (interactive `claude` PTY) — with
+Activity / Refs / Inspector as debug overlays behind the footer's
+`debug` toggle. The chat queue path was ripped once the PTY proved out;
+`sidebar-agent.ts` and the `/sidebar-command` / `/sidebar-chat` /
+`/sidebar-agent/event` endpoints are gone. The doc covers the WS auth
+flow, dual-token model, and threat-model boundary — silent failures
+here usually trace to not understanding the cross-component flow.
+
+**WebSocket auth uses Sec-WebSocket-Protocol, not cookies.** Browsers
+can't set `Authorization` on a WebSocket upgrade, but they CAN set
+`Sec-WebSocket-Protocol` via `new WebSocket(url, [token])`. The agent
+reads it, validates against `validTokens`, and MUST echo the protocol
+back in the upgrade response — without the echo, Chromium closes the
+connection immediately. `Set-Cookie: gstack_pty=...` is kept as a
+fallback for non-browser callers (the cross-port `SameSite=Strict`
+cookie path doesn't survive from a chrome-extension origin).
+
+**Cross-pane PTY injection.** The toolbar's Cleanup button and the
+Inspector's "Send to Code" action both pipe text into the live claude
+PTY via `window.gstackInjectToTerminal(text)`, exposed by
+`sidepanel-terminal.js`. No `/sidebar-command` POST — the live REPL is
+the only execution surface in the sidebar now.
+
+**`/health` MUST NOT surface any shell-grant token.** It already leaks
+`AUTH_TOKEN` to localhost callers in headed mode (a v1.1+ TODO). Don't
+make that worse by adding the PTY session token there. PTY auth flows
+through `POST /pty-session` only.
 
 **Transport-layer security** (v1.6.0.0+). When `pair-agent` starts an ngrok tunnel,
 the daemon binds two HTTP listeners: a local listener (127.0.0.1, full command
SKILL.md

Lines changed: 1 addition & 0 deletions
@@ -1035,6 +1035,7 @@ Refs are invalidated on navigation — run `snapshot` again after `goto`.
 | `closetab [id]` | Close tab |
 | `newtab [url] [--json]` | Open new tab. With --json, returns {"tabId":N,"url":...} for programmatic use (make-pdf). |
 | `tab <id>` | Switch to tab |
+| `tab-each <command> [args...]` | Run a command on every open tab. Returns JSON with per-tab results. |
 | `tabs` | List open tabs |
 
 ### Server
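
Reviewer sketch (not part of the commit): the `tab-each` row above is described elsewhere in this commit as scope-checking the inner command before the loop, skipping `chrome://` pages, switching tabs with `bringToFront: false`, and restoring the original active tab in a `finally` block. The block below illustrates that control flow against a hypothetical `TabDriver` interface. The interface, the `"ok"` / `"error"` status strings, and the meaning of `total` (count of visited tabs) are assumptions; the real command runs against the project's `BrowserManager`.

```typescript
// Illustrative only: TabDriver and the status values are assumptions.
interface Tab { id: number; url: string; title: string; active: boolean }

interface TabDriver {
  listTabs(): Promise<Tab[]>;
  switchTab(id: number, opts: { bringToFront: boolean }): Promise<void>;
  run(command: string, args: string[]): Promise<string>;
}

interface TabResult { tabId: number; url: string; title: string; status: "ok" | "error"; output: string }

interface TabEachOutput { command: string; args: string[]; total: number; results: TabResult[] }

async function tabEach(
  driver: TabDriver,
  allowedCommands: ReadonlySet<string>,
  command: string,
  args: string[],
): Promise<TabEachOutput> {
  // Scope check happens before any tab is touched.
  if (!allowedCommands.has(command)) throw new Error(`command not allowed: ${command}`);

  const tabs = await driver.listTabs();
  const original = tabs.find((t) => t.active);
  const targets = tabs.filter((t) => !t.url.startsWith("chrome://"));
  const results: TabResult[] = [];

  try {
    for (const tab of targets) {
      const base = { tabId: tab.id, url: tab.url, title: tab.title };
      try {
        await driver.switchTab(tab.id, { bringToFront: false });
        const output = await driver.run(command, args);
        results.push({ ...base, status: "ok", output });
      } catch (err) {
        // A failure on one tab is recorded and the fan-out continues.
        const message = err instanceof Error ? err.message : String(err);
        results.push({ ...base, status: "error", output: message });
      }
    }
  } finally {
    // Restore the user's tab even if something above threw.
    if (original) await driver.switchTab(original.id, { bringToFront: false });
  }

  return { command, args, total: results.length, results };
}
```

Recording per-tab errors instead of aborting is a design choice made for this sketch; the commit only states that error handling is covered by the behavior tests.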

TODOS.md

Lines changed: 52 additions & 0 deletions
@@ -1,5 +1,57 @@
 # TODOS
 
+## Sidebar Terminal (cc-pty-import follow-ups)
+
+### v1.1: PTY session survives sidebar reload
+
+**What:** Today the Terminal tab's PTY dies with the WebSocket — sidebar
+reload, side-panel close, even a quick navigate-away in another tab close
+the session. v1.1 should key the PTY on a tab/session id so a reload
+reattaches to the existing claude process and you keep `/resume` history.
+
+**Why:** Mid-task resilience. When you've been pair-programming with claude
+for 20 minutes and an accidental Cmd-R blows it away, the cost is real.
+
+**Pros:** Better UX, fewer interrupted sessions. **Cons:** Session-tracking
+state, ghost-process risk, lifecycle bugs (when DOES the PTY actually go
+away?). v1 chose the simple "PTY dies with WS" model deliberately.
+
+**Context:** /plan-eng-review Issue 1C decision (cc-pty-import branch,
+2026-04-25). v1 ships with phoenix's lifecycle. **Depends on:**
+cc-pty-import landed.
+
+**Priority:** P2 (nice-to-have).
+**Effort:** M. Likely needs a per-tab session map keyed by chrome.tabs.id
+plus a TTL so abandoned PTYs eventually exit.
+
+---
+
+### v1.1+: Audit `/health` token distribution
+
+**What:** Codex's outside-voice review on cc-pty-import flagged that
+`/health` already surfaces `AUTH_TOKEN` to any localhost caller in headed
+mode (`server.ts:1657`). That's a pre-existing soft leak — anything
+running on localhost gets the root token by hitting `/health`.
+
+**Why:** cc-pty-import sidesteps it by NOT putting the PTY token there
+(uses an HttpOnly cookie path instead). But the underlying leak is still
+shippable surface. A second extension or a localhost web app could
+currently scrape `AUTH_TOKEN` and hit any browse-server endpoint.
+
+**Pros:** Closes a real privilege-escalation path on multi-extension
+machines. **Cons:** Either we tighten the gate (Origin must be OUR
+extension id, not just any chrome-extension://) or we move bootstrap
+discovery off `/health` entirely. Either has migration cost for tests
+and the existing extension.
+
+**Context:** codex finding #2 on cc-pty-import plan-eng review. Not in
+scope of that PR; deliberately deferred to keep PTY-import small.
+
+**Priority:** P2.
+**Effort:** M.
+
+---
+
 ## Testing
 
 ### Pre-existing test failures surfaced during v1.12.0.0 ship

VERSION

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-1.13.0.0
+1.14.0.0

browse/SKILL.md

Lines changed: 1 addition & 0 deletions
@@ -959,6 +959,7 @@ $B prettyscreenshot --cleanup --scroll-to ".pricing" --width 1440 ~/Desktop/hero
 | `closetab [id]` | Close tab |
 | `newtab [url] [--json]` | Open new tab. With --json, returns {"tabId":N,"url":...} for programmatic use (make-pdf). |
 | `tab <id>` | Switch to tab |
+| `tab-each <command> [args...]` | Run a command on every open tab. Returns JSON with per-tab results. |
 | `tabs` | List open tabs |
 
 ### Server

browse/src/cli.ts

Lines changed: 32 additions & 47 deletions
@@ -853,7 +853,7 @@ Refs: After 'snapshot', use @e1, @e2... as selectors:
     // Delete stale state file
     safeUnlinkQuiet(config.stateFile);

-    console.log('Launching headed Chromium with extension + sidebar agent...');
+    console.log('Launching headed Chromium with extension + terminal agent...');
     try {
       // Start server in headed mode with extension auto-loaded
       // Use a well-known port so the Chrome extension auto-connects
@@ -882,56 +882,41 @@ Refs: After 'snapshot', use @e1, @e2... as selectors:
       const status = await resp.text();
       console.log(`Connected to real Chrome\n${status}`);

-      // Auto-start sidebar agent
-      // __dirname is inside $bunfs in compiled binaries — resolve from execPath instead
-      let agentScript = path.resolve(__dirname, 'sidebar-agent.ts');
-      if (!fs.existsSync(agentScript)) {
-        agentScript = path.resolve(path.dirname(process.execPath), '..', 'src', 'sidebar-agent.ts');
+      // sidebar-agent.ts spawn was here. Ripped alongside the chat queue —
+      // the Terminal pane runs an interactive PTY now, no more one-shot
+      // claude -p subprocesses to multiplex.
+
+      // Auto-start terminal agent (non-compiled bun process). Owns the PTY
+      // WebSocket for the sidebar Terminal pane.
+      let termAgentScript = path.resolve(__dirname, 'terminal-agent.ts');
+      if (!fs.existsSync(termAgentScript)) {
+        termAgentScript = path.resolve(path.dirname(process.execPath), '..', 'src', 'terminal-agent.ts');
       }
       try {
-        if (!fs.existsSync(agentScript)) {
-          throw new Error(`sidebar-agent.ts not found at ${agentScript}`);
-        }
-        // Clear old agent queue
-        const agentQueue = path.join(process.env.HOME || '/tmp', '.gstack', 'sidebar-agent-queue.jsonl');
-        try {
-          fs.mkdirSync(path.dirname(agentQueue), { recursive: true, mode: 0o700 });
-          fs.writeFileSync(agentQueue, '', { mode: 0o600 });
-        } catch (err: any) {
-          if (err?.code !== 'EACCES') throw err;
-        }
-
-        // Resolve browse binary path the same way — execPath-relative
-        let browseBin = path.resolve(__dirname, '..', 'dist', 'browse');
-        if (!fs.existsSync(browseBin)) {
-          browseBin = process.execPath; // the compiled binary itself
-        }
-
-        // Kill any existing sidebar-agent processes before starting a new one.
-        // Old agents have stale auth tokens and will silently fail to relay events,
-        // causing the server to mark the agent as "hung".
-        try {
-          const { spawnSync } = require('child_process');
-          spawnSync('pkill', ['-f', 'sidebar-agent\\.ts'], { stdio: 'ignore', timeout: 3000 });
-        } catch (err: any) {
-          if (err?.code !== 'ENOENT') throw err;
+        if (fs.existsSync(termAgentScript)) {
+          // Kill old terminal-agents so a stale port file can't trick the
+          // server into routing /pty-session at a dead listener.
+          try {
+            const { spawnSync } = require('child_process');
+            spawnSync('pkill', ['-f', 'terminal-agent\\.ts'], { stdio: 'ignore', timeout: 3000 });
+          } catch (err: any) {
+            if (err?.code !== 'ENOENT') throw err;
+          }
+          const termProc = Bun.spawn(['bun', 'run', termAgentScript], {
+            cwd: config.projectDir,
+            env: {
+              ...process.env,
+              BROWSE_STATE_FILE: config.stateFile,
+              BROWSE_SERVER_PORT: String(newState.port),
+            },
+            stdio: ['ignore', 'ignore', 'ignore'],
+          });
+          termProc.unref();
+          console.log(`[browse] Terminal agent started (PID: ${termProc.pid})`);
         }
-
-        const agentProc = Bun.spawn(['bun', 'run', agentScript], {
-          cwd: config.projectDir,
-          env: {
-            ...process.env,
-            BROWSE_BIN: browseBin,
-            BROWSE_STATE_FILE: config.stateFile,
-            BROWSE_SERVER_PORT: String(newState.port),
-          },
-          stdio: ['ignore', 'ignore', 'ignore'],
-        });
-        agentProc.unref();
-        console.log(`[browse] Sidebar agent started (PID: ${agentProc.pid})`);
       } catch (err: any) {
-        console.error(`[browse] Sidebar agent failed to start: ${err.message}`);
-        console.error(`[browse] Run manually: bun run ${agentScript}`);
+        // Non-fatal: chat still works without the terminal agent.
+        console.error(`[browse] Terminal agent failed to start: ${err.message}`);
       }
     } catch (err: any) {
       console.error(`[browse] Connect failed: ${err.message}`);
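
The script lookup in the hunk above tries the module directory first and then falls back to a path relative to the executable, because `__dirname` does not point at the source tree in a compiled binary. A minimal sketch of that two-step lookup, pulled into a testable function, might look like the following. `resolveAgentScript` and its injected `exists` predicate are hypothetical and not part of the commit; returning `null` mirrors the new behaviour of skipping the spawn when the script is missing.

```typescript
import * as path from "node:path";

// Hypothetical helper, not part of the commit.
// `exists` is injected so the lookup can be exercised without touching the disk;
// in the real code this role is played by fs.existsSync.
function resolveAgentScript(
  moduleDir: string,
  execPath: string,
  scriptName: string,
  exists: (candidate: string) => boolean,
): string | null {
  // First choice: the script sits next to the running module (dev mode).
  const local = path.posix.resolve(moduleDir, scriptName);
  if (exists(local)) return local;

  // Fallback: resolve from the binary location, <execDir>/../src/<scriptName>.
  const fallback = path.posix.resolve(path.posix.dirname(execPath), "..", "src", scriptName);
  return exists(fallback) ? fallback : null;
}
```

The sketch uses `path.posix` so the results are the same on every platform; the commit itself uses the platform-default `path`.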

browse/src/commands.ts

Lines changed: 2 additions & 1 deletion
@@ -30,7 +30,7 @@ export const WRITE_COMMANDS = new Set([
 ]);

 export const META_COMMANDS = new Set([
-  'tabs', 'tab', 'newtab', 'closetab',
+  'tabs', 'tab', 'tab-each', 'newtab', 'closetab',
   'status', 'stop', 'restart',
   'screenshot', 'pdf', 'responsive',
   'chain', 'diff',
@@ -144,6 +144,7 @@ export const COMMAND_DESCRIPTIONS: Record<string, { category: string; descriptio
   'tab': { category: 'Tabs', description: 'Switch to tab', usage: 'tab <id>' },
   'newtab': { category: 'Tabs', description: 'Open new tab. With --json, returns {"tabId":N,"url":...} for programmatic use (make-pdf).', usage: 'newtab [url] [--json]' },
   'closetab':{ category: 'Tabs', description: 'Close tab', usage: 'closetab [id]' },
+  'tab-each':{ category: 'Tabs', description: 'Run a command on every open tab. Returns JSON with per-tab results.', usage: 'tab-each <command> [args...]' },
   // Server
   'status': { category: 'Server', description: 'Health check' },
   'stop': { category: 'Server', description: 'Shutdown server' },
