Commit 867e04e

sohmn and claude committed
Merge origin/main into feat/fanout-skill — resolve conflicts, bump to v1.53.0.0
Conflicts resolved:
- VERSION: 1.49.0.0 → 1.53.0.0 (queue-aware: 1.51 just landed, 1.52 claimed by PR garrytan#1741)
- package.json: synced to 1.53.0.0
- CHANGELOG.md: our entry re-versioned to 1.53.0.0 above main's 1.51.0.0 release-summary entry preserved bit-for-bit

Regenerated against merged state:
- fanout/SKILL.md (gen-skill-docs picked up main's preamble updates)
- gstack/llms.txt + scripts/proactive-suggestions.json (auto-regenerated)

Tests: fanout test 6/6 pass post-merge.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2 parents 807652c + 19770ea commit 867e04e

29 files changed

Lines changed: 2367 additions & 157 deletions

BROWSER.md

Lines changed: 1 addition & 0 deletions
@@ -317,6 +317,7 @@ from `snapshot`, or `@c` refs from `snapshot -C`. Full table:
 | `disconnect` | Close headed Chrome, return to headless |
 | `focus [@ref]` | Bring headed Chrome to foreground (macOS); `@ref` also scrolls into view |
 | `state save\|load <name>` | Save or load browser state (cookies + URLs) |
+| `memory [--json]` | Snapshot Bun heap + per-tab JS heap + Chromium process tree + bounded buffer sizes. Use `--json` for programmatic consumers; text mode renders sorted top-10 tabs with "and N more" tail. |
 
 ### Handoff

CHANGELOG.md

Lines changed: 54 additions & 1 deletion
@@ -1,6 +1,6 @@
11
# Changelog
22

3-
## [1.49.0.0] - 2026-05-27
3+
## [1.53.0.0] - 2026-05-27
44

55
**New skill: `/fanout` decomposes a finished design doc into 2-3 parallel agent tasks.**
66
**Worktree dispatch script generated alongside. Plan stops short of spawning agents in v0.**
@@ -33,6 +33,59 @@ If your team or your single instance of Claude Code is sitting on a finished des
3333
- Design doc at [`docs/designs/FANOUT.md`](docs/designs/FANOUT.md) documents the 4-layer slab detection heuristic, Slab 0 promotion logic, conflict resolution rules, and edge cases.
3434
- No new infrastructure: skill is auto-discovered by `setup` via the existing top-level-directory glob at [setup:620-633](setup).
3535

36+
## [1.51.0.0] - 2026-05-27
37+
38+
## **Long-running browser sessions hold flat RSS on the Bun side. `$B memory` gives every future OOM receipts instead of a screenshot.** Four CDP-resource leak classes closed and pinned with tripwires; a structured diagnostic surfaces Bun heap + per-tab JS heap + Chromium process tree + bounded buffer sizes in real time.
39+
40+
This release closes four leak classes in the browse server that compounded silently across long sidebar sessions: response-body materialization in the requestfinished listener (multi-GB/hour Buffer churn on media-heavy pages), three undetached CDP session call sites (cdp-bridge, write-commands archive, cdp-inspector), an unbounded modificationHistory array in the CSS inspector, and SSE subscriber cleanup that only fired on the abort edge — TCP-died-without-abort cases (Chromium MV3 service-worker suspend, intermediate proxy half-close) left subscribers in the Set forever holding the controller and any queued bytes. All four have invariant tests; a static-grep tripwire fails CI if a future refactor reintroduces direct `newCDPSession(...)` calls outside the helper module.
41+
42+
Alongside the fixes, `$B memory` and `/memory` ship the diagnostic the original 160 GB OOM investigation was missing: Bun RSS + heap breakdown, per-tab JS heap via CDP `Performance.getMetrics`, Chromium process tree via `SystemInfo.getProcessInfo` (PID + type + CPU), and the bounded buffer sizes (modificationHistory, activity subscribers, inspector subscribers, console/network/dialog buffers, capture buffer bytes). The sidebar footer polls `/memory` every 30s with adaptive backoff (drops to 5min if response time exceeds 2s), and a tab-count guardrail fires soft-warn at 50 / hard-warn at 200 with a top-5-by-RAM toast offering one-click close. Single-tab JS heap above 4 GB triggers an immediate toast, catching the WebGL/video runaway case where one tab balloons without the count ever reaching 200.
43+
44+
### The numbers that matter
45+
46+
Source: this branch's 16 commits + the post-merge audit reports. Net diff: 23 files changed, +2251 / -143 = 2394 LOC across browse server (TypeScript), gstack extension (JS/HTML/CSS), and tests.
47+
48+
| Capability | Before this PR | After this PR |
49+
|---|---|---|
50+
| `requestfinished` body handling | `await res.body()` on every response, allocates full body Buffer for one `.length` read | `req.sizes()` reads structured byte count from `Network.loadingFinished`, zero body materialization, accurate for chunked / gzip / streaming responses |
51+
| CDP session lifecycle (3 sites) | direct `newCDPSession`, detach missing or success-path-only | `withCdpSession` (try/finally detach) + `getOrCreateCdpSession` (cached + close-detach) helpers, all 3 sites migrated, static-grep tripwire prevents regression |
52+
| modificationHistory in CSS inspector | unbounded array, grew for every `$B css` edit across the session | bounded FIFO cap 200, evicted-count surfaced in the undo error so the user knows why their target index is gone |
53+
| SSE subscriber cleanup | abort-edge only; TCP-died-without-abort leaked subscriber + controller + queued bytes until process exit | `createSseEndpoint` helper with cleanup on abort + enqueue-throw + heartbeat-throw, idempotent (any edge fires once) |
54+
| Tab-count visibility | none — user could accumulate hundreds of tabs without warning | soft warn at 50 (activity entry), action toast at 200 (top 5 by RAM + Close-selected + Snooze), single-tab >4 GB triggers immediate toast |
55+
| Diagnostic command | not available | `$B memory` (text + `--json`), `/memory` endpoint (SSE-session-cookie gated), sidebar footer with adaptive backoff |
56+
| Net change in `server.ts` (SSE refactor) | 132 lines of inline ReadableStream wiring across two endpoints | 23 lines, both endpoints route through one helper |
57+
| Test pins for the leak class | none specific | 6 new test files, 45 new tests; static-grep tripwire fails CI on regression |
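The tab-count row above can be read as a small decision function. The following is a minimal sketch, not the code in `extension/sidepanel.js`: the type names, the function name, the choice of `>=` at the 50 and 200 boundaries, and the binary definition of a gigabyte are all assumptions; only the 50 / 200 / 4 GB thresholds and the top-5-by-RAM list come from the changelog.

```typescript
// Hypothetical types; the real snapshot shape lives in browse/src/memory-snapshot.ts.
interface TabHeap {
  id: number;
  title: string;
  jsHeapBytes: number;
}

type GuardrailAction =
  | { kind: "none" }
  | { kind: "soft-warn"; tabCount: number }
  | { kind: "toast"; reason: "tab-count" | "single-tab-heap"; top: TabHeap[] };

const GB = 1024 ** 3; // assumption: binary gigabytes
const SOFT_WARN_TABS = 50;
const TOAST_TABS = 200;
const SINGLE_TAB_HEAP_BYTES = 4 * GB;

function evaluateGuardrail(tabs: TabHeap[]): GuardrailAction {
  // Sort a copy so the caller's array is left untouched.
  const byHeap = [...tabs].sort((a, b) => b.jsHeapBytes - a.jsHeapBytes);
  const top = byHeap.slice(0, 5);
  const heaviest = byHeap[0];

  // A single runaway tab wins over the count checks, so it fires even at low tab counts.
  if (heaviest !== undefined && heaviest.jsHeapBytes > SINGLE_TAB_HEAP_BYTES) {
    return { kind: "toast", reason: "single-tab-heap", top };
  }
  if (tabs.length >= TOAST_TABS) {
    return { kind: "toast", reason: "tab-count", top };
  }
  if (tabs.length >= SOFT_WARN_TABS) {
    return { kind: "soft-warn", tabCount: tabs.length };
  }
  return { kind: "none" };
}
```

The snooze behavior described in the itemized changes (bumping thresholds after a dismissal) is left out here; it would amount to passing the three thresholds in as parameters instead of constants.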
58+
59+
### What this means for builders
60+
61+
The next time you leave a gbrowser session running for days, the Bun side holds its RSS flat instead of churning on per-response Buffer allocations. If a tab does go rogue, the sidebar footer shows you in real time — `RSS: 5.6 GB · 12 tabs`, color-coded — and a 200-tab toast surfaces the top RAM consumers with one-click close before you hit the OS OOM killer. If the next OOM still fires, `$B memory` is there to give it receipts instead of theory: Activity Monitor says 160 GB; the diagnostic tells you which process tree, which tabs, and which in-memory structures are holding it. Every code path the diagnostic measures is also bounded — modificationHistory at 200, console/network/dialog buffers at 50K via the existing CircularBuffer, SSE subscribers via the new cleanup contract — so the bookkeeping itself can't leak.
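As a rough illustration of the footer behavior described above, the sketch below combines the color thresholds and the adaptive poll interval listed under "Itemized changes". All function names are hypothetical, the one-decimal GB formatting simply mirrors the `RSS: 5.6 GB · 12 tabs` example, and the inclusive `>=` comparisons are an assumption.

```typescript
type Severity = "normal" | "orange" | "red";

const GB = 1024 ** 3; // assumption: binary gigabytes

// Orange at 2 GB Bun RSS or 50 tabs, red at 8 GB or 200 tabs.
function footerSeverity(rssBytes: number, tabCount: number): Severity {
  if (rssBytes >= 8 * GB || tabCount >= 200) return "red";
  if (rssBytes >= 2 * GB || tabCount >= 50) return "orange";
  return "normal";
}

// Renders the readout in the same shape as the example in the changelog.
function footerText(rssBytes: number, tabCount: number): string {
  return `RSS: ${(rssBytes / GB).toFixed(1)} GB · ${tabCount} tabs`;
}

// Poll every 30s, backing off to 5 minutes when the last /memory response took over 2s.
function nextPollDelayMs(lastResponseMs: number): number {
  return lastResponseMs > 2_000 ? 5 * 60_000 : 30_000;
}
```

A poll loop would call `nextPollDelayMs` with the measured round-trip time of each `/memory` request and schedule the next request with that delay.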
62+
63+
### Itemized changes
64+
65+
#### Added
66+
- **`$B memory` command** in `browse/src/memory-command.ts` — text mode with sorted top-10 tabs + "and N more" tail; `--json` mode for programmatic consumers and the sidebar footer poll.
67+
- **`/memory` HTTP endpoint** in `browse/src/server.ts` — same SSE-session-cookie auth model as `/activity/stream`. Deliberately NOT extending `/health` (which already leaks AUTH_TOKEN in headed mode per TODOS.md "Audit /health token distribution").
68+
- **`BrowserManager.getMemorySnapshot()`** — collects Bun process memory + per-tab JS heap via `Performance.getMetrics` (lazy per tracked page, swallows target-died errors) + Chromium process tree via `Browser.newBrowserCDPSession()` + `SystemInfo.getProcessInfo`.
69+
- **`browse/src/memory-snapshot.ts`** — shared types (`MemorySnapshot`, `MemoryTabSnapshot`, `MemoryProcess`, `MemoryStructureStats`) plus `formatBytes()` renderer (4 tiers, 2 decimals at GB).
70+
- **`withCdpSession(page, fn)`** and **`getOrCreateCdpSession(page, cache)`** in `browse/src/cdp-bridge.ts` — lifecycle helpers for one-shot and cached CDP work. Every direct `newCDPSession` call site now routes through one of them.
71+
- **`createSseEndpoint(req, config)`** in `browse/src/sse-helpers.ts` — owns the SSE cleanup contract (abort + enqueue-throw + heartbeat-throw, all idempotent). Built-in lone-surrogate sanitization on every JSON.stringify.
72+
- **Sidebar footer RSS readout** in `extension/sidepanel.{html,js,css}` — polls `/memory` every 30s with 5-minute backoff if response time exceeds 2s. Color-coded thresholds: orange at 2 GB Bun RSS or 50 tabs, red at 8 GB or 200 tabs.
73+
- **Tab guardrail UX** in `extension/sidepanel.js` — top-5-by-RAM toast at 200 tabs OR any single tab over 4 GB JS heap, with checkboxes + Close-selected (via `$B closetab`) + Snooze persisted in `chrome.storage.session`. Snooze bumps the thresholds so the toast stays hidden until the user accumulates more tabs or one tab grows another 2 GB.
74+
- **Static-grep tripwire** (`browse/test/cdp-session-cleanup.test.ts`) — fails CI if any source file outside `cdp-bridge.ts` calls `newCDPSession(...)` directly.
75+
- **45 new tests across 6 files** pinning the leak-fix invariants: CDP session lifecycle (8), SSE cleanup contract (6), modificationHistory cap + evicted-aware error (7), tab guardrail fires-once + re-arms (6), body-materialization reproducer (1), `$B memory` formatter + byte renderer + JSON entry (17).
76+
- **4 follow-up entries in `TODOS.md`** (P2: MV3 SW memory profile, P2: native + GPU memory breakdown, P3: single-context CDP listener via `Target.setAutoAttach`, P3: real-Chromium peak-RSS reproducer for periodic tier).
77+
78+
#### Changed
79+
- **`wirePageEvents.requestfinished` no longer materializes response bodies.** Pre-fix: `await res.body()` allocated a Bun `Buffer` of the full response on every fetch just to read `.length`. Post-fix: `req.sizes()` pulls the structured byte count from `Network.loadingFinished` without body fetch. Accurate for chunked transfer, gzip-encoded responses, and streaming media.
80+
- **`modificationHistory` capped at 200 entries with FIFO eviction.** `undoModification` error now reports `"No modification at index N. History has 200 entries (most recent 200 only — M earlier entries evicted at the cap)."` when the requested index is out of range AND the buffer has overflowed.
81+
- **`/activity/stream` and `/inspector/events` refactored through `createSseEndpoint`.** Both endpoints collapse from ~45 lines of inline `ReadableStream` wiring to ~8 lines of helper config; behavior preserved bit-for-bit.
82+
- **`memory` command classified under the `Server` category** in `COMMAND_DESCRIPTIONS` so it appears in the generated SKILL.md tables alongside `status` / `restart` / `handoff`.
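The `modificationHistory` cap in the list above is a bounded FIFO with an eviction counter. A minimal sketch under stated assumptions: the class name and its methods are hypothetical, and the error text only approximates the message quoted in the changelog.

```typescript
class BoundedHistory<T> {
  private readonly cap: number;
  private items: T[] = [];
  private evicted = 0;

  constructor(cap = 200) {
    this.cap = cap;
  }

  push(item: T): void {
    this.items.push(item);
    if (this.items.length > this.cap) {
      this.items.shift(); // FIFO: the oldest entry is dropped first
      this.evicted++;
    }
  }

  get length(): number {
    return this.items.length;
  }

  get evictedCount(): number {
    return this.evicted;
  }

  // Throws with an eviction-aware message when the index is out of range.
  get(index: number): T {
    if (!Number.isInteger(index) || index < 0 || index >= this.items.length) {
      const base = `No modification at index ${index}. History has ${this.items.length} entries`;
      const detail =
        this.evicted > 0
          ? ` (most recent ${this.cap} only; ${this.evicted} earlier entries evicted at the cap).`
          : ".";
      throw new RangeError(base + detail);
    }
    return this.items[index] as T;
  }
}
```

Tracking the evicted count separately is what lets the error explain why an index the user remembers is no longer reachable, rather than reporting a bare out-of-range failure.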
83+
84+
#### For contributors
85+
- Plan completion audit: 12 of 17 plan items DONE, 2 CHANGED (deliberate scope decisions documented in the relevant commits — `req.sizes()` swap simpler than a single-context CDP listener; tab guardrail action toast wired through `$B closetab` instead of a `chrome.tabs.remove` bridge), 1 deferred to periodic tier (UI E2E tests).
86+
- Coverage audit: 44% pre-diagnostic-tests → ~62% after adding the formatter coverage. Strong paths (CDP session lifecycle, body materialization, history cap, tab guardrail, SSE cleanup) all at 100% with invariant tests. Extension UI tests deferred (no extension test harness in this repo today).
87+
- The CDP-session cleanup tripwire is the most reusable artifact here — any future addition of CDP work should route through the two helpers. Trying to call `newCDPSession` outside `cdp-bridge.ts` fails CI immediately with a pointer to the right helper.
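The static-grep tripwire described above can be approximated as a scan over source text. This sketch is not the contents of `browse/test/cdp-session-cleanup.test.ts`: it works on an in-memory map of file contents instead of reading the repository, and the function name, regex, and `Violation` shape are assumptions.

```typescript
// In-memory stand-in for files read from disk; a real test would walk browse/src.
type SourceFiles = Record<string, string>;

interface Violation {
  file: string;
  line: number; // 1-based
  text: string;
}

const ALLOWED_FILE = "browse/src/cdp-bridge.ts";
// No `g` flag, so test() carries no state between lines.
const DIRECT_CALL = /\bnewCDPSession\s*\(/;

function findDirectCdpSessionCalls(files: SourceFiles): Violation[] {
  const violations: Violation[] = [];
  for (const [file, content] of Object.entries(files)) {
    if (file === ALLOWED_FILE) continue; // the helper module may call it directly
    content.split("\n").forEach((text, i) => {
      if (DIRECT_CALL.test(text)) {
        violations.push({ file, line: i + 1, text: text.trim() });
      }
    });
  }
  return violations;
}
```

A grep of this kind is deliberately blunt: it also flags the call inside comments or strings, which is usually acceptable for a tripwire whose job is to force a second look.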
88+
3689
## [1.48.0.0] - 2026-05-26
3790

3891
## **Agents stop dropping AskUserQuestion options when there are 5+.** A new canonical preamble rule + runtime gate makes Conductor's 4-option cap a split-or-batch decision, not a silent trim.

CLAUDE.md

Lines changed: 20 additions & 0 deletions
@@ -294,6 +294,26 @@ response in `server.ts`, read
294294
`browse/test/server-sanitize-surrogates.test.ts` pins the wiring with invariant
295295
tests, so bypasses fail CI.
296296

297+
**SSE endpoint helper** (v1.51.0.0+). New SSE endpoints in `server.ts` MUST route
298+
through `createSseEndpoint(req, config)` from `browse/src/sse-helpers.ts`. The
299+
helper owns the cleanup contract (abort + enqueue-throw + heartbeat-throw, all
300+
idempotent) and bakes in `sanitizeLoneSurrogates` on every JSON.stringify, so
301+
new subscribers can't accidentally regress either invariant. Inline
302+
`ReadableStream` wiring leaked subscribers when the TCP connection died without
303+
firing `req.signal.abort` (Chromium MV3 service-worker suspend, intermediate
304+
proxy half-close). `/activity/stream`, `/inspector/events`, and `/memory`
305+
(SSE-eligible) all route through it. `browse/test/sse-helpers.test.ts` pins the
306+
cleanup contract.
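To make the cleanup contract concrete, here is a minimal sketch of an idempotent cleanup that fires on any of the three edges (abort, enqueue-throw, heartbeat-throw). It is not the `createSseEndpoint(req, config)` implementation: the `Sink` interface stands in for a stream controller, `subscribe` and its parameters are hypothetical, and surrogate sanitization is omitted.

```typescript
// Minimal stand-in for a ReadableStream controller; only enqueue matters here.
interface Sink {
  enqueue(chunk: string): void;
}

interface Subscription {
  send(event: unknown): void; // enqueue edge
  heartbeat(): void; // heartbeat edge
  abort(): void; // abort edge
}

function subscribe(
  subscribers: Set<Sink>,
  sink: Sink,
  onCleanup: () => void = () => {},
): Subscription {
  let cleaned = false;

  const cleanup = (): void => {
    if (cleaned) return; // idempotent: whichever edge fires first wins
    cleaned = true;
    subscribers.delete(sink);
    onCleanup();
  };

  const tryEnqueue = (chunk: string): void => {
    if (cleaned) return;
    try {
      sink.enqueue(chunk);
    } catch {
      // A throwing enqueue means the consumer is gone even if abort never fired.
      cleanup();
    }
  };

  subscribers.add(sink);
  return {
    send: (event) => tryEnqueue(`data: ${JSON.stringify(event)}\n\n`),
    heartbeat: () => tryEnqueue(": heartbeat\n\n"), // SSE comment line
    abort: cleanup,
  };
}
```

The heartbeat is what turns a silently dead TCP connection into a detectable failure: without a periodic write, a subscriber with no pending events would never hit the enqueue-throw edge.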
307+
308+
**CDP session lifecycle** (v1.51.0.0+). Direct `page.context().newCDPSession(page)`
309+
calls outside `browse/src/cdp-bridge.ts` fail CI via the static-grep tripwire in
310+
`browse/test/cdp-session-cleanup.test.ts`. Use `withCdpSession(page, async (s) => {...})`
311+
for one-shot CDP work (try/finally detach) or `getOrCreateCdpSession(page, cache)`
312+
for cached sessions tied to a page's lifetime (close-detach via `Map<page, session>`).
313+
Three sites migrated: cdp-bridge frame events, write-commands archive capture,
314+
cdp-inspector. The helpers prevent the per-session leak class where successful-path
315+
detach happened but error-path detach was missed.
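The one-shot helper described above is essentially a try/finally wrapper. A minimal sketch, assuming structural stand-ins for Playwright's `Page` and `CDPSession` (the `PageLike` and `CdpSessionLike` interfaces are hypothetical, and the real helper in `browse/src/cdp-bridge.ts` may differ in signature and error handling):

```typescript
// Structural stand-ins so the sketch runs without Playwright installed.
interface CdpSessionLike {
  detach(): Promise<void>;
}

interface PageLike {
  context(): { newCDPSession(page: PageLike): Promise<CdpSessionLike> };
}

async function withCdpSession<T>(
  page: PageLike,
  fn: (session: CdpSessionLike) => Promise<T>,
): Promise<T> {
  const session = await page.context().newCDPSession(page);
  try {
    return await fn(session);
  } finally {
    // Runs on success and on error. Detach failures are swallowed here
    // (assumption), since the target may already be gone.
    await session.detach().catch(() => {});
  }
}
```

Because the detach sits in `finally`, a throw inside `fn` still releases the session, which is the error-path case the paragraph above says the earlier call sites missed.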
316+
297317
**Setup symlink hardening** (v1.38.0.0+). Every link site in `setup` MUST route
298318
through the `_link_or_copy SRC DST` helper near the `IS_WINDOWS` detection. On
299319
Windows without Developer Mode, plain `ln -snf` produces frozen file copies that

SKILL.md

Lines changed: 1 addition & 0 deletions
@@ -963,6 +963,7 @@ Refs are invalidated on navigation — run `snapshot` again after `goto`.
 | `disconnect` | Disconnect headed browser, return to headless mode |
 | `focus [@ref]` | Bring headed browser window to foreground (macOS) |
 | `handoff [message]` | Open visible Chrome at current page for user takeover |
+| `memory [--json]` | Snapshot Bun heap + per-tab JS heap + Chromium process tree + bounded buffer sizes. JSON output with --json. |
 | `restart` | Restart server |
 | `resume` | Re-snapshot after user takeover, return control to AI |
 | `state save|load <name>` | Save/load browser state (cookies + URLs) |

TODOS.md

Lines changed: 135 additions & 0 deletions
@@ -1,5 +1,140 @@
11
# TODOS
22

3+
## gbrowser memory follow-ups (filed via /plan-eng-review + /codex on the v1.49 leak-fix PR)
4+
5+
These four items came out of the memory-leak investigation that shipped
6+
the `$B memory` diagnostic + the four leak fixes. They were
7+
deliberately deferred from that PR (already 14 commits / ~12 files);
8+
each stands alone and any one could ship independently.
9+
10+
### P2: MV3 extension service worker memory profile
11+
12+
**What:** The `/memory` endpoint snapshot enumerates pages but does
13+
not enumerate the gstack baked-in extension's service-worker target.
14+
A long-running MV3 service worker can leak through retained DOM
15+
snapshots, message ports that never close, alarms that re-arm, and
16+
caches that grow without bound. The diagnostic should call
17+
`Target.getTargets` with a filter for `service_worker` and include
18+
each one in `tabs[]` (or a sibling `serviceWorkers[]` array) with the
19+
same `Performance.getMetrics` data.
20+
21+
**Why:** Codex's outside-voice review on the eng-review surfaced this
22+
class of leak (the extension is part of the gbrowser process tree but
23+
invisible to today's snapshot). Until we surface it, a SW leak shows
24+
up only in the parent process RSS with no per-target attribution.
25+
26+
**Pros:** Closes the per-target attribution gap for the
27+
single-most-likely future leak source (our own extension).
28+
**Cons:** Extension SW lifecycle is asymmetric vs page lifecycle;
29+
auto-attach + filter is one more piece of CDP plumbing.
30+
31+
**Context:** Codex finding #4 on the eng-review outside voice. Not
32+
in scope of the v1.49 PR; deliberately deferred to keep the PR to
33+
the four highest-confidence leak fixes.
34+
35+
**Priority:** P2. **Effort:** M.
36+
37+
---
38+
39+
### P2: Native + GPU memory breakdown in `$B memory`
40+
41+
**What:** `$B memory` shows Bun RSS + per-tab JS heap + Chromium
42+
process tree (PIDs + types + CPU time) but the per-process RSS is
43+
absent — `SystemInfo.getProcessInfo` doesn't expose RSS and the eng
44+
review (D2 USE_CDP) explicitly chose CDP over shelling to `ps`. The
45+
honest next step is to surface what CDP DOES give for the other
46+
memory categories: `Memory.getDOMCounters` per target (node + listener
47+
counts), `SystemInfo.getInfo` for GPU memory, `Memory.getAllTimeSamplingProfile`
48+
for a sampled native estimate.
49+
50+
**Why:** Codex's outside-voice review flagged that
51+
`Performance.getMetrics` misses native memory, GPU memory, video
52+
buffers, Skia, network cache, extension process RSS, and
53+
browser-process RSS — all the categories where a 160 GB leak would
54+
actually live. A diagnostic that misses the categories where the
55+
leak class lives undersells itself.
56+
57+
**Pros:** Per-process category breakdown closes the gap between
58+
"Activity Monitor says 160 GB" and what the diagnostic shows.
59+
**Cons:** Each CDP method has its own quirks; this is a real
60+
implementation pass, not a one-line addition.
61+
62+
**Context:** Codex finding #5 on the eng-review outside voice. Not
63+
in scope of the v1.49 PR; deliberately deferred.
64+
65+
**Priority:** P2. **Effort:** M.
66+
67+
---
68+
69+
### P3: Single-context CDP listener for Network.loadingFinished
70+
71+
**What:** `wirePageEvents` attaches a `page.on('requestfinished')`
72+
listener PER PAGE. The D10 fix removed the body-materialization leak
73+
inside that listener but kept the per-page listener architecture
74+
(7 listeners attached per tab — close, framenavigated, dialog,
75+
console, request, response, requestfinished). The stretch goal from
76+
D10 was to replace the per-page `requestfinished` listener with a
77+
single context-level CDP listener via
78+
`Target.setAutoAttach({autoAttach: true, waitForDebuggerOnStart: false,
79+
flatten: true})` and a browser-wide `Network.loadingFinished` event
80+
handler.
81+
82+
**Why:** Going from N to 1 listener for the request-size capture is
83+
structurally the right architecture and removes one piece of per-tab
84+
memory pressure. The body-materialization fix already addressed the
85+
acute leak; this is the architectural cleanup that prevents similar
86+
leaks in the same class.
87+
88+
**Pros:** One listener per browser instead of one per tab.
89+
**Cons:** `Target.setAutoAttach` plumbing is more code than the
90+
straight per-page listener; the marginal memory win is small on top
91+
of the body-fetch fix that already landed.
92+
93+
**Context:** D10 stretch goal on the eng-review. The minimal-risk
94+
fix shipped in v1.49 (replaces `await res.body()` with
95+
`await req.sizes()`, preserving the per-page listener); this is the
96+
architectural follow-up.
97+
98+
**Priority:** P3. **Effort:** M-L.
99+
100+
---
101+
102+
### P3: Real-Chromium peak-RSS reproducer (periodic tier)
103+
104+
**What:** The gate-tier reproducer
105+
(`browse/test/memory-leak-reproducer.test.ts`) pins the invariant
106+
that `res.body()` is never called during a burst of
107+
`requestfinished` events. It uses a fake page; it does NOT spin up a
108+
real Chromium nor measure peak Bun RSS during a real concurrent fetch
109+
burst. A periodic-tier follow-up should: spin up a real headless
110+
Chromium, navigate to a fixture page that concurrently fetches 500
111+
mixed responses (small JSON, 100 KB images, 10 MB chunked,
112+
gzip-compressed 2 MB), sample `process.memoryUsage().heapUsed` every
113+
100 ms during the burst, assert `peak_heap < 200 MB above baseline`
114+
AND `post-gc_heap < 30 MB above baseline`. Also include a single-tab
115+
WebGL canvas variant that grows to >4 GB and asserts the per-tab JS-heap
116+
toast fires.
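The two assertions proposed above reduce to a comparison of sampled heap values against a baseline. A minimal sketch of that check, with hypothetical names; the 200 MB and 30 MB limits come from the TODO text, while the binary definition of a megabyte is an assumption:

```typescript
const MB = 1024 * 1024; // assumption: binary megabytes

interface HeapBudgetResult {
  peakDeltaBytes: number;
  postGcDeltaBytes: number;
  ok: boolean;
}

// `samples` are heapUsed readings taken during the burst; `postGc` is one reading after GC.
function checkHeapBudget(
  baseline: number,
  samples: number[],
  postGc: number,
  peakLimit: number = 200 * MB,
  postGcLimit: number = 30 * MB,
): HeapBudgetResult {
  const peak = samples.reduce((max, s) => Math.max(max, s), baseline);
  const peakDeltaBytes = peak - baseline;
  const postGcDeltaBytes = postGc - baseline;
  return {
    peakDeltaBytes,
    postGcDeltaBytes,
    ok: peakDeltaBytes < peakLimit && postGcDeltaBytes < postGcLimit,
  };
}
```

In a real periodic-tier test the samples would come from a timer calling `process.memoryUsage().heapUsed` every 100 ms while the fixture page runs its fetch burst.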
117+
118+
**Why:** Codex flagged that the leak's real failure mode is transient
119+
amplification under concurrent burst, not retained leak — a steady-state
120+
heap test misses it. The fake-page gate-tier test catches the
121+
listener-architecture regression; the periodic real-browser test
122+
catches the actual peak-RSS class.
123+
124+
**Pros:** Closes the "did we actually demonstrate the OOM is fixed"
125+
question with hard numbers. Feeds the ANGLE_B_NUMBERS CHANGELOG
126+
release-summary table.
127+
**Cons:** Periodic tier costs minutes of CI time and money per run;
128+
real-browser memory tests are inherently flaky.
129+
130+
**Context:** Codex outside-voice finding on the eng-review; D7
131+
ANGLE_B_NUMBERS CHANGELOG framing needs this reproducer's numbers
132+
before /ship time.
133+
134+
**Priority:** P3. **Effort:** M.
135+
136+
---
137+
3138
## design daemon: follow-ups (filed v1.45.0.0 via /ship review army)
4139

5140
### ✅ DONE (v1.45.0.0): Tighten daemon test coverage

VERSION

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-1.49.0.0
+1.53.0.0

browse/SKILL.md

Lines changed: 1 addition & 0 deletions
@@ -921,6 +921,7 @@ $B prettyscreenshot --cleanup --scroll-to ".pricing" --width 1440 ~/Desktop/hero
 | `disconnect` | Disconnect headed browser, return to headless mode |
 | `focus [@ref]` | Bring headed browser window to foreground (macOS) |
 | `handoff [message]` | Open visible Chrome at current page for user takeover |
+| `memory [--json]` | Snapshot Bun heap + per-tab JS heap + Chromium process tree + bounded buffer sizes. JSON output with --json. |
 | `restart` | Restart server |
 | `resume` | Re-snapshot after user takeover, return control to AI |
 | `state save|load <name>` | Save/load browser state (cookies + URLs) |
