Skip to content

Drain deferred disposeBrowserContext on next createBrowserContext (#2472 follow-up)#2476

Closed
staylor wants to merge 1 commit into
lightpanda-io:mainfrom
staylor:fix/2472-deferred-dispose
Closed

Drain deferred disposeBrowserContext on next createBrowserContext (#2472 follow-up)#2476
staylor wants to merge 1 commit into
lightpanda-io:mainfrom
staylor:fix/2472-deferred-dispose

Conversation

@staylor
Copy link
Copy Markdown
Contributor

@staylor staylor commented May 15, 2026

Follow-up to #2472 / #2474. The frame-ID PR fixed the cross-BrowserContext target-ID collision. This one fixes a second Cannot have more than one browser context at a time failure that the same Playwright workload hits when the CDP connection is long-lived.

Problem

CDP.disposeBrowserContext (src/cdp/CDP.zig:378) defers cleanup when a script eval is on the stack. The existing comment is right about why -- a CDP message drained inside HttpClient.syncRequest arrives mid-evaluate, and tearing down the bc here would free Session / V8 / inspector state the unwinding script frame is about to dereference. The comment also said cleanup was "deferred to CDP.deinit at connection close, by which time eval has unwound."

That assumption holds for short-lived CDP connections that close after each browser session. It breaks for long-lived ones. Playwright's chromium.connectOverCDP keeps the websocket open across many BrowserContexts: once a dispose hits the deferred branch, browser_context stays non-null, and the next createBrowserContext rejects with error.AlreadyExists -> Cannot have more than one browser context at a time for the rest of the connection. There is no path to recovery short of disconnecting and reconnecting, which defeats the point of pooling.

Reproduction

Minimal Node + Playwright + a slow <script src> server:

import { chromium } from "playwright-core";
import http from "node:http";
import { spawn } from "node:child_process";

// Slow script server: holds /slow.js open for 3 s so syncRequest is in flight.
const server = http.createServer((req, res) => {
  if (req.url === "/slow.js") {
    setTimeout(() => res.end("window.__loaded=true;"), 3_000);
    return;
  }
  res.writeHead(200, { "content-type": "text/html" });
  res.end(`<!doctype html><script src="/slow.js"></script>`);
});
await new Promise((r) => server.listen(0, "127.0.0.1", r));
const PAGE_URL = `http://127.0.0.1:${server.address().port}/`;

const lp = spawn("lightpanda", ["serve", "--host", "127.0.0.1", "--port", "9229"], { stdio: "inherit" });
await new Promise((r) => setTimeout(r, 1_500));
const browser = await chromium.connectOverCDP("ws://127.0.0.1:9229");

for (let i = 1; i <= 3; i++) {
  const ctx = await browser.newContext();
  const page = await ctx.newPage();
  page.goto(PAGE_URL).catch(() => {});           // start sync /slow.js fetch
  await new Promise((r) => setTimeout(r, 500));  // close mid-fetch
  await ctx.close();                             // -> deferred dispose
  await new Promise((r) => setTimeout(r, 4_000)); // wait long enough for eval to unwind
}

Before this PR (against main):

cycle 1 close() returned
FAILED: browser.newContext: Protocol error (Target.createBrowserContext):
  Cannot have more than one browser context at a time

After:

cycle 1 close() returned
cycle 2 close() returned
cycle 3 close() returned
ALL CYCLES OK

Fix

Three small changes in src/cdp/CDP.zig:

  1. New pending_dispose: bool flag on BrowserContext. Set in the deferred branch of disposeBrowserContext instead of just returning. Marks the bc as logically gone but physically still alive while the eval frame above us unwinds.

  2. New private flushPendingDispose helper. Re-checks anyScriptEvaluating() and runs the same bc.deinit + closeSession + browser_context = null the non-deferred path would have if eval is now off the stack. Returns false if eval is still on the stack so the caller can surface that to the client.

  3. createBrowserContext calls flushPendingDispose before checking AlreadyExists. If the previous deferred dispose can be drained, the new context proceeds normally. If eval is still on the stack at the next create attempt (very tight cycle), the create still rejects with AlreadyExists -- waiting for eval to finish is the client's problem, but recovery now happens automatically once it does.

  4. isValidSessionId treats pending_dispose == true as no bc. Commands on the old session_id now reject with Unknown sessionId instead of silently routing into a half-alive context.

Session.removePage has the same deferred pattern but isn't in the client-visible recovery path (Playwright never calls page-level dispose, only context-level), so this PR doesn't touch it. Filing separately if it turns out to bite something.

Tests

zig build test: 653 / 653 pass (652 existing + 1 new).

Added cdp.target: createBrowserContext flushes deferred dispose once eval unwinds:

  • Forces is_evaluating = true on the active page's frame.
  • Asserts disposeBrowserContext returns success and flips pending_dispose.
  • Asserts createBrowserContext rejects with AlreadyExists while eval is still on the stack.
  • Drops is_evaluating = false (eval has unwound).
  • Asserts the next createBrowserContext flushes the pending dispose and returns a distinct bc.

Verified the test catches the bug: reverting the CDP.zig changes (keeping only the test) produces a single-test failure, the rest of the suite still passes.

Relationship to other PRs

Independent of both -- can land in any order. With all three, Playwright connectOverCDP becomes safe to use as a long-lived pooled connection.

`CDP.disposeBrowserContext` defers cleanup when a script eval is on the
stack (a CDP message drained inside `HttpClient.syncRequest` arrives
mid-evaluate; tearing down the bc here would free Session/V8/inspector
state the unwinding script frame is about to dereference). The
existing comment said cleanup was "deferred to CDP.deinit at connection
close, by which time eval has unwound."

That assumption holds for short-lived CDP connections that close after
each browser session. It breaks for long-lived ones (Playwright's
`chromium.connectOverCDP` keeps the websocket open across many
contexts): once a dispose hits the deferred branch, `browser_context`
stays non-null, and the next `createBrowserContext` rejects with
`error.AlreadyExists` -> `Cannot have more than one browser context
at a time` for the rest of the connection.

Fix: mark the bc with a new `pending_dispose` flag in the deferred
branch instead of just returning, and drain that pending dispose on
the next `createBrowserContext` once eval has unwound. The drain runs
the same `bc.deinit` + `closeSession` + `browser_context = null` the
non-deferred path would have. If eval is still on the stack at the
next create attempt (very tight cycle), the create still rejects with
AlreadyExists -- waiting for eval to finish is the client's problem,
but recovery now happens automatically once it does.

Also updates `isValidSessionId`: a pending-dispose bc is logically
gone from the client's POV, so commands on its old session_id reject
like any other unknown session instead of silently routing into a
half-alive context.

Reproduction (`playwright-core@1.58.2`) -- a slow `<script src>` load
forces `disposeBrowserContext` into the deferred branch, then a
later `newContext` (after the script has finished) recovers cleanly:

  for (let i = 1; i <= 3; i++) {
    const ctx = await browser.newContext();
    const page = await ctx.newPage();
    page.goto(SLOW_PAGE_URL).catch(() => {});
    await new Promise((r) => setTimeout(r, 500));   // close mid-fetch
    await ctx.close();
    await new Promise((r) => setTimeout(r, 4000));  // let eval unwind
  }

Before: cycle 2's `newContext()` throws
  `Cannot have more than one browser context at a time`.
After: 3/3 cycles complete cleanly.

Tests: 653 / 653 pass. Added regression coverage in
`cdp.target: createBrowserContext flushes deferred dispose once eval
unwinds` -- forces `is_evaluating = true`, asserts dispose defers and
flips `pending_dispose`, asserts `createBrowserContext` rejects while
eval is still on the stack, then drops `is_evaluating` and asserts
the next `createBrowserContext` flushes the pending dispose and
returns a distinct bc.

Notes:
- `Session.removePage` has the same deferred pattern but isn't in the
  client-visible recovery path (Playwright never calls page-level
  dispose, only context-level), so this PR doesn't touch it.
- Independent of lightpanda-io#2472 (frame-ID generator scope) and lightpanda-io#2399 (Playwright
  STARTUP-session promotion). The three together close the path for
  Playwright `connectOverCDP` to recycle BrowserContexts safely.
@karlseguin
Copy link
Copy Markdown
Collaborator

Closing. Fixed (or at least significantly improved) by #2495

@karlseguin karlseguin closed this May 21, 2026
@github-actions github-actions Bot locked and limited conversation to collaborators May 21, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants