Promote synthetic STARTUP session for Playwright connectOverCDP #2399
staylor wants to merge 1 commit into
Lightpanda's CDP server advertises a synthetic STARTUP target +
session in setAutoAttach so drivers don't block waiting for a target
to appear. The dispatcher then blindly {}-acked everything that
arrived with sessionId="STARTUP" except Page.getFrameTree, on the
assumption the driver would call Target.createBrowserContext +
Target.createTarget itself before sending real work.
That assumption holds for puppeteer-core (it does call
createBrowserContext + createTarget on a real session_id) but breaks
playwright-core: chromium.connectOverCDP auto-attaches to whatever
target is advertised and immediately drives it (Page.enable,
Page.navigate, ...) on sessionId="STARTUP", never calling
createBrowserContext / createTarget itself. With the old dispatcher,
Page.navigate returned {} and no Page.frameNavigated / loadEventFired
events ever fired, so page.goto hung until its timeout expired.
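To make the failure mode concrete, here is a minimal sketch of the old behavior, written in TypeScript rather than Lightpanda's Zig. The function name, the types, and the reduced getFrameTree payload are illustrative assumptions, not the actual implementation.

```typescript
// Simplified model of the pre-fix STARTUP dispatcher described above.
// All names and shapes are illustrative, not Lightpanda's real code.
type CdpEvent = { method: string };
type Outcome = { result: object; events: CdpEvent[] };

function oldDispatchStartup(method: string): Outcome {
  if (method === "Page.getFrameTree") {
    // The one command that received a synthetic, non-empty answer.
    return { result: { frameTree: { frame: { loaderId: "LID-STARTUP" } } }, events: [] };
  }
  // Everything else: an empty ack, no state change, no events.
  return { result: {}, events: [] };
}

const navigateOutcome = oldDispatchStartup("Page.navigate");
const sawFrameNavigated = navigateOutcome.events.some(
  (e) => e.method === "Page.frameNavigated",
);
console.log(JSON.stringify(navigateOutcome.result), sawFrameNavigated); // {} false
```

A driver that waits for Page.frameNavigated after that empty ack has nothing to wait on, which is why page.goto never completes.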
Reproducer: drive lightpanda serve with playwright-core's
chromium.connectOverCDP and call page.goto on any URL.
Fix:
* dispatchStartupCommand now lazily promotes the synthetic STARTUP
session into a real BrowserContext + Target whose session_id is
the literal string "STARTUP" the first time a STARTUP-tagged
command actually requires real page state, then routes through
dispatchCommand normally. After promotion, isValidSessionId
accepts "STARTUP" and every subsequent command flows through the
standard handlers.
Target.* and Runtime.runIfWaitingForDebugger are explicitly held
out: Puppeteer sends Target.setAutoAttach and
Runtime.runIfWaitingForDebugger with sessionId="STARTUP"
*between* puppeteer.connect() and its own
Target.createBrowserContext call. Promoting on those would steal
the bc out from under Puppeteer's createBrowserContext (which
would then error with "Cannot have more than one browser
context at a time"). Silently {}-acking them, as before, keeps
Puppeteer's flow intact.
Once a bc exists with session_id != "STARTUP" (i.e. a real
session_id was assigned by createTarget + doAttachtoTarget),
further STARTUP-tagged commands are rejected with -32001
"Unknown sessionId" rather than silently no-oped, since they're
now stale.
* promoteStartupSession (new in domains/target.zig) mirrors the
bootstrap portion of createTarget but skips the
Target.targetCreated / Target.attachedToTarget events: the driver
already received Target.attachedToTarget in setAutoAttach and a
duplicate would make Playwright think two sessions exist for one
target and try to drive both.
* The synthetic targetId / browserContextId in setAutoAttach were
"TID-STARTUP" / "BID-STARTUP". They are now "FID-0000000001" /
"BID-1" — the same strings the first real Target / BrowserContext
will be assigned (Session.nextFrameId returns 1 first; the bc id
generator returns BID-1 first). After promotion, every event
Lightpanda emits (Page.frameNavigated, Runtime.executionContextCreated,
etc.) carries IDs that line up with what the driver already
recorded from the synthetic event. Page.getFrameTree's synthetic
placeholder was updated to match (LID-STARTUP -> LID-0000000001).
* doAttachtoTarget now reuses the literal string "STARTUP" as
session_id (and suppresses the duplicate Target.attachedToTarget
event) when setAutoAttach previously advertised the synthetic
STARTUP session and bc.session_id is still null. This handles the
Puppeteer flow: createTarget would otherwise generate a fresh
SID-1 and emit a second attachedToTarget for the same targetId,
causing Puppeteer to try to initialize Page state over both
sessions in parallel; the STARTUP one would then fail every
command with "Unknown sessionId" because bc.session_id had been
overwritten to SID-1.
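The routing and session-id rules in the list above can be sketched as a small model. This is a hypothetical TypeScript rendering, not the Zig code in the PR: the type names, the function names, and the order of the checks are assumptions, and the Page.getFrameTree placeholder path is left out.

```typescript
// Hypothetical model of the post-fix handling of sessionId="STARTUP".
// Lightpanda implements this in Zig; every name below is illustrative.

// null = no BrowserContext yet; sessionId null = bc exists, nothing attached.
type Bc = { sessionId: string | null } | null;

type Route =
  | { kind: "ack" }      // reply {} and do nothing, as before the fix
  | { kind: "promote" }  // create bc + target with session_id "STARTUP", then dispatch
  | { kind: "dispatch" } // already promoted: standard handlers
  | { kind: "error"; code: number; message: string };

function isHeldOut(method: string): boolean {
  return method.startsWith("Target.") || method === "Runtime.runIfWaitingForDebugger";
}

// Assumption: the stale-session check runs before the held-out check.
function routeStartupCommand(method: string, bc: Bc): Route {
  if (bc !== null) {
    return bc.sessionId === "STARTUP"
      ? { kind: "dispatch" }
      : { kind: "error", code: -32001, message: "Unknown sessionId" };
  }
  return isHeldOut(method) ? { kind: "ack" } : { kind: "promote" };
}

// Models the doAttachtoTarget rule: reuse "STARTUP" and skip the duplicate
// Target.attachedToTarget when the synthetic session was already advertised.
function chooseSessionId(
  startupAdvertised: boolean,
  bcSessionId: string | null,
  nextSid: () => string,
): { sessionId: string; emitAttached: boolean } {
  if (startupAdvertised && bcSessionId === null) {
    return { sessionId: "STARTUP", emitAttached: false };
  }
  return { sessionId: nextSid(), emitAttached: true };
}

console.log(routeStartupCommand("Page.navigate", null).kind);                      // promote
console.log(routeStartupCommand("Target.setAutoAttach", null).kind);               // ack
console.log(routeStartupCommand("Page.enable", { sessionId: "SID-1" }).kind);      // error
console.log(chooseSessionId(true, null, () => "SID-1").sessionId);                 // STARTUP
```

In this model Playwright takes the promote branch on its first Page.* command, while Puppeteer only ever hits the ack branch before creating its own context and then lands on the reused "STARTUP" id at attach time.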
Verified:
* playwright-core 1.59.1 chromium.connectOverCDP + page.goto on
https://www.allbirds.com/products/mens-wool-runners now returns
status=200 and fires framenavigated / domcontentloaded / load
events. Server stays alive across 10 sequential runs.
* puppeteer-core 24.42.0 puppeteer.connect() + createBrowserContext
+ newPage().goto() still works end-to-end and extracts the
expected <title>. (Server then segfaults on disconnect from a
separate worker re-entrancy bug fixed in lightpanda-io#2398.)
* 523/523 unit tests pass; the existing "cdp: STARTUP sessionId"
test was rewritten to assert the new lazy-promote behavior, and
two new tests cover the rejection paths (bc with non-STARTUP
session_id, bc with no session_id at all).
Hello @staylor, first, thank you for the PR.
Do you have a concrete example where the current behavior blocks you, i.e. where you can't create the first browser context yourself?
While I agree w/ your initial analysis and the issue w/ clients trying to use the default browser context/target, I'm not sure about supporting lazy load on auto-attach. As you mentioned, our idea is to offer an empty browser first and leave it to the client to create the new browser context and the new page.
Lazy load is smart, but complex, since you have different paths depending on the client (which can change in the future). And it doesn't really work in the case of a Playwright script using
So the main problem is that we don't have clear paths between Playwright scripts using the default BC and the ones creating a clean new BC. That's why we have to explain to Stagehand's users to create a new page while the official doc reuses the existing one.
// Important: in the official documentation, Stagehand uses the default
// existing page. But Lightpanda requires explicit page creation
// instead.
const page = await stagehand.context.newPage();
Even if we could create a default BC, we don't want to penalize all Playwright users by creating two if they don't use the default one. That's why I think it's better to ask users to create the context manually instead of trying to use the default one.
But at least one thing could be improved: it would be better to return an explicit error than to silently accept the command on
WDYT?
Complementary to LP.setSubframeLoading (preceding commit): exposes
the same iframe-skip behavior as a CLI option that applies to all
sessions in the process. Useful for:
* the 'fetch' subcommand (no CDP driver to call LP.setSubframeLoading)
* 'serve' deployments where the operator wants iframes off by
default for every connecting client (the LP method can still
re-enable per-session if needed)
* Playwright's chromium.connectOverCDP, which can't reliably issue
custom CDP methods on Lightpanda today: BrowserContext.newCDPSession
and Browser.newBrowserCDPSession both attach a new CRSession that
collides with the STARTUP-session reuse from lightpanda-io#2399, triggering a
Playwright internal assertion. With --disable-subframes set on the
server, Playwright doesn't need to issue any custom CDP: every
session inherits subframes-off and the executionContextId churn
from lightpanda-io#2400 never trips.
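One way to picture how the process-wide flag and the per-session LP method interact is the sketch below. It is an assumption-laden model: the function name and the boolean override are hypothetical, and the real parameters of LP.setSubframeLoading are not shown in this thread.

```typescript
// Assumed model: --disable-subframes sets a process-wide default and a
// session may override it through LP.setSubframeLoading.
// undefined = the session never called the LP method.
type SubframeOverride = boolean | undefined;

function subframesEnabled(disableSubframesFlag: boolean, override: SubframeOverride): boolean {
  // A per-session choice wins; otherwise fall back to the CLI default.
  return override ?? !disableSubframesFlag;
}

console.log(subframesEnabled(true, undefined)); // false: flag set, no override
console.log(subframesEnabled(true, true));      // true: session re-enabled subframes
console.log(subframesEnabled(false, undefined)); // true: default behavior
```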
Verified:
serve --disable-subframes + plain puppeteer-core goto
[ok] goto status=200 elapsed=6354ms frameAttached=0
fetch --disable-subframes --dump html https://www.allbirds.com/...
exit=0
html bytes: 1021562
title: <title>Allbirds Wool Runners, Men's | ...</title>
iframe count in dumped html: 2 (still in DOM, just not loaded)
521/521 unit tests pass.
This is less important for me personally now, since I just shifted to using Puppeteer instead. I'll come back to this if someone needs to confirm Playwright behavior.
Summary
Makes Lightpanda's CDP server work with playwright-core's chromium.connectOverCDP. Previously, Playwright would auto-attach to the synthetic STARTUP target Lightpanda advertises in Target.setAutoAttach, drive it with Page.enable / Page.navigate / etc. on sessionId="STARTUP", and then time out waiting for Page.frameNavigated events that never came: the dispatcher silently {}-acked every STARTUP-tagged command except Page.getFrameTree.
Reproduction
puppeteer-core was unaffected because it uses a different protocol shape: puppeteer.connect() → browser.createBrowserContext() → context.newPage(), which sends Target.createBrowserContext + Target.createTarget (no sessionId) and then drives the new target on a real session_id assigned by doAttachtoTarget. It never sends a STARTUP-tagged work command.
Root cause
Lightpanda's CDP server is built around the assumption "browser starts empty, driver creates everything." Target.setAutoAttach advertises a synthetic placeholder target with sessionId="STARTUP" so drivers don't block waiting for a real target to appear. The comment in setAutoAttach makes the assumption explicit: "Hopefully, the first thing they'll do is create a real BrowserContext and progress from there."
Playwright's chromium.connectOverCDP does the opposite. It assumes a real Chrome on the other end and auto-attaches to whatever target is advertised, then immediately drives that target. The dispatcher's dispatchStartupCommand:
silently drops Page.enable, Page.navigate, Network.enable, Runtime.enable, Page.setLifecycleEventsEnabled, Page.addScriptToEvaluateOnNewDocument, Emulation.*, etc. After Page.navigate, no Page.frameNavigated / lifecycleEvent / loadEventFired events ever fire and page.goto times out.
Fix
Three coordinated changes:
1. Lazy-promote on first STARTUP work command. dispatchStartupCommand now creates a real BrowserContext + Target whose session_id is the literal string "STARTUP" the first time a STARTUP-tagged command needs real page state, then routes through the normal dispatcher. After promotion, isValidSessionId accepts "STARTUP" and every subsequent command flows through the standard handlers.
Target.* and Runtime.runIfWaitingForDebugger are explicitly held out: Puppeteer sends Target.setAutoAttach and Runtime.runIfWaitingForDebugger with sessionId="STARTUP" between puppeteer.connect() and its own Target.createBrowserContext. Promoting on those would steal the bc out from under Puppeteer's createBrowserContext (which would then error with "Cannot have more than one browser context at a time"). Silently {}-acking them, as before, keeps Puppeteer's flow intact.
Once a bc exists with session_id != "STARTUP" (i.e. a real session_id was assigned by createTarget + doAttachtoTarget), further STARTUP-tagged commands are rejected with -32001 "Unknown sessionId" rather than silently no-oped, since they're now stale.
2. Synthetic IDs match the real first frame / context. The synthetic targetId / browserContextId in setAutoAttach were "TID-STARTUP" / "BID-STARTUP". They are now "FID-0000000001" / "BID-1", the same strings the first real Target / BrowserContext will be assigned (Session.nextFrameId returns 1 first; the bc id generator returns BID-1 first). After promotion, every event Lightpanda emits (Page.frameNavigated, Runtime.executionContextCreated, etc.) carries IDs that line up with what the driver already recorded from the synthetic event. Page.getFrameTree's synthetic placeholder was updated to match (LID-STARTUP → LID-0000000001). Without this, Playwright's first getFrameTree response carried a different frame.id than the targetId it had just learned from Target.attachedToTarget, and Playwright marked the original main frame as detached; page.goto then errored synchronously with "Frame has been detached. (after 1ms)".
3. doAttachtoTarget reuses the literal "STARTUP" session_id when the synthetic STARTUP session was already advertised. A new flag cdp.startup_session_advertised is set when setAutoAttach emits the synthetic event, and consumed by doAttachtoTarget the next time it would otherwise generate a fresh session_id. Without this, Puppeteer's flow ended up with two Target.attachedToTarget events for the same targetId (one with sessionId="STARTUP", then another with sessionId="SID-1" after createTarget ran). Puppeteer treats those as two separate sessions and tries to initialize a Page over each; the STARTUP one then fails every command with "Unknown sessionId" because bc.session_id had been overwritten to SID-1.
A new helper promoteStartupSession in domains/target.zig mirrors the bootstrap portion of createTarget but skips the Target.targetCreated / Target.attachedToTarget events (the driver already received Target.attachedToTarget in setAutoAttach and a duplicate would re-trigger Playwright's confusion).
Verification
* playwright-core 1.59.1 chromium.connectOverCDP + page.goto on https://www.allbirds.com/products/mens-wool-runners now returns status=200 and fires framenavigated / domcontentloaded / load events. Server stays alive across 10 sequential runs.
* puppeteer-core 24.42.0 puppeteer.connect() + createBrowserContext + newPage().goto() still works end-to-end and extracts the expected <title> and ~922 KB body.
* cdp: STARTUP sessionId test was rewritten to assert the new lazy-promote behavior, and two new tests cover the rejection paths (bc with non-STARTUP session_id, bc with no session_id at all).
* lightpanda serve process: 9 successful (status=200), 1 Playwright timeout (network flake on the 10th run), server stayed alive throughout.
Notes / out of scope
Playwright now navigates successfully, but page.title() returns "" and page.evaluate(() => document.title) errors with "Execution context was destroyed, most likely because of a navigation." That's Lightpanda's Page.createIsolatedWorld / Runtime.executionContextCreated flow not re-binding Playwright's utility-world context after navigation: a separate gap, not fixed here. The user-facing page.goto path works.
Independent of #2398. Either order of landing is fine.