Add Lightpanda as alternative to Chrome for the browser automation tool #1028

Open
krichprollsch wants to merge 8 commits into nextlevelbuilder:dev from krichprollsch:lightpanda

Conversation

@krichprollsch

@krichprollsch krichprollsch commented Apr 24, 2026

Summary

Adds Lightpanda as an opt-in alternative to Chrome for the browser automation tool. Lightpanda is a low-memory CDP-compatible headless browser, but its connection model differs from Chrome — it requires one CDP connection per tab (each connection is its own browser server-side) instead of multiplexing tabs/contexts over a single WS.

  • pkg/browser/ — new Backend selector (chrome | lightpanda) with /json/version auto-detection. Lightpanda path mints a fresh CDP connection per tab (Target.createBrowserContext + Target.createTarget per conn). Tab tracking, shutdown, and the idle-page reaper all branch on backend via a shared closeManagedPageLocked helper. Tenant isolation on Lightpanda is implicit (every conn = fresh browser).
  • Page.captureScreenshot guard — Lightpanda returns a placeholder image, so the screenshot tool action returns a clean error directing the agent to the snapshot (AX tree) action.
  • Config — new BrowserToolConfig.Backend field + GOCLAW_BROWSER_BACKEND env var. Empty value triggers auto-detection. Chrome remains the default.
  • docker-compose.lightpanda.yml: sidecar overlay running lightpanda/browser:latest; wires GOCLAW_BROWSER_REMOTE_URL and GOCLAW_BROWSER_BACKEND.
  • docs/browser-backends.md — compatibility matrix vs Chrome (screenshot, multi-tab, cookie sharing, auto-reconnect) + minimum-Lightpanda-version note.
  • Integration tests (//go:build integration, skipped unless LIGHTPANDA_CDP_URL is set) — golden path, AX-snapshot + Eval, multi-tenant isolation, backend reporting, screenshot guard.
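The backend selection described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: resolveBackend and versionInfo are hypothetical names, and the assumption that Lightpanda identifies itself in the Browser field of /json/version is a guess made for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Backend mirrors the selector values named in the summary.
type Backend string

const (
	BackendChrome     Backend = "chrome"
	BackendLightpanda Backend = "lightpanda"
)

// versionInfo holds the single /json/version field this sketch inspects.
type versionInfo struct {
	Browser string `json:"Browser"`
}

// resolveBackend (hypothetical) honors an explicit setting and falls back
// to inspecting the /json/version payload when the setting is empty.
func resolveBackend(configured string, versionJSON []byte) (Backend, error) {
	switch strings.ToLower(strings.TrimSpace(configured)) {
	case "chrome":
		return BackendChrome, nil
	case "lightpanda":
		return BackendLightpanda, nil
	case "":
		// Empty value: continue to auto-detection below.
	default:
		return "", fmt.Errorf("unknown browser backend %q", configured)
	}

	var v versionInfo
	if err := json.Unmarshal(versionJSON, &v); err != nil {
		return "", fmt.Errorf("decode /json/version: %w", err)
	}
	// Assumption: Lightpanda names itself in the Browser field.
	if strings.Contains(strings.ToLower(v.Browser), "lightpanda") {
		return BackendLightpanda, nil
	}
	// Chrome remains the default.
	return BackendChrome, nil
}

func main() {
	b, err := resolveBackend("", []byte(`{"Browser":"Chrome/124.0.0.0"}`))
	fmt.Println(b, err)
}
```

Keeping the decision in a pure function like this makes the explicit-override and auto-detect paths easy to unit test without a live browser.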

Closes #223.

Lightpanda quirks worked around

Live testing surfaced three behaviors that needed code-side handling:

| Quirk | Workaround in this PR |
| --- | --- |
| Lightpanda numbers targets per-browser; every conn's first target is FID-0000000001, so multi-tab maps collide | Synthesize globally-unique lp-N keys (via Manager.nextLpTabSeq); upstream targetID is only used inside openTabLightpandaLocked |
| page.Info() returns valid data once post-open, then errors on subsequent calls, so ListTabs was silently dropping tabs | Cache URL + Title in pageInfos at OpenTab; ListTabs reads from cache |
| rod.Browser.Close() calls Browser.close, which Lightpanda doesn't implement (UnknownMethod -31998); WS drops cleanly anyway | Swallow the error to silence the noisy log line |
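The first two workarounds can be illustrated with a stripped-down sketch. The Manager below is a hypothetical stand-in, not the real type in pkg/browser/; only the names nextLpTabSeq and pageInfos come from the PR, and registerTab and tabInfo are invented for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// tabInfo is the URL/title pair cached when a tab is opened.
type tabInfo struct {
	URL   string
	Title string
}

// Manager is a hypothetical, minimal stand-in for the real manager.
type Manager struct {
	mu           sync.Mutex
	nextLpTabSeq uint64
	pageInfos    map[string]tabInfo
}

func NewManager() *Manager {
	return &Manager{pageInfos: make(map[string]tabInfo)}
}

// registerTab ignores the upstream target ID for keying purposes, since
// every connection reports the same one, and synthesizes a unique lp-N key.
func (m *Manager) registerTab(upstreamTargetID string, info tabInfo) string {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.nextLpTabSeq++
	key := fmt.Sprintf("lp-%d", m.nextLpTabSeq)
	m.pageInfos[key] = info // cached once, at open time
	return key
}

// ListTabs returns a copy of the cache and never queries the page again.
func (m *Manager) ListTabs() map[string]tabInfo {
	m.mu.Lock()
	defer m.mu.Unlock()
	out := make(map[string]tabInfo, len(m.pageInfos))
	for k, v := range m.pageInfos {
		out[k] = v
	}
	return out
}

func main() {
	m := NewManager()
	a := m.registerTab("FID-0000000001", tabInfo{URL: "https://example.com", Title: "Example Domain"})
	b := m.registerTab("FID-0000000001", tabInfo{URL: "https://example.org"})
	fmt.Println(a, b, len(m.ListTabs()))
}
```

Two tabs with the same upstream target ID end up under distinct keys, and listing reads only cached data, so a later failing page.Info() call cannot drop a tab from the list.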

Minimum Lightpanda version

The AX-tree snapshot path requires Lightpanda with lightpanda-io/browser#2232 merged (fixes Accessibility.getFullAXTree to return nodeId as a string per CDP spec, instead of a JSON number). Earlier Lightpanda images cause Snapshot to fail with a typed-decode error.

Verified working with lightpanda/browser:latest (digest sha256:00aa5c68...).

Test plan

  • go build ./... and go build -tags sqliteonly ./... pass.
  • go vet ./pkg/browser/... clean.
  • Existing browser unit tests pass (go test ./pkg/browser/...).
  • docker compose -f docker-compose.yml -f docker-compose.lightpanda.yml config validates.
  • Integration suite passes against a live Lightpanda sidecar:
    docker run -d --rm --name lp -p 9222:9222 lightpanda/browser:latest \
      lightpanda serve --host 0.0.0.0 --port 9222
    LIGHTPANDA_CDP_URL=ws://localhost:9222 \
      go test -tags integration -count=1 -run Lightpanda ./tests/integration/
    All five tests pass: SingleTenant_Golden, Snapshot_AndEval (AX tree returns refs + interactive nodes; Eval("() => document.title") returns the page title), MultiTenant_Isolation, Backend_ReportedCorrectly, Screenshot_BlockedByToolGuard.
  • Manual: bring up the full stack with the Lightpanda overlay, verify the agent can open → navigate → snapshot → act → evaluate → close end-to-end, and that screenshot surfaces the guard error.
  • Existing Chrome flow still works with docker-compose.browser.yml (Backend auto-detects to chrome from /json/version); covered by the existing browser unit tests.

Notes for review

  • Uses the floating lightpanda/browser:latest tag for parity with the Chrome overlay; consider pinning to a fixed version once Lightpanda publishes stable tags.
  • The compose file invokes the lightpanda binary explicitly (command: ["lightpanda", "serve", ...]) because the image's default entrypoint isn't lightpanda.
  • The compose file's healthcheck uses /dev/tcp via sh. If Lightpanda's image is scratch-based without a shell, drop the healthcheck and switch depends_on to condition: service_started.
  • Eval requires a function form (e.g. () => document.title) — same as Chrome via go-rod's Page.Eval. Bare expressions fail on both backends because go-rod's wrapper does (USER_JS).apply(this, arguments).
  • No upstream tab listing on Lightpanda — ListTabs reads the manager's local map. Anything that previously relied on m.browser.Pages() to discover tabs opened out-of-band won't find them on Lightpanda (acceptable: only goclaw should be opening tabs in its sidecar).
  • No auto-reconnect on Lightpanda. A dead WS = that tab is gone server-side; the user/agent gets a clear "tab not found" / "tab closed" and reopens.
  • MaxPages (default 5) on Lightpanda now bounds CDP connections per tenant, not just open tabs. Default is fine; bump if you have heavy parallel-tab agents on Lightpanda.

Introduces an explicit backend selector for the browser automation
tool ("chrome" or "lightpanda"). Empty means auto-detect, which will
be wired in a later commit. Chrome remains the default behavior.

Lightpanda requires one CDP connection per tab (vs Chrome's shared-WS
model). Add a Backend selector in pkg/browser/, branched lifecycle, and
per-tab connection tracking via pageConns.

- Backend type (chrome | lightpanda), WithBackend option, Backend()
  accessor. Auto-detected from /json/version when unset.
- Lightpanda OpenTab mints a fresh conn, calls
  Target.createBrowserContext (required per-conn) + createTarget.
- ListTabs / getPage are local-map only on Lightpanda (no /json/list,
  no auto-reconnect).
- Shared closeManagedPageLocked helper used by CloseTab, evict, and
  the idle-page reaper.
- Screenshot tool action errors out on Lightpanda (returns a
  placeholder image upstream; route the agent to snapshot instead).

Pass cfg.Tools.Browser.Backend to browser.WithBackend when set, so the
"chrome" or "lightpanda" selector from config / GOCLAW_BROWSER_BACKEND
reaches the manager. Empty value still triggers /json/version
auto-detection in browser.Start().

Build-tag gated (//go:build integration), skipped unless
LIGHTPANDA_CDP_URL env is set. Covers:

- Golden path: open → snapshot → list → close
- Multi-tenant isolation: each tenant sees only its own tabs;
  cross-tenant CloseTab is rejected
- Backend() reports lightpanda after Start
- Screenshot tool action returns an IsError result mentioning
  lightpanda + snapshot

- docker-compose.lightpanda.yml: opt-in overlay running
  lightpanda/browser:latest, wires GOCLAW_BROWSER_REMOTE_URL and
  GOCLAW_BROWSER_BACKEND so the manager picks the right code path.
- docs/browser-backends.md: compatibility matrix vs Chrome (screenshot,
  multi-tab, cookie sharing, etc.) and guidance on when to pick which.

Live testing surfaced three issues:

- Lightpanda numbers targets per-browser, and each conn is its own
  browser, so every tab gets the same upstream targetID
  ("FID-0000000001"). Synthesize globally-unique "lp-N" keys for our
  internal map so multi-tenant tab tracking works.
- page.Info() returns valid data once post-open then errors on
  subsequent calls, which made ListTabs silently drop tabs. Cache URL
  and Title at OpenTab time and read from the cache in ListTabs.
- rod.Browser.Close() calls Browser.close which Lightpanda doesn't
  implement; the WS drops cleanly anyway. Swallow the error to quiet
  the noisy log line.

Two Lightpanda upstream bugs are documented in docs/browser-backends.md
and exercised by TestLightpanda_KnownUpstreamGaps:
  1. Accessibility.getFullAXTree returns nodeId as a JSON number
     (CDP spec: string)
  2. Runtime.evaluate rejects go-rod's function-apply wrapper
…mmand

- Eval works on Lightpanda when called with go-rod's expected function
  form (e.g. "() => document.title"); only bare expressions fail, and
  they fail on Chrome too. Fix the integration gap test and docs that
  incorrectly attributed this to a Lightpanda bug.
- docker-compose.lightpanda.yml: add the missing
  command: lightpanda serve ... — the image's entrypoint isn't
  lightpanda, so without an explicit command the sidecar wouldn't
  start.
…coverage

Lightpanda upstream merged the AX-tree nodeId fix in
lightpanda-io/browser#2232. The TestLightpanda_KnownUpstreamGaps canary
fired on the latest image, so:

- Rename to TestLightpanda_Snapshot_AndEval and assert the snapshot
  returns refs + non-empty text (instead of asserting it fails).
- Flip AX-snapshot in the compatibility matrix from gap to fully
  supported. Replace the "Known upstream gap" section with a
  "Minimum Lightpanda version" note pointing at the upstream PR.
@krichprollsch
Author

Re-verified end-to-end with the upstream Lightpanda fix landed.

lightpanda-io/browser#2232 merged earlier today, which fixes Accessibility.getFullAXTree to return nodeId as a string per CDP spec (it was emitting a JSON number, which made go-rod's typed decoder choke). With that in place, the AX-snapshot path — the agent's primary "see the page" mechanism — now works on Lightpanda.

Updated the integration suite accordingly:

  • Renamed TestLightpanda_KnownUpstreamGaps → TestLightpanda_Snapshot_AndEval and flipped it from "assert this fails" to "assert this returns refs + populated text".
  • Compatibility matrix in docs/browser-backends.md: AX snapshot row is now ✅ on both backends. The former "Known upstream gap" section is replaced with a "Minimum Lightpanda version" note pointing at #2232.

Re-ran the suite against lightpanda/browser:latest (digest sha256:00aa5c68...):

PASS  TestLightpanda_SingleTenant_Golden
PASS  TestLightpanda_Snapshot_AndEval        — 2 refs, 1 interactive, 153 chars
PASS  TestLightpanda_MultiTenant_Isolation
PASS  TestLightpanda_Backend_ReportedCorrectly
PASS  TestLightpanda_Screenshot_BlockedByToolGuard

Sample snapshot output for https://example.com:

- rootwebarea "Example Domain"
        - heading "Example Domain" [ref=e1]
        - paragraph
        - paragraph
          - link "Learn more" [ref=e2]

This unblocks real agent-driven scraping on Lightpanda — open → snapshot → act → evaluate all work. Screenshot remains gated (Lightpanda returns a placeholder image; the tool layer routes the agent to snapshot instead, which is the better channel for the LLM anyway).

Ready for review.



Development

Successfully merging this pull request may close these issues.

[Improvement] Lightpanda as brower
