Summary
Calling Fetch.enable (CDP) on a Lightpanda session that was started with --http-proxy deadlocks every subsequent navigation. Specifically, Page.navigate returns successfully but the document never finishes loading -- subresource Network.requestWillBeSent events stop firing, and the navigation hangs until the client-side timeout.
This is the CDP-layer root cause behind a behavior that puppeteer-core's page.setRequestInterception(true) exposes -- puppeteer enables the Fetch domain under the hood when interception is requested.
Reproduction
CDP-level (no puppeteer):
- Start lightpanda:
lightpanda serve --http-proxy http://127.0.0.1:8080 --port 9222 --host 127.0.0.1
(any HTTP CONNECT proxy on 127.0.0.1:8080 reproduces -- a local mitmproxy / squid is fine; it doesn't have to actually reach the upstream)
- Connect a CDP client, attach to a target with flatten: true.
- Send Fetch.enable with patterns: [{ urlPattern: "*" }] (or even Fetch.enable {} with no patterns).
- Send Page.navigate { url: "https://example.com" }.
Observed: Page.navigate returns a frame id; no Page.frameStoppedLoading, no Page.loadEventFired, no further Network.requestWillBeSent after the initial document request. The navigation hangs until the client-side timeout.
Without --http-proxy: Fetch.enable works correctly -- subresource interception fires, navigation completes normally.
Without Fetch.enable: --http-proxy works correctly -- pages load through the proxy in the expected time.
The deadlock requires both at once.
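The CDP-level steps can be scripted without puppeteer. The following is a minimal sketch, not a verified reproduction: it assumes Node 22+ (for the global WebSocket), that Lightpanda accepts Target.createTarget and Target.attachToTarget the way Chrome does, and that the endpoint is the ws://127.0.0.1:9222 address used in this report. The helper names (cdpMessage, reproSteps, run) are made up for illustration. It answers every Fetch.requestPaused with Fetch.continueRequest, so any stall it shows is not simply an unanswered pause.

```javascript
// Raw-CDP reproduction sketch. Assumes Node 22+ (global WebSocket) and
// Lightpanda serving on ws://127.0.0.1:9222 with --http-proxy set.
let nextId = 0;

// Build one CDP command; sessionId is set only for session-scoped calls.
function cdpMessage(method, params = {}, sessionId) {
  const msg = { id: ++nextId, method, params };
  if (sessionId) msg.sessionId = sessionId;
  return msg;
}

// The two session-level steps from the reproduction, in order.
function reproSteps(sessionId, url = 'https://example.com') {
  return [
    cdpMessage('Fetch.enable', { patterns: [{ urlPattern: '*' }] }, sessionId),
    cdpMessage('Page.navigate', { url }, sessionId),
  ];
}

async function run(endpoint = 'ws://127.0.0.1:9222') {
  const ws = new WebSocket(endpoint);
  await new Promise((resolve, reject) => {
    ws.onopen = resolve;
    ws.onerror = reject;
  });
  const pending = new Map();
  const send = (msg) =>
    new Promise((resolve) => {
      pending.set(msg.id, resolve);
      ws.send(JSON.stringify(msg));
    });
  ws.onmessage = (ev) => {
    const msg = JSON.parse(ev.data);
    if (msg.id !== undefined && pending.has(msg.id)) {
      pending.get(msg.id)(msg);
      pending.delete(msg.id);
      return;
    }
    console.log('event:', msg.method);
    // Continue every paused request so a stall is not just an unanswered pause.
    if (msg.method === 'Fetch.requestPaused') {
      send(cdpMessage('Fetch.continueRequest',
        { requestId: msg.params.requestId }, msg.sessionId));
    }
  };
  // Assumption: target creation and attach behave as they do in Chrome.
  const created = await send(cdpMessage('Target.createTarget', { url: 'about:blank' }));
  const attached = await send(cdpMessage('Target.attachToTarget',
    { targetId: created.result.targetId, flatten: true }));
  const sessionId = attached.result.sessionId;
  // Enable the domains whose events the report watches for.
  for (const method of ['Page.enable', 'Network.enable']) {
    await send(cdpMessage(method, {}, sessionId));
  }
  for (const step of reproSteps(sessionId)) {
    console.log(step.method, '->', JSON.stringify(await send(step)));
  }
}
```

Calling run() against a server started with --http-proxy should, if the deadlock reproduces, show the event log going quiet after the initial document request. Running it against a server started without the flag gives the comparison case.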
puppeteer-core surface
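// Same shape, easier to reproduce locally
import puppeteer from 'puppeteer-core';

const browser = await puppeteer.connect({ browserWSEndpoint: 'ws://127.0.0.1:9222' });
const page = await browser.newPage();
await page.setRequestInterception(true); // calls Fetch.enable internally
// Continue every request; with no handler, intercepted requests stay paused in any browser
page.on('request', (request) => request.continue());
await page.goto('https://example.com'); // hangs until timeout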
I measured this against https://www.allbirds.com/products/womens-wool-runners-dapple-grey through a residential-proxy chain: a navigation through Lightpanda without interception (with --http-proxy set) completes in ~4.4 s; the same page with setRequestInterception(true) enabled hits the 30 s navigation timeout and fails.
Why this matters
Fetch.enable is the standard way to filter outbound resource requests at the browser level (block image / font / media / tracking-domain hosts before they fetch). For automation use cases this is critical:
- Scraping at scale: blocking unnecessary subresources cuts page load time 2-5x and saves substantial proxy bandwidth.
- The --http-proxy flag is also critical -- routing through residential proxies / proxy classifiers is a common production setup.
When the two can't coexist, callers have to choose: skip the bandwidth filter (heavier pages, more third-party JS executes -- including widgets like Yotpo that inject hundreds of <style> elements per interaction), or skip the proxy (loses the routing model entirely). Neither is acceptable for production scrape paths.
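For reference, the kind of filter this blocks looks roughly like the sketch below. It is illustrative only: the blocked resource types follow the image / font / media examples above, BLOCKED_HOSTS is a hypothetical placeholder list, and shouldBlock / installFilter are made-up helper names rather than anything from our codebase or from puppeteer.

```javascript
// Resource-filter sketch for puppeteer-core request interception.
// BLOCKED_HOSTS is a hypothetical placeholder list, not from the report.
const BLOCKED_TYPES = new Set(['image', 'font', 'media']);
const BLOCKED_HOSTS = ['tracker.example'];

// Decide whether a request should be aborted before it leaves the browser.
function shouldBlock(resourceType, url) {
  if (BLOCKED_TYPES.has(resourceType)) return true;
  let host;
  try {
    host = new URL(url).hostname;
  } catch {
    return false; // unparseable URL: let it through
  }
  // Match the listed host itself and any of its subdomains.
  return BLOCKED_HOSTS.some((h) => host === h || host.endsWith('.' + h));
}

// Wire the filter into a puppeteer Page. Every request must be either
// aborted or continued, otherwise it stays paused.
async function installFilter(page) {
  await page.setRequestInterception(true);
  page.on('request', (req) => {
    if (shouldBlock(req.resourceType(), req.url())) req.abort();
    else req.continue();
  });
}
```

Calling installFilter(page) before page.goto(...) is the step that triggers Fetch.enable, and therefore the hang, when the session was started with --http-proxy.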
Suspected mechanism
I haven't traced this in the Lightpanda source, but the symptom (subresource events stop firing, document never completes) suggests something in the CDP Fetch.enable path either:
- Doesn't propagate the upstream proxy configuration to the intercepted-and-resumed request, so resumed requests stall at connect time, or
- Holds a lock / pending-completion handle that the proxy round-trip doesn't release, or
- Expects the Fetch.requestPaused / Fetch.continueRequest round-trip to complete on the direct request path and breaks when the request is going through the proxy.
Happy to instrument and bisect if a maintainer can point me at the right entry point in src/cdp/domains/fetch* (or wherever the Fetch domain lives).
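Until then, a client-side trace can at least narrow down where a request stalls. The sketch below is a hypothetical helper, not a diagnosis: it assumes the caller has recorded sent commands and received events into a flat list and normalized them to a shared key per request (the entry shape and the stage names are invented for illustration). It cannot tell the hypotheses above apart on its own; it only shows whether a request stalls before it is paused, after it is continued, or after a response arrives.

```javascript
// Trace entries are assumed to look like:
//   { dir: 'sent' | 'recv', method: 'Fetch.requestPaused', key: 'req-1' }
// where key is a caller-chosen identifier shared by all entries of one request.
function stallStage(trace, key) {
  const has = (dir, method) =>
    trace.some((e) => e.dir === dir && e.method === method && e.key === key);

  if (!has('recv', 'Fetch.requestPaused')) return 'never-paused';
  if (!has('sent', 'Fetch.continueRequest')) return 'paused-not-continued';
  if (!has('recv', 'Network.responseReceived')) return 'continued-no-response';
  if (!has('recv', 'Network.loadingFinished')) return 'response-not-finished';
  return 'completed';
}

// Example trace for one request that was continued but never got a response.
const exampleTrace = [
  { dir: 'recv', method: 'Fetch.requestPaused', key: 'req-1' },
  { dir: 'sent', method: 'Fetch.continueRequest', key: 'req-1' },
];
console.log(stallStage(exampleTrace, 'req-1'));
```

A result of 'continued-no-response' would be consistent with the resumed request stalling somewhere on the proxy path, while 'never-paused' for subresources would point at something earlier than the continue step.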
Environment
- Lightpanda: 1.0.0-nightly.6240+37391687
- OS: macOS 14, arm64
- Test driver: puppeteer-core via CDP at ws://127.0.0.1:9222
Related
- Affects scraper setups that need both per-request filtering and proxy routing.
- Documented as a known constraint in our codebase (we skip resource filtering on Lightpanda only, with a multi-paragraph comment pointing at this exact deadlock) -- happy to share the workaround code if it's useful as a reference for which API surface client code relies on.