Skip to content

[kernel-1116] Add CDP Monitor#213

Merged
archandatta merged 63 commits intomainfrom
archand/kernel-1116/cdp-foundation
May 5, 2026
Merged

[kernel-1116] Add CDP Monitor#213
archandatta merged 63 commits intomainfrom
archand/kernel-1116/cdp-foundation

Conversation

@archandatta
Copy link
Copy Markdown
Contributor

@archandatta archandatta commented Apr 13, 2026

Introduces the foundational layer of the CDP monitor as a standalone reviewablechunk. No Monitor struct wiring, just the primitives that everything else builds on.

  • types.go: CDP wire format (cdpMessage), all event type constants, internal state structs (networkReqState, targetInfo, CDP param shapes).

  • util.go: Console arg extraction, MIME allow-list (isCapturedMIME), resource type filter (isTextualResource), per-MIME body size caps (bodyCapFor), UTF-8-safe body truncation (truncateBody).

  • computed.go: State machine for the three derived events: network_idle (500ms debounce after all requests finish), layout_settled (1s after page_load with no layout shifts), navigation_settled (fires once all three flags converge). Timer invalidation via navSeq prevents stale AfterFunc callbacks from publishing for a previous navigation.

  • domains.go: isPageLikeTarget predicate (pages and iframes get Page.* / PerformanceTimeline.*; workers don't), bindingName constant, interaction.js embed.

  • interaction.js: Injected script tracking clicks, keydowns, and scroll-settled events via the __kernelEvent CDP binding.


Note

High Risk
Large new CDP capture subsystem that streams network/console data (including headers/bodies) and manages reconnect/timers across goroutines, which is complex and can impact stability, performance, and sensitive-data exposure if misconfigured.

Overview
Adds a full cdpmonitor implementation that connects to Chrome’s DevTools WebSocket, auto-attaches to targets, translates CDP notifications into typed events.Events (console, network, page, interaction), and publishes monitor lifecycle events (disconnect/reconnect/init-failed) with capped-exponential reconnect backoff.

Introduces per-session computed state machines (network_idle, page_layout_settled, page_navigation_settled), interaction tracking via embedded interaction.js with rate-limiting and sensitive-input suppression, and screenshot capture via ffmpeg with size downscaling and 2s rate limiting.

Updates API wiring to pass a slog logger into cdpmonitor.New, abstracts ApiService.cdpMonitor behind an interface for easier stubbing in tests, and adds extensive fixtures/tests to validate CDP wire-type roundtrips and monitor behavior (events, redirects, detach handling, TTL sweeps, and reconnects).

Reviewed by Cursor Bugbot for commit 7cc19d8. Bugbot is set up for automated code reviews on this repo. Configure here.

This binary is tracked on main and was incidentally deleted earlier on
this branch. Restoring it keeps the 13.4MB binary out of this PR's diff.
Removing the tracked binary from main should be done in a separate PR.
@archandatta archandatta requested a review from rgarcia April 23, 2026 19:35
@rgarcia
Copy link
Copy Markdown
Contributor

rgarcia commented Apr 27, 2026

I did a manual pass against this branch with the review harness, doing a single navigation to kernel.sh. Event log from that run: https://gist.github.com/rgarcia/cc2c9bb92b536d7f6659985868a3c56b

A few observations/suggestions from that run:

The event stream needs richer CDP identity metadata before this is reviewable as a browser logging foundation. The monitor already receives enough data to tie network events to a target/frame/navigation, but the published event drops most of it. Network.requestWillBeSent includes requestId, loaderId, documentURL, and frameId, and the monitor also has sessionID -> targetInfo with targetId/targetType. Today we only publish cdp_session_id, method/url/headers/etc. This makes redirects look like duplicate requests and makes it hard to group requests by tab/frame/navigation.

Could we include at least:

  • request_id
  • loader_id
  • frame_id
  • document_url
  • target_id
  • target_type
  • redirect metadata, or a clear marker when requestWillBeSent represents a redirect continuation

Relatedly, the computed events (network_idle, layout_settled, navigation_settled) currently publish no session/frame/target metadata, and the state machine appears global to the monitor. That makes these events impossible to attribute to a tab from the event stream, and multiple tabs/navigations could interfere with each other. I’d expect computed state to be scoped per page target/session, or at least to carry the current top-level navigation context initialized from Page.frameNavigated (session_id, target_id, frame_id, loader_id, URL, nav sequence) and publish that context on all synthetic events.

Without this, consumers can’t reliably answer “which tab/page/navigation produced this event?”, which seems core to the usefulness of the CDP monitor.

One note from the log: the first three document network_request events are a redirect chain (http://kernel.sh/ -> https://kernel.sh/ -> https://www.kernel.sh/), not exact duplicates. Including request_id and redirect metadata would make that much clearer in the event stream.

@archandatta
Copy link
Copy Markdown
Contributor Author

@rgarcia feedback addressed! the fields like layout_settled and navigation_settled are connected with loader id:

layout_settled.data.loader_id
      == network_request.data.loader_id (all requests from that navigation)
      == network_response.data.loader_id

Comment thread server/lib/cdpmonitor/handlers.go
@kernel kernel deleted a comment from cursor Bot Apr 29, 2026
@archandatta archandatta force-pushed the archand/kernel-1116/cdp-foundation branch from 16d37eb to 83a164b Compare April 29, 2026 14:20
for _, ev := range evs {
s.publish(ev)
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing dead check in onDOMContentLoaded

Low Severity

onDOMContentLoaded unconditionally sets s.navDOMLoaded = true without checking s.dead first. Every other state-mutating method (onRequest, onLoadingFinished, onPageLoad, onLayoutShift) guards with if s.dead { return } before modifying state. While pendingNavigationSettled independently checks dead preventing event emission, this inconsistency mutates a stopped state machine, which could mask bugs if future code reads navDOMLoaded without also checking dead.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 83a164b. Configure here.

@rgarcia
Copy link
Copy Markdown
Contributor

rgarcia commented Apr 29, 2026

I ran another manual pass with the review harness, this time navigating to Hacker News, then opening a new tab and navigating to kernel.sh. Event data: https://gist.github.com/rgarcia/0a60c17ece22e9394bdd9118475a0f5c

Observations:

  1. there does not seem to be a consistent "raw CDP passthrough" envelope inside event.data.
    data.event shows up for layout_shift because PerformanceTimeline.timelineEventAdded CDP params are shaped as { event: ... }, not because Kernel has a convention that raw CDP lives under event. Other events project/rename fields directly into data, e.g. network_request.data.request_id, loader_id, frame_id, etc.
    That makes casing hard to reason about and looks sloppy / inconsistent. Right now consumers see a mix of:

    • raw CDP camelCase fields like frameId, nodeId, layoutShiftDetails
    • Kernel-projected snake_case fields like frame_id, resource_type, request_id
    • raw-ish CDP timestamps in some page events
    • synthetic Kernel fields in computed events

    I think we should probably choose one clear convention of always projecting event data into a Kernel-owned snake_case schema to keep the event data clean and consistent

  2. Relatedly, I think server/lib/cdpmonitor/README.md should grow a consumer-facing schema section. Right now it has a useful taxonomy/internals overview, but not enough for someone consuming the stream to understand fields like request_id, loader_id, frame_id, target_id, cdp_session_id, initiator_type, or the timestamp fields. A short CDP data model primer would help a lot

  3. On timestamps: top-level event.ts is Unix microseconds, but dom_content_loaded.data.timestamp and page_load.data.timestamp are CDP monotonic timestamps in seconds. That is useful CDP data, but it is very easy to confuse with the event timestamp. Maybe rename/project it as something like cdp_timestamp?

  4. dom_content_loaded and page_load still only have the CDP timestamp in data. They do get source metadata for the CDP session/target, but unlike navigation, network_idle, layout_settled, and navigation_settled, they do not carry loader_id, frame_id, URL, or nav_seq in the event payload. Since the monitor now tracks nav context, can we stamp that same context onto these page lifecycle events too?

  5. screenshot still emits as local_process with only the PNG payload. Since screenshots are triggered by Page.loadEventFired / exceptions, it would be very useful for them to include context like trigger_event, session_id, target_id, frame_id, loader_id, URL, and nav_seq. Otherwise screenshots are hard to correlate with the page/navigation that caused them.

  6. No event was fired when I opened a new tab. Not sure what CDP exposes here but I think we should produce something here. Subsequent events were fired once I navigated, which is good.

Overall the new loader/frame/request/target additions are a big improvement. I think the remaining work is mostly making the schema consistent and documented enough for downstream consumers.

@rgarcia
Copy link
Copy Markdown
Contributor

rgarcia commented Apr 29, 2026

Couple of other things:

  1. These should likely carry current navigation context (session_id, target_id, target_type, frame_id, loader_id, URL, nav_seq) in data, not just metadata:
    • interaction_click
    • interaction_key
    • scroll_settled
    • console_log
    • console_error
    • dom_content_loaded
    • page_load
    • layout_shift
    • screenshot
    Partially addressed:
    • network_request, network_response, network_loading_failed: now mostly good.
    • network_idle, layout_settled, navigation_settled: now good, they carry nav context.
    • navigation: has frame_id, loader_id, URL, but not target_id / target_type in data; those are only in metadata.

  2. Event naming is inconsistent. Current event types:
    • console_log, console_error
    • network_request, network_response, network_loading_failed
    • navigation, dom_content_loaded, page_load, layout_shift
    • network_idle, layout_settled, navigation_settled
    • interaction_click, interaction_key, scroll_settled
    • screenshot, monitor_*
    The category already says network, page, interaction, etc., so prefixes like network_ and interaction_ are somewhat redundant. But if event type is intended to be globally unique without category, then prefixes make sense. The issue is inconsistency:
    • interaction_click / interaction_key have interaction_, but scroll_settled does not.
    • network_idle has network_, but layout_settled and navigation_settled do not have page_.
    • page_load has page_, but navigation, layout_shift, layout_settled, navigation_settled do not.
    I think we should choose a convention here. Since these are logs/events that people will grep/query outside typed clients, I’d lean toward globally namespaced names, e.g. network_request, page_navigation, interaction_click, interaction_scroll_settled

Comment thread server/lib/cdpmonitor/handlers.go
Comment thread server/lib/cdpmonitor/handlers.go Outdated
@archandatta archandatta force-pushed the archand/kernel-1116/cdp-foundation branch from d64437e to 916f275 Compare May 1, 2026 15:49
@rgarcia
Copy link
Copy Markdown
Contributor

rgarcia commented May 4, 2026

new test run where I navigated to hacker news, then opened a new tab and did some kernel.sh things, then went back to the HN tab and clicked into an openai blog post: https://gist.github.com/rgarcia/a0c7670d3c3850a7fdab7daeb6460611

The final navigation took five mins to produce page_navigation_settled... I think page_navigation_settled should probably not depend on network_idle. In this case, one fetch to https://ab.chatgpt.com/v1/initialize?... never produced loadingFinished/loadingFailed, so network_idle did not fire until the 5-minute pending-request TTL swept it. That made page_navigation_settled arrive five minutes after the page was visibly usable.

Maybe we define page_navigation_settled around page lifecycle + visual stability, e.g. page_load plus page_layout_settled, and keep network_idle as a separate event.

The other thing I noticed is that SPA/client-side route changes did not trigger events (on kernel.sh i clicked into the pricing page to simulate this). I think it's fine to treat inferring these as a follow-up because correctly distinguishing real route changes from prefetch/background fetches is harder.

@archandatta
Copy link
Copy Markdown
Contributor Author

Here is the latest run -> https://gist.github.com/archandatta/b7a841c0837afa27780177680228c878. Updated to not wait network_idle. Noted the SPA/client-side route changes issue will look into for the next pass!

Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 5 total unresolved issues (including 4 from previous reviews).

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit bbc3c91. Configure here.

Comment thread server/lib/cdpmonitor/cdp_proto.go
@archandatta archandatta merged commit 3b4ec09 into main May 5, 2026
12 of 13 checks passed
@archandatta archandatta deleted the archand/kernel-1116/cdp-foundation branch May 5, 2026 17:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants