"Record, replay, and understand what's behind the pixels"
A Chrome extension + MCP server that gives AI coding assistants (Claude Code, Cursor, etc.) timestamped visual-API correlation — the ability to understand which API calls feed which UI elements, record browser sessions (human or AI-driven), and replay them with a synchronized API timeline.
Inspired by Undertale. Use pixel-art style branding/logo (think: a small pixel-art detective peeking under a lifted pixel tile, revealing network data flowing underneath, 8-bit color palette).
- Origin Story & Motivation
- What Makes UnderPixel Different
- Competitive Landscape
- Architecture
- Key Dependencies
- Feature Scope
- What Gets Captured: Network vs Screenshots
- API Dependency Graph Algorithm
- MCP Tool Surface
- Extension UI
- Installation UX
- Scalability Plan
- Build Phases
- GitHub Repo Setup
- Credits & Licensing
- Design Decisions Log
The idea started from a personal use case: wanting Claude Code to open a company OKR system website, fetch OKRs, and save them to a doc platform with improvement ideas appended.
The initial approach was to use a browser automation tool (dev-browser), but that required teammates to install extra tooling. Since Chrome extensions can call APIs with cookies/headers auto-attached, the idea shifted to: build a lightweight Chrome extension that captures API details and sends them to Claude Code for processing.
This evolved into a broader vision: not just capture network calls, but correlate them with what the user sees on screen — timestamped visual-API correlation that no existing tool provides.
The core differentiator is timestamped visual-API correlation. Existing tools treat network capture and visual capture as separate, unlinked streams. UnderPixel bundles them:
Snapshot Bundle @ T=1712345678000:
- screenshot: PNG
- dom_state: rrweb incremental snapshot
- api_calls: [
{ url, method, status, requestHeaders, requestBody,
responseHeaders, responseBody, startTime, endTime },
...
]
- trigger: "fetch response: GET /api/okrs"
- correlation: "DOM element #okr-table updated with data from GET /api/okrs"
Secondary differentiators:
- Works in your real browser: no special Chrome flags, no separate profiles. Cookies and auth just work. (chrome-devtools-mcp requires --remote-debugging-port with a separate profile, or Chrome 144+ with --autoConnect.)
- Records AI agent actions: when Claude Code navigates/clicks/fills via MCP, UnderPixel silently records everything. Users can replay what the AI did, with full API details. This is an audit/observability angle nobody else offers.
- Focused tool surface — ~12 MCP tools instead of 27 (mcp-chrome). Opinionated, not a Swiss army knife.
- Session replay with API timeline — rrweb-player with synchronized API call panel. Visual product, not just a CLI pipe.
| Tool | Stars | Network Bodies | Screenshots | Works in Real Browser | Visual-API Correlation | Status |
|---|---|---|---|---|---|---|
| Claude in Chrome (Anthropic official) | N/A | No | Yes | Yes | No | Active (beta) |
| ChromeDevTools/chrome-devtools-mcp (Google) | ~32.8k | Yes | Yes | No (needs flags/separate profile) | No | Very Active |
| hangwin/mcp-chrome | ~11.1k | Yes (Debugger mode) | Yes | Yes | No | Active |
| AgentDeskAI/browser-tools-mcp | ~7.2k | Partial | Yes | Yes | No | Abandoned |
| Saik0s/mcp-browser-use | ~917 | Yes (auto-identifies key calls) | Partial | No | Partial (skills concept) | Active |
| benjaminr/chrome-devtools-mcp | ~293 | Yes (filterable) | No | No | No | Active |
| Eddym06/chrome-devTools-advanced-mcp | ~4 | Best (HAR, replay, WebSocket) | Yes | No | No | Active |
| nicobailon/surf-cli | ~373 | Yes + replay | Yes (annotated) | Yes | No | Active (not MCP) |
| UnderPixel (this project) | — | Yes | Yes | Yes | Yes | Building |
| Capability | mcp-chrome | UnderPixel |
|---|---|---|
| Network capture with response bodies | Yes (Debugger mode) | Yes (chrome.debugger, referencing mcp-chrome patterns) |
| Screenshots | Yes | Yes (captureVisibleTab, referencing mcp-chrome patterns) |
| Network-to-DOM correlation | No | Yes |
| DOM mutation tracking | No | Yes (rrweb) |
| Visual change detection | No | Yes (2-layer system: rrweb + pixelmatch) |
| Timeline/timestamp correlation | No | Yes |
| Session replay | No | Yes (rrweb-player) |
| API dependency graph | No | Yes |
| AI action audit trail | No | Yes |
| Session export/share | No | Yes (.underpixel files) |
| Request cap | 100 hard limit | Configurable, IndexedDB-backed |
| Tool count | 27 | ~12 (focused) |
- mcp-chrome is a browser automation Swiss army knife. UnderPixel is a focused understanding tool.
- mcp-chrome's network and visual captures are completely separate silos with no correlation.
- mcp-chrome has no DOM recording, no replay, no visual change detection, no dependency graphing.
- UnderPixel builds on mcp-chrome's infrastructure (MIT licensed) but adds the correlation layer as a first-class feature.
+----------------------------------------------------------+
| Chrome Extension |
| |
| Content Script |
| +- rrweb.record() -> DOM events stream |
| | (also serves as DOM change signal — smart mutation |
| | batching built in, no separate MutationObserver) |
| +- PerformanceObserver -> layout-shift signals |
| |
| Background Service Worker |
| +- chrome.debugger API -> network capture |
| | (request/response headers + bodies) |
| +- Correlation Engine -> match by timestamp |
| | "API response T=1200 -> DOM mutations T=1250" |
| +- Screenshot Gate |
| | rrweb events + layout-shift -> pixelmatch |
| +- Native Messaging client -> sends to bridge |
| +- Data Storage (IndexedDB) -> sessions, snapshots |
| |
| Popup |
| +- Toggle capture on/off, filter settings |
| |
| Offscreen Document |
| +- Canvas image processing (hash, diff) |
| |
| Extension Page (replay.html, opened as chrome tab) |
| +- rrweb-player (left pane) |
| +- API timeline (right pane, synced by timestamp) |
| +- API dependency graph view |
| |
+----------------------------+------------------------------+
|
Native Messaging
|
+----------------------------+------------------------------+
| Bridge (underpixel-bridge, npm package) |
| +- stdio <-> Native Messaging translator |
| +- Auto-registers as Chrome Native Messaging host |
| +- ~100-200 lines, intentionally dumb pipe |
+----------------------------+------------------------------+
|
stdio (MCP JSON-RPC)
|
+----------------------------+------------------------------+
| Claude Code / Any MCP Client |
| Calls MCP tools, does analysis |
+----------------------------------------------------------+
Key architectural decisions:
- All logic lives in the Chrome extension. The bridge is a dumb pipe — it proxies MCP tool calls to the extension via Native Messaging and returns results.
- Updating the extension (via Web Store auto-update) updates the logic; the npm bridge package rarely needs updating.
- The extension holds all state (IndexedDB + chrome.storage.local) — no syncing issues.
- Per-session MCP transports: each MCP client gets its own StreamableHTTPServerTransport + McpServer instance, matching the official SDK pattern.
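The "dumb pipe" role of the bridge mostly comes down to message framing. Chrome's Native Messaging protocol sends each message as a 32-bit length (native byte order) followed by UTF-8 JSON. The sketch below shows that framing under stated assumptions: the function names are hypothetical, and little-endian byte order is assumed, which holds on x86-64 and ARM64 hosts.

```typescript
// Sketch only: helper names are illustrative, not UnderPixel's actual bridge code.
// Assumes a little-endian host; Native Messaging uses the platform's native byte order.

/** Encode a JSON-serializable message as [4-byte length][UTF-8 JSON]. */
function encodeNativeMessage(message: unknown): Uint8Array {
  const payload = new TextEncoder().encode(JSON.stringify(message));
  const framed = new Uint8Array(4 + payload.length);
  new DataView(framed.buffer).setUint32(0, payload.length, true); // true = little-endian
  framed.set(payload, 4);
  return framed;
}

/**
 * Decode every complete frame in the buffer.
 * Returns the parsed messages plus the trailing bytes of any incomplete frame,
 * so the caller can prepend them to the next chunk read from stdin.
 */
function decodeNativeMessages(buffer: Uint8Array): { messages: unknown[]; rest: Uint8Array } {
  const messages: unknown[] = [];
  const view = new DataView(buffer.buffer, buffer.byteOffset, buffer.byteLength);
  const decoder = new TextDecoder();
  let offset = 0;
  while (buffer.length - offset >= 4) {
    const length = view.getUint32(offset, true);
    if (buffer.length - offset - 4 < length) break; // frame not complete yet
    const body = buffer.subarray(offset + 4, offset + 4 + length);
    messages.push(JSON.parse(decoder.decode(body)));
    offset += 4 + length;
  }
  return { messages, rest: buffer.subarray(offset) };
}
```

Keeping the leftover bytes in `rest` matters because stdin chunks do not necessarily align with message boundaries.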
| Library | License | Purpose | Why This One |
|---|---|---|---|
| rrweb | MIT | DOM snapshot + incremental recording + replay | 17k stars, mature, smart mutation batching (only records final value per batch, discards transient nodes) |
| rrweb-player | MIT | Session replay UI component | Built into rrweb ecosystem, has play/pause/seek |
| mcp-chrome | MIT | Reference implementation (not an npm dependency). We study and reference their patterns for: Debugger API network capture, screenshot pipeline, Native Messaging bridge architecture, Streamable HTTP MCP server | 11k stars, battle-tested patterns for the hard infrastructure problems |
| @modelcontextprotocol/sdk | MIT | MCP server implementation | Official SDK |
| pixelmatch | ISC | Pixel-level image comparison for screenshot gate | 150 lines, zero deps, stable algorithm, runs on raw ImageData in browser |
| elkjs | EPL-2.0 | Graph layout for API dependency DAG | 2k stars, computes node positions from edge list. For extension UI only, not v1 priority |
Removed from consideration:
- blockhash-core: removed. rrweb's event stream + PerformanceObserver already filter 90%+ of noise at Layer 1. Adding a perceptual hash layer is over-engineering. Last updated ~2019.
- mutation-summary: removed. rrweb already does smart mutation batching (only records final value per batch, discards transient nodes). Running a parallel MutationObserver is redundant. Last updated ~2017.
| API | Purpose |
|---|---|
| chrome.debugger | Network capture with full request/response bodies (CDP: Network.requestWillBeSent, Network.responseReceived, Network.getResponseBody) |
| chrome.tabs.captureVisibleTab | Screenshots. Rate limited to 2 calls/sec (hard Chrome limit since v92) |
| chrome.offscreen | Offscreen document for canvas-based image processing (service workers can't use DOM/Canvas) |
| chrome.contextMenus | Right-click menu items (future use) |
| chrome.runtime.connectNative | Native Messaging to bridge |
| PerformanceObserver("layout-shift") | Browser-native visual change signal |
| requestIdleCallback | Detect when page is idle/stable |
| IndexedDB | Store session data, rrweb events, network captures |
- Network capture with full details — request/response headers, bodies, timing, via chrome.debugger API
- DOM recording — rrweb full snapshot + incremental diffs
- Timestamped visual-API correlation — bundle network events with DOM changes and screenshots by timestamp proximity
- Smart screenshot capture — 2-layer gate system (rrweb events + stability wait -> pixelmatch diff)
- Replay UI — rrweb-player in extension tab page with synced API call timeline panel
- API dependency graph — auto-detect call chains via value propagation tracking
- MCP server — ~12 focused tools for Claude Code / any MCP client
- Session export/share: .underpixel files (gzipped JSON: rrweb events + network + screenshots)
- Auto-generate API documentation: from captured sessions, generate endpoint docs with auth flow, params, response shape
- Performance annotations — slow API calls highlighted, waterfall visualization, time-to-interactive markers
- AI action recording — silently records when Claude Code drives the browser, enabling replay + audit
- User controls — popup toggle on/off, filter settings
- Browser control — navigate, click, fill, scroll (from mcp-chrome, minimal set)
- "Explain This Page" right-click — excluded because MCP is pull-based (Claude Code calls tools, can't receive push). Could revisit when Claude Code adds push/notification support. Workaround exists (queue + poll) but too janky for v1.
- Bookmarks, history search, file upload/download — mcp-chrome has these but they're outside UnderPixel's focus
- GIF recording — mcp-chrome feature, not relevant
- Performance tracing — mcp-chrome feature, outside scope (performance annotations are simpler and sufficient)
- Safari support — completely different extension model (Xcode/Swift), not worth it
Important distinction: Network capture and screenshot capture are independent concerns with different strategies.
All network calls are always recorded via chrome.debugger (CDP). This is cheap (just metadata + bodies in IndexedDB) and is the foundation for correlation, dependency graphing, and API documentation.
Default filter: XHR/fetch only (excludes images, CSS, JS, fonts, media). User can configure:
- Include/exclude static resources
- Include/exclude specific domains
- Exclude analytics/tracking domains (configurable blocklist, sensible defaults like Google Analytics, Mixpanel, etc.)
Network capture is not gated or throttled — every matching request is recorded with full details.
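To make the default filter concrete, here is a minimal sketch of the decision for a single request. The filter shape, its field names, and the two default blocklist entries are assumptions for illustration; the resource type strings ("XHR", "Fetch", "Image") follow the CDP Network.ResourceType values.

```typescript
// Hypothetical filter shape; field names are illustrative, not UnderPixel's actual config.
interface CaptureFilterSketch {
  includeStatic: boolean;   // false = XHR/fetch only
  includeDomains: string[]; // empty = every domain is allowed
  excludeDomains: string[]; // analytics/tracking blocklist
}

const DEFAULT_FILTER: CaptureFilterSketch = {
  includeStatic: false,
  includeDomains: [],
  excludeDomains: ["google-analytics.com", "mixpanel.com"],
};

// CDP resource types that count as API traffic under the default filter.
const API_RESOURCE_TYPES = new Set(["XHR", "Fetch"]);

/** True when host is the domain itself or one of its subdomains. */
function hostMatches(host: string, domain: string): boolean {
  return host === domain || host.endsWith("." + domain);
}

/** Decide whether one request should be recorded. */
function shouldRecord(
  url: string,
  resourceType: string,
  filter: CaptureFilterSketch = DEFAULT_FILTER,
): boolean {
  if (!filter.includeStatic && !API_RESOURCE_TYPES.has(resourceType)) return false;
  let host: string;
  try {
    host = new URL(url).hostname;
  } catch {
    return false; // unparseable URL: nothing useful to correlate
  }
  if (filter.excludeDomains.some((d) => hostMatches(host, d))) return false;
  if (filter.includeDomains.length > 0) {
    return filter.includeDomains.some((d) => hostMatches(host, d));
  }
  return true;
}
```

The filter only decides which requests match; every request that passes is recorded in full, consistent with capture being ungated.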
Screenshots are expensive (captureVisibleTab is rate-limited to 2 calls/sec by Chrome) and large (100KB-1MB each). The 2-layer gate decides when a screenshot is worth taking.
Originally designed as a 4-layer system (DOM triage -> stability wait -> perceptual hash -> pixel diff). Simplified after realizing:
- rrweb already does smart mutation batching (only records final values, discards transient nodes) — no need for a separate MutationObserver + mutation-summary library
- rrweb's event stream naturally serves as the "something changed" signal — no need for a separate DOM triage layer
- blockhash-core (perceptual hashing) adds a layer between "something changed" and "did pixels change" that isn't worth the complexity — if Layer 1 says something changed and the page is stable, just run pixelmatch directly
Change signals (any of these sets a dirty flag):
- rrweb emits incremental snapshot events (DOM changed)
- PerformanceObserver("layout-shift") fires (elements moved)
- URL/hash changed (navigation: always capture, skip Layer 2)
- API response received (XHR/fetch, filtered — only if rrweb also reports DOM mutations within the debounce window)
Stability gate (wait for all of these before proceeding):
- Layout-shift events have stopped
- transitionend/animationend fired (CSS animations settled)
- requestIdleCallback triggered (browser is idle)
Debounce: dirty flag checked every 500ms. Multiple triggers within that window = one check.
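The dirty flag and debounce described above can be sketched as a small state machine. This is an illustrative sketch, not the project's ScreenshotGate: the class name, method names, and the injected clock are assumptions, and the stability verdict is passed in by the caller rather than computed here.

```typescript
// Illustrative sketch; names and structure are assumptions, not the actual gate implementation.
class DirtyFlagGateSketch {
  private dirty = false;
  private lastCaptureAt = -Infinity;
  private captured = 0;
  private readonly minIntervalMs: number;
  private readonly maxScreenshots: number;

  constructor(minIntervalMs = 500, maxScreenshots = 100) {
    this.minIntervalMs = minIntervalMs;
    this.maxScreenshots = maxScreenshots;
  }

  /** Any change signal calls this; repeated calls collapse into one flag. */
  markDirty(): void {
    this.dirty = true;
  }

  /**
   * Called on each debounce tick. Returns true when a capture should be attempted.
   * `stable` is the caller's verdict from the stability gate.
   */
  tick(nowMs: number, stable: boolean): boolean {
    if (!this.dirty || !stable) return false;
    if (this.captured >= this.maxScreenshots) return false; // session cap reached
    if (nowMs - this.lastCaptureAt < this.minIntervalMs) return false; // too soon, stay dirty
    this.dirty = false;
    this.lastCaptureAt = nowMs;
    this.captured += 1;
    return true;
  }
}
```

Passing the time in as an argument keeps the gate testable without timers; a caller would typically invoke `tick(Date.now(), isStable)` from a 500ms interval.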
captureVisibleTab
-> pixelmatch against previous screenshot
-> changedPixels / totalPixels > threshold (configurable, default ~1%)
-> If significant, SAVE the screenshot + create correlated bundle
-> If not significant, skip (DOM changed but pixels didn't)
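The threshold check in this flow is simple arithmetic on the diff result. The sketch below uses a naive per-channel comparison as a stand-in for pixelmatch (which uses a perceptual color metric, so real ratios will differ); the function names and the tolerance value are assumptions.

```typescript
// Stand-in for pixelmatch: counts pixels where any RGBA channel differs by more
// than a tolerance. Names and the tolerance default are illustrative assumptions.

/** Fraction of pixels that differ between two same-sized RGBA buffers. */
function diffRatio(
  prev: Uint8ClampedArray,
  curr: Uint8ClampedArray,
  channelTolerance = 16,
): number {
  if (prev.length !== curr.length || prev.length % 4 !== 0) {
    throw new Error("buffers must be same-sized RGBA data");
  }
  const totalPixels = prev.length / 4;
  if (totalPixels === 0) return 0;
  let changedPixels = 0;
  for (let i = 0; i < prev.length; i += 4) {
    const differs =
      Math.abs(prev[i] - curr[i]) > channelTolerance ||
      Math.abs(prev[i + 1] - curr[i + 1]) > channelTolerance ||
      Math.abs(prev[i + 2] - curr[i + 2]) > channelTolerance ||
      Math.abs(prev[i + 3] - curr[i + 3]) > channelTolerance;
    if (differs) changedPixels++;
  }
  return changedPixels / totalPixels;
}

/** Save only when the changed ratio strictly exceeds the threshold (default 1%). */
function isSignificantChange(ratio: number, threshold = 0.01): boolean {
  return ratio > threshold;
}
```

With a strict comparison, a threshold of 0 still skips frames where no pixel changed at all.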
| Setting | Default | Description |
|---|---|---|
| maxScreenshotsPerSession | 100 | Hard cap per capture session (per capture start/stop cycle). Prevents runaway storage on long-lived pages. When reached, only on-demand screenshots via MCP tool are allowed. |
| screenshotInterval | 500ms | Minimum time between screenshots (debounce). Cannot exceed Chrome's 2/sec hard limit regardless. |
| pixelDiffThreshold | 0.01 (1%) | Pixel diff ratio threshold for pixelmatch comparison. Screenshots are only saved when the changed pixel ratio exceeds this value. Set to 0 to save every screenshot that passes Layer 1. |
| screenshotsEnabled | true | Master toggle. User can disable auto-screenshots entirely and rely only on on-demand capture via MCP tool or rrweb DOM replay. |
Note on defaults: These are starting guesses — tune based on real-world testing across different site types (dashboards, SPAs, form-heavy apps, content pages). The important thing is that they're configurable.
Note: On-demand screenshots via the underpixel_screenshot() MCP tool always work regardless of these limits — these settings only control the automatic smart capture.
Simple value propagation tracking. No external library needed for the algorithm itself.
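// Recursively visit every primitive in a parsed JSON value.
// Array items are reported under the key of the array that holds them.
function walkJson(node, visit, key = '') {
  if (Array.isArray(node)) {
    node.forEach((item) => walkJson(item, visit, key));
  } else if (node !== null && typeof node === 'object') {
    for (const [k, v] of Object.entries(node)) walkJson(v, visit, k);
  } else {
    visit(key, node);
  }
}
function extractTrackableValues(responseBody) {
  const values = new Set();
  let parsed = responseBody;
  if (typeof responseBody === 'string') {
    // Captured bodies may still be raw text; ignore anything that is not JSON
    try { parsed = JSON.parse(responseBody); } catch { return values; }
  }
  walkJson(parsed, (key, value) => {
    if (typeof value === 'string') {
      if (value.length > 20) values.add(value); // Tokens, long strings
      if (/^eyJ/.test(value)) values.add(value); // JWT patterns
      if (/^[0-9a-f-]{36}$/i.test(value)) values.add(value); // UUIDs
    }
    if (typeof value === 'number' && /id$/i.test(key)) {
      values.add(String(value)); // Numeric IDs
    }
  });
  return values;
}
// Rough label for a propagated value: "bearer_token", "id", or "session"
function guessType(value) {
  if (/^eyJ/.test(value)) return 'bearer_token';
  if (/^\d+$/.test(value) || /^[0-9a-f-]{36}$/i.test(value)) return 'id';
  return 'session';
}
function findDependencies(completedRequests) {
  const edges = [];
  for (let i = 0; i < completedRequests.length; i++) {
    const source = completedRequests[i];
    const trackableValues = extractTrackableValues(source.responseBody);
    // Only later requests can depend on this response
    for (let j = i + 1; j < completedRequests.length; j++) {
      const target = completedRequests[j];
      // Field names follow the snapshot bundle (requestHeaders, requestBody)
      const searchSpace = [
        target.url,
        target.requestHeaders?.authorization,
        JSON.stringify(target.requestBody),
      ].join(' ');
      for (const value of trackableValues) {
        if (searchSpace.includes(value)) {
          edges.push({
            from: source.url,
            to: target.url,
            via: value.length > 20 ? value.substring(0, 20) + '...' : value,
            type: guessType(value),
          });
          break; // one edge per source/target pair
        }
      }
    }
  }
  return edges;
}
- 50 API calls -> 1,225 pair comparisons -> < 10ms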
- 200 API calls -> 19,900 comparisons -> < 100ms
- Scales fine for real-world sessions
For the extension UI, use elkjs to compute layout positions from the edge list, render with SVG or Canvas. This is a v2/v3 UI feature — for v1, returning the edge list as JSON to Claude Code is sufficient.
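As a rough illustration of the hand-off to elkjs, the sketch below turns a dependency edge list into a graph object in the ELK JSON shape (a root node with `children` and `edges`, each edge listing `sources` and `targets`). The type names, node sizes, and ID scheme are assumptions; the actual layout call is left out.

```typescript
// Sketch only: type names, node sizes, and id scheme are illustrative assumptions.
interface DependencyEdgeSketch {
  from: string; // source request URL
  to: string;   // dependent request URL
  via: string;
  type: string;
}

interface ElkGraphSketch {
  id: string;
  children: { id: string; width: number; height: number; labels: { text: string }[] }[];
  edges: { id: string; sources: string[]; targets: string[] }[];
}

/** Build an ELK-style graph: one node per distinct URL, one edge per distinct pair. */
function toElkGraph(edges: DependencyEdgeSketch[]): ElkGraphSketch {
  const nodeIds = new Map<string, string>(); // URL -> node id, in first-seen order
  const idFor = (url: string): string => {
    let id = nodeIds.get(url);
    if (id === undefined) {
      id = `n${nodeIds.size}`;
      nodeIds.set(url, id);
    }
    return id;
  };

  const seenPairs = new Set<string>();
  const elkEdges: ElkGraphSketch["edges"] = [];
  for (const edge of edges) {
    const source = idFor(edge.from);
    const target = idFor(edge.to);
    const pairKey = `${source}->${target}`;
    if (seenPairs.has(pairKey)) continue; // collapse repeated calls between the same endpoints
    seenPairs.add(pairKey);
    elkEdges.push({ id: `e${elkEdges.length}`, sources: [source], targets: [target] });
  }

  const children: ElkGraphSketch["children"] = [];
  nodeIds.forEach((id, url) => {
    children.push({ id, width: 180, height: 40, labels: [{ text: url }] });
  });
  return { id: "root", children, edges: elkEdges };
}
```

Collapsing duplicate pairs keeps the rendered DAG readable when the same endpoint is polled repeatedly during a session.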
~12 focused tools, organized by purpose:
| Tool | Description |
|---|---|
| underpixel_correlate(query) | "What API feeds the user table?": forward path (text search on URLs + response bodies), reverse path (DOM element → correlated APIs via rrweb snapshots), and value-level correlation (DOM text values → specific JSON response fields). Supports CSS selectors, attribute queries ([src="..."]), and free text. |
| underpixel_timeline(startTime?, endTime?, limit?) | Returns chronological correlation bundles with API + visual state |
| underpixel_snapshot_at(timestamp) | Closest screenshot + active API calls at a specific moment |
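Value-level correlation, as described for underpixel_correlate, amounts to finding where a piece of on-screen text appears in a response body. The sketch below is one possible approach, assuming exact matching after trimming whitespace; the function name and the `$`-rooted path notation are illustrative choices, not the tool's actual output format.

```typescript
// Illustrative sketch: exact-match lookup of a DOM text value inside a JSON body.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

/** Return the paths of all string/number fields whose value equals the DOM text. */
function findJsonPaths(body: Json, domText: string): string[] {
  const needle = domText.trim();
  const hits: string[] = [];
  if (needle === "") return hits;

  const visit = (node: Json, path: string): void => {
    if (node === null || typeof node === "boolean") return;
    if (typeof node === "string" || typeof node === "number") {
      if (String(node).trim() === needle) hits.push(path);
      return;
    }
    if (Array.isArray(node)) {
      node.forEach((item, i) => visit(item, `${path}[${i}]`));
      return;
    }
    for (const key of Object.keys(node)) visit(node[key], `${path}.${key}`);
  };

  visit(body, "$");
  return hits;
}
```

Exact matching keeps false positives low but misses formatted values (dates, currency, truncated text), so a real implementation would likely need normalization on top.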
| Tool | Description |
|---|---|
| underpixel_capture_start(filter?) | Start recording network + DOM + visual state |
| underpixel_capture_stop() | Stop capture, return correlated summary |
| underpixel_api_calls(filter?) | Query captured API calls with full details (headers, bodies, timing) |
| underpixel_api_dependencies() | Auto-detected API call chain / dependency graph |
| Tool | Description |
|---|---|
| underpixel_screenshot(selector?) | On-demand screenshot (viewport, full page, or element) |
| underpixel_dom_text(selector) | Current text content of elements |
| underpixel_replay(timeRange) | Opens replay tab in browser, returns session data |
| Tool | Description |
|---|---|
| underpixel_navigate(url) | Go to page (new tab or update existing) |
| underpixel_interact(action) | Click, fill, scroll, type, press key |
| underpixel_page_read(filter?) | Accessibility tree of visible elements (filter: 'all' or 'interactive') |
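On the wire, an MCP client invokes any of these tools with a JSON-RPC `tools/call` request. The sketch below builds such a request; the helper name and the prefix check are assumptions, and the argument object passed in the usage note simply mirrors the parameter name from the tool tables.

```typescript
// Sketch of the JSON-RPC envelope an MCP client sends for a tool call.
// The helper and its prefix check are illustrative, not part of UnderPixel.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextRequestId = 1;

/** Build a tools/call request for one UnderPixel tool. */
function buildToolCall(name: string, args: Record<string, unknown> = {}): ToolCallRequest {
  if (!name.startsWith("underpixel_")) {
    throw new Error(`unexpected tool name: ${name}`);
  }
  return {
    jsonrpc: "2.0",
    id: nextRequestId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}
```

For example, `buildToolCall("underpixel_dom_text", { selector: "#okr-table" })` produces the request a client would send to read the OKR table's text.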
The Chrome extension opens a full tab (chrome-extension://EXTENSION_ID/replay.html) for the replay interface. Built with Svelte 5 (legacy/Svelte 4 syntax) and a "Cozy Pixel RPG" theme.
+------------------------------+-------------------------+
| | ▼ Page Load |
| rrweb-player | GET /api/config 0.1s |
| (interactive replay) | GET /api/user 0.3s |
| | |
| [synced playback with | ▼ User Clicked "OKRs" |
| event-based timeline] | GET /api/okrs 1.2s |
| | 200 - 3 items |
| | |
+------------------------------+ ▼ Form Submit |
| << > >> 1x ===*====== | POST /api/log 0.1s |
+------------------------------+-------------------------+
Features:
- Left pane: rrweb-player with play/pause/seek controls
- Right pane: event-based API timeline, with calls grouped by UI events (EventSection) rather than a flat list
- Svelte store (replay-store.ts) syncs player currentTime with timeline highlighting
- Search/filter across API calls
- Detail panel for inspecting request/response headers and bodies
- Export button (planned — .underpixel file)
- API dependency graph view (planned, using elkjs)
Follows the same proven pattern as mcp-chrome.
# npm
npm install -g underpixel-bridge
# pnpm
pnpm config set enable-pre-post-scripts true
pnpm install -g underpixel-bridge
# If automatic registration fails (pnpm):
underpixel-bridge register
The bridge auto-registers itself as a Chrome Native Messaging host via a postinstall script.
- Download latest extension from GitHub Releases
- Open Chrome, go to chrome://extensions/
- Enable "Developer mode"
- Click "Load unpacked" and select the downloaded extension folder
- Click the extension icon, then click "Connect" to see MCP configuration
(Once stable, publish to Chrome Web Store for one-click install.)
Streamable HTTP (recommended):
{
"mcpServers": {
"underpixel": {
"type": "streamableHttp",
"url": "http://127.0.0.1:PORT/mcp"
}
}
}
stdio (alternative):
{
"mcpServers": {
"underpixel": {
"command": "npx",
"args": ["-y", "underpixel-bridge"]
}
}
}
Works with Claude Code, Claude Desktop, Cursor, VS Code Copilot, Windsurf, or any MCP client.
The MCP protocol is client-agnostic. The bridge speaks stdio JSON-RPC. Works with:
- Claude Code
- Claude Desktop
- Cursor
- VS Code Copilot
- Windsurf
- Any future MCP client
No extra work needed — this is free from the architecture choice.
| Browser | Effort | Notes |
|---|---|---|
| Chrome | Now | Primary target |
| Edge | Near-free | Same Chromium APIs, same Web Store |
| Arc, Brave, Opera | Near-free | Chromium-based |
| Firefox | Medium (v2) | WebExtensions ~90% compatible. Main gap: chrome.debugger doesn't exist, use browser.devtools.network instead. Native Messaging slightly different manifest. |
| Safari | Hard | Not planned. Different extension model entirely (Xcode/Swift). |
Key for cross-browser: abstract browser-specific APIs behind interfaces from day one:
interface NetworkCapture {
start(filter: CaptureFilter): void;
stop(): CapturedData;
}
// Chrome implementation uses chrome.debugger
// Firefox implementation uses browser.devtools.network
| Concern | Solution |
|---|---|
| Memory bloat from long sessions | Stream rrweb events to IndexedDB, not memory |
| Large response bodies | Store in IndexedDB, return summaries to MCP, full body on-demand |
| Query performance | Index by timestamp + URL pattern in IndexedDB |
| Export file size | Compress .underpixel files with gzip (rrweb events compress ~10:1) |
| Request cap | Configurable (unlike mcp-chrome's hard 100 limit) |
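The "return summaries to MCP, full body on-demand" row can be illustrated with a small summarizer. This is a sketch under assumptions: the summary fields, the function name, and the 200-character preview length are all hypothetical.

```typescript
// Hypothetical summary shape; not the actual MCP response format.
interface BodySummary {
  bytes: number;            // UTF-8 size of the full body
  truncated: boolean;       // true when the preview is shorter than the body
  preview: string;          // leading characters of the body
  topLevelKeys?: string[];  // present only for JSON objects
}

/** Produce a compact description of a response body for MCP results. */
function summarizeBody(body: string, previewChars = 200): BodySummary {
  const summary: BodySummary = {
    bytes: new TextEncoder().encode(body).length,
    truncated: body.length > previewChars,
    preview: body.slice(0, previewChars),
  };
  try {
    const parsed: unknown = JSON.parse(body);
    if (parsed !== null && typeof parsed === "object" && !Array.isArray(parsed)) {
      summary.topLevelKeys = Object.keys(parsed);
    }
  } catch {
    // Not JSON: the preview alone has to do.
  }
  return summary;
}
```

Returning only the keys and a preview keeps tool results small, which matters because every byte returned to the MCP client consumes model context.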
Goal: Network capture + correlation + MCP tools working end-to-end.
- ✅ Project scaffold — Chrome extension (Manifest V3, WXT) + bridge npm package (Fastify + Streamable HTTP)
- ✅ Network capture: chrome.debugger API for full request/response/headers/body capture, IndexedDB storage with body-ref separation
- ✅ rrweb integration: rrweb.record() in MAIN world content script, events batched and stored in IndexedDB
- ✅ Correlation engine: timestamp-based matching with configurable window (default 500ms), produces CorrelationBundle records
- ✅ Basic screenshot: captureVisibleTab on-demand (JPEG, 50% quality, IndexedDB storage)
- ✅ Native Messaging bridge: stdio translator with auto-registration, supports both Streamable HTTP and stdio MCP transport
- ✅ MCP tools: all 8 core tools implemented (capture_start, capture_stop, api_calls, screenshot, navigate, interact, page_read, correlate)
- ✅ Basic popup: toggle capture on/off with live stats (API calls, screenshots, correlations)
Bonus (implemented ahead of schedule): timeline, snapshot_at, dom_text, replay, api_dependencies tools also complete. Correlate tool includes attribute-value search (src, href, alt, etc.) and value-level correlation (traces DOM text to specific API response JSON fields).
Deliverable: User can tell Claude Code "go to X page, capture network, tell me what API feeds the user list" and get a correlated answer.
Goal: Visual change detection + replay interface.
- ✅ 2-layer screenshot gate: ScreenshotGate (dirty flag + debounce + limits) feeds ScreenshotPipeline (capture + pixelmatch diff via offscreen document). Navigation bypasses diff. Configurable interval, max count, and pixel diff threshold (default 0.01 = 1%).
- ✅ Offscreen document: canvas-based pixelmatch via message protocol ({ type: 'pixel-diff', previous, current } → { diffRatio })
- ✅ Replay page: replay.html with rrweb-player left pane + API timeline right pane, built with Svelte 5 (legacy/Svelte 4 syntax). Cozy Pixel RPG theme.
- ✅ Event-based timeline redesign: API calls grouped by UI events (EventSection), not flat list. Svelte store (replay-store.ts) syncs player time with timeline.
- ✅ MCP tools: timeline, snapshot_at, replay all implemented
- ✅ DOM text tool: underpixel_dom_text(selector) uses TreeWalker for safe text extraction (avoids serialization risk)
Deliverable: User can replay browser sessions with synchronized event-based API timeline. Smart screenshots captured automatically on significant visual changes.
Goal: API chain detection + session sharing.
- ✅ Value propagation algorithm: extracts JWTs, UUIDs, hex tokens, high-entropy strings, numeric IDs from responses; searches in subsequent request URLs, auth headers, and bodies. Implemented in tools/core.ts + json-utils.ts.
- ✅ MCP tool: api_dependencies() returns typed edge list with DependencyEdge (from, to, via, valueType)
- ✅ Session export: exportSession() in src/replay/lib/export.ts reads all IDB stores, re-inlines response bodies, applies ExportOptions (mask headers, strip bodies/screenshots), compresses via CompressionStream('gzip'), and triggers a browser download as a .underpixel file.
- ✅ Session import: importSession() in src/replay/lib/import.ts decompresses, validates (validateBundle), re-keys all session IDs to avoid collisions (rekeyBundle), splits large bodies back into the responseBodies store, and writes all stores in a single IDB transaction.
- ✅ Export/Import UI in replay page: ExportModal with options (mask headers, strip bodies/screenshots), import button with file picker, toast notifications, imported session indicators in SessionPicker.
Deliverable: Claude Code can query API auth flows. Users can export and share sessions.
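The "mask headers" export option matters because shared sessions would otherwise carry live credentials. A minimal sketch of such masking follows; the function name, the default header list, and the `***` placeholder are assumptions rather than the behavior of the actual ExportOptions.

```typescript
// Illustrative sketch: the header list and placeholder are assumptions.
const DEFAULT_SENSITIVE_HEADERS = ["authorization", "cookie", "set-cookie", "x-api-key"];

/** Return a copy of the headers with credential-bearing values replaced. */
function maskHeaders(
  headers: Record<string, string>,
  extraSensitive: string[] = [],
): Record<string, string> {
  const sensitive = new Set(DEFAULT_SENSITIVE_HEADERS);
  extraSensitive.forEach((name) => sensitive.add(name.toLowerCase()));

  const masked: Record<string, string> = {};
  for (const name of Object.keys(headers)) {
    // Header names are case-insensitive, so compare in lowercase.
    masked[name] = sensitive.has(name.toLowerCase()) ? "***" : headers[name];
  }
  return masked;
}
```

Returning a copy rather than mutating in place means the stored session stays intact and only the exported bundle is redacted.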
Goal: Diff, auto-docs, performance, polish.
- Auto-generate API documentation — from captured sessions, generate endpoint docs with auth flow, params, response shape (Claude Code refines into OpenAPI spec)
- Performance annotations — overlay on replay: slow API calls highlighted red, waterfall visualization, parallel vs sequential request markers
- Dependency graph UI — visual DAG in extension page using elkjs
- Filter improvements — filter by domain, status code, resource type, URL pattern
- Polish — error handling, edge cases, loading states
- Edge support — test and publish to Edge Add-ons store
- Firefox port: replace chrome.debugger with browser.devtools.network, adjust Native Messaging manifest
- Browser API abstraction layer: if not done already
- Community features — based on user feedback
- Name: underpixel
- Description: Chrome extension + MCP server: record, replay, and understand what's behind the pixels. Timestamped visual-API correlation for Claude Code and any MCP client.
- Topics: chrome-extension, claude-code, mcp, mcp-server, devtools, network-debugging, api-monitoring, rrweb, browser-automation, developer-tools
underpixel/
├── extension/ # Chrome extension (WXT project)
│ ├── wxt.config.ts # WXT + Vite config, manifest generation
│ ├── entrypoints/
│ │ ├── background.ts # Service worker (orchestrator)
│ │ ├── content.ts # ISOLATED world content script (bridge)
│ │ ├── content-recorder.ts # MAIN world content script (rrweb)
│ │ ├── popup/ # Extension popup (toggle, settings, MCP config)
│ │ ├── replay/ # Replay page (Svelte 5, rrweb-player + API timeline)
│ │ └── offscreen/ # Canvas-based image processing (pixelmatch)
│ ├── lib/
│ │ ├── network/
│ │ │ ├── capture.ts # CDP network capture (chrome.debugger)
│ │ │ └── cdp-session.ts # Ref-counted debugger attach/detach
│ │ ├── correlation/
│ │ │ ├── engine.ts # Timestamp-based correlation bundles
│ │ │ └── dom-walker.ts # rrweb snapshot DOM searching
│ │ ├── screenshot/
│ │ │ ├── gate.ts # 2-layer screenshot decision logic
│ │ │ └── pipeline.ts # Capture + pixelmatch diff pipeline
│ │ ├── recording/
│ │ │ └── event-batcher.ts # Batched rrweb event persistence (200ms)
│ │ ├── storage/
│ │ │ └── db.ts # IndexedDB schema + helpers (via idb)
│ │ └── tools/
│ │ ├── registry.ts # Tool name -> handler mapping
│ │ ├── core.ts # correlate, timeline, snapshot_at, replay, api_dependencies
│ │ ├── network.ts # capture_start/stop, api_calls
│ │ └── browser.ts # navigate, interact, page_read, screenshot, dom_text
│ └── src/replay/ # Svelte components for replay UI
│ ├── stores/replay-store.ts # Svelte store (syncs player + timeline)
│ └── lib/ # Helpers (event-sections, export, format, search, etc.)
├── bridge/ # NPM package: underpixel-bridge
│ ├── src/
│ │ ├── cli.ts # Entry point (Native Messaging + auto-start)
│ │ ├── native-host.ts # Length-prefixed JSON stdio protocol
│ │ └── server.ts # Fastify HTTP server (MCP routes, per-session transports)
│ └── scripts/
│ ├── register.ts # Write NativeMessagingHosts manifest
│ ├── postinstall.ts # npm postinstall auto-registration
│ ├── run_host.sh # Unix wrapper (Node.js discovery)
│ └── run_host.bat # Windows wrapper
├── packages/
│ └── shared/ # Shared types between extension + bridge
│ └── src/
│ ├── types.ts # All data types, enums, interfaces
│ ├── tool-schemas.ts # MCP tool definitions (JSON Schema)
│ └── constants.ts # Host name, default port, config defaults
├── docs/ # Design docs + per-feature specs/plans
├── CLAUDE.md
└── LICENSE # MIT
# UnderPixel
> Record, replay, and understand what's behind the pixels
[badges: Chrome Web Store, npm, license, stars]
[One-paragraph description]
[GIF/screenshot of replay UI with API timeline]
## What it does
[3 bullet points with visuals]
## Quick Start
[2-step install: extension + MCP config]
## Features
[Feature list with screenshots]
## How it works
[Architecture diagram]
## MCP Tools Reference
[Tool table]
## Acknowledgments
[Credits to mcp-chrome and rrweb]
MIT — matches both mcp-chrome and rrweb.
## Acknowledgments
UnderPixel builds on the excellent work of:
- [mcp-chrome](https://github.com/hangwin/mcp-chrome) by hangwin —
browser MCP infrastructure, network capture, screenshot pipeline
- [rrweb](https://github.com/rrweb-io/rrweb) —
DOM recording and replay
Both are MIT licensed. UnderPixel adds timestamped visual-API
correlation on top of their foundations.
Why: mcp-chrome already solved Native Messaging bridge, network capture (dual WebRequest + Debugger backends), screenshot pipeline, full-page stitching, browser automation. Rebuilding that is months. rrweb solved efficient DOM recording with smart mutation batching. Both are MIT licensed. UnderPixel's novel contribution is the correlation layer.
Why: Extension auto-updates via Web Store. NPM package rarely needs updating. No state syncing issues. Single source of truth.
Why: Manifest V3 service workers cannot bind to network ports (no HTTP server, no WebSocket server). MCP requires either accepting incoming connections (Streamable HTTP) or being spawned as a subprocess (stdio). A separate bridge process is required. Every existing tool (Claude in Chrome, mcp-chrome, BrowserMCP) uses this pattern.
Why: captureVisibleTab is rate-limited to 2/sec. Interval-based capture wastes the budget on unchanged states. Event-driven capture (API response -> DOM mutation -> stability -> pixel diff check) records only meaningful changes.
Why: Originally designed as 4 layers (DOM triage -> stability wait -> perceptual hash -> pixel diff). Simplified because rrweb already handles smart mutation batching — it only records final values per batch and discards transient nodes, making a separate MutationObserver + mutation-summary library redundant. blockhash-core (perceptual hashing, last updated ~2019) added complexity between "something changed" and "did pixels change" that wasn't justified. Final design: Layer 1 uses rrweb's event stream + PerformanceObserver as change signal + stability gate (free, already running), Layer 2 uses pixelmatch for pixel diff confirmation (~10ms). Simple, fewer dependencies, rrweb does the heavy lifting.
Why: Focused > comprehensive. Users don't need bookmarks, history search, GIF recording, performance tracing from a correlation tool. Fewer tools = less token overhead in MCP tool definitions = more context for actual work.
Why: Inspired by Undertale (pixel art aesthetic for branding). Evocative ("what's under the pixels") rather than descriptive. Short, memorable, works as package name (underpixel), repo name, extension name. Brand keywords (chrome, claude-code, mcp) go in repo description and GitHub topics, not the name — names age poorly with brand ties.
Why: MCP is pull-based — Claude Code calls tools, extension can't push to Claude Code. Workarounds exist (queue + poll, clipboard, file drop) but all feel janky. Revisit when MCP or Claude Code adds push/notification support.
Why: Long sessions with hundreds of API calls + rrweb events will exhaust memory. IndexedDB handles large datasets, persists across service worker restarts (Manifest V3 service workers have 30s idle timeout, 5min activity limit unless Native Messaging is active), and enables query by timestamp/URL pattern.
Why: Simple rule — group events within a configurable window (e.g., 500ms). "API response at T=1200ms + DOM mutations at T=1220ms + screenshot at T=1300ms = one correlated bundle." ~50 lines of logic. No complex data flow analysis needed for v1. The LLM (Claude Code) can do deeper reasoning on top of the correlated data.
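The grouping rule can be sketched in a few lines. This version anchors each window on the first event of a bundle, which is one reasonable reading of "within a configurable window"; the type names and the anchoring choice are assumptions, not the actual CorrelationBundle logic.

```typescript
// Sketch of timestamp-proximity grouping; names and anchoring rule are assumptions.
type TimedEventKind = "api_response" | "dom_mutation" | "screenshot";

interface TimedEvent {
  kind: TimedEventKind;
  timestamp: number; // milliseconds
  label: string;
}

interface BundleSketch {
  startTime: number;
  endTime: number;
  events: TimedEvent[];
}

/** Group events so that each bundle spans at most windowMs from its first event. */
function correlateByWindow(events: TimedEvent[], windowMs = 500): BundleSketch[] {
  const sorted = events.slice().sort((a, b) => a.timestamp - b.timestamp);
  const bundles: BundleSketch[] = [];
  for (const event of sorted) {
    const current = bundles[bundles.length - 1];
    if (current && event.timestamp - current.startTime <= windowMs) {
      current.events.push(event);
      current.endTime = event.timestamp;
    } else {
      // Outside the open window: this event starts a new bundle.
      bundles.push({ startTime: event.timestamp, endTime: event.timestamp, events: [event] });
    }
  }
  return bundles;
}
```

Anchoring on the first event bounds every bundle's duration; a sliding window (measuring from the previous event instead) would chain long bursts of activity into one oversized bundle.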
Why: The chrome.webRequest API can capture request headers/bodies but cannot access response content. Since "what data did this API return" is core to correlation, Debugger mode is required. Tradeoff: Chrome shows a banner saying the extension "started debugging this browser", and the debugger session can conflict with DevTools if open simultaneously. This is acceptable: mcp-chrome has the same limitation, and a project with ~11k stars lives with it.
Why: mcp-chrome is a Chrome extension, not a reusable library. We study and reference their implementation patterns (Debugger API capture, Native Messaging bridge, screenshot stitching, Streamable HTTP MCP server) and write our own code following similar approaches. Their code is MIT licensed. rrweb, on the other hand, IS an npm dependency (npm install rrweb) used directly.
Why: IndexedDB is built into every browser — no library needed, no installation. It's the standard way Chrome extensions store large/structured data. Handles rrweb event streams, network capture data, and screenshots without exhausting memory. Persists across service worker restarts (important for Manifest V3's 30s idle timeout).
Why: npm install -g + load unpacked extension + MCP config is the established flow users of similar tools expect. Initially considered a simpler npx auto-download approach, but mcp-chrome's method is more robust (explicit global install, supports both Streamable HTTP and stdio transport, manual registration fallback for pnpm).