Summary
Enable the Chrome DevTools Protocol (CDP) on the Electron app inside the devcontainer (and optionally expose it off-container in dev/CI) so that AI agents driving the app can use accessibility-tree / DOM-level control (Playwright CLI, raw CDP, etc.) instead of pixel-based control via xdotool + screenshots.
Context
The devcontainer currently gives an agent a working but coarse control surface:
- Drive input via DISPLAY=:99 xdotool …
- Observe state via import -window root /tmp/shot.png, then load the image into model context
This works, but every observation costs a screenshot in the model's context. For a long agent loop (e.g. the experimental visual bug-fix flow on experiment/bug-fix-visual), token spend on images dominates the cost.
CDP gives an agent the same control surface that Chrome DevTools uses internally:
- Accessibility tree with element refs ([button ref=e12] "Submit")
- Click/fill/press by ref, not by pixel coordinate
- Network/console/runtime inspection without screenshots
- DOM/AX snapshots are plain text — cheap, structured, robust against CSS changes
Electron exposes CDP the same way Chrome does: pass --remote-debugging-port=<N> on the command line (or call app.commandLine.appendSwitch('remote-debugging-port', N) programmatically, before the app's ready event), then any CDP client can attach.
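A minimal sketch of the programmatic route, assuming a hypothetical ELECTRON_CDP_PORT environment variable as the opt-in (that name is illustrative and not taken from this repo). The switch value is passed as a string, and the call has to happen before Electron's ready event. The commandLine object is injected so the helper can be exercised without Electron; in the app it would be app.commandLine.

```typescript
// Minimal shape of Electron's app.commandLine that this sketch relies on.
interface CommandLineLike {
  appendSwitch(name: string, value?: string): void;
}

// Hypothetical opt-in variable; not an Electron or repo convention.
const CDP_PORT_ENV = "ELECTRON_CDP_PORT";

/**
 * Appends --remote-debugging-port when the env var holds a valid port.
 * Returns the port that was enabled, or null when CDP stays off.
 */
function enableCdp(
  commandLine: CommandLineLike,
  env: Record<string, string | undefined>,
): number | null {
  const raw = env[CDP_PORT_ENV];
  if (raw === undefined || raw.trim() === "") return null;

  const port = Number(raw);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`${CDP_PORT_ENV} must be a port in 1..65535, got "${raw}"`);
  }

  commandLine.appendSwitch("remote-debugging-port", String(port));
  return port;
}

// In the Electron main process, before the app's ready event:
//   import { app } from "electron";
//   enableCdp(app.commandLine, process.env);
```

Gating on an environment variable keeps CDP off unless the devcontainer entrypoint sets it, which fits the dev-only scope of this issue.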
Proposed work
- Enable CDP on Electron in dev mode.
  In scripts/devcontainer-entrypoint.sh, add --remote-debugging-port=9223 to the pnpm start invocation (or to an Electron-side switch). Pick a port distinct from the noVNC port (currently 6080) and document it.
- Forward the port off the container.
  In .devcontainer/devcontainer.json, add 9223 to forwardPorts. If we want host-port control, also add a runArgs entry such as "-p", "${localEnv:CDP_HOST_PORT}:9223", mirroring the existing NOVNC_HOST_PORT mapping. Default to localhost-only; never bind publicly.
- Document the workflow in the devcontainer-dev skill (.claude/skills/devcontainer-dev/SKILL.md):
  - How to confirm CDP is up: curl http://localhost:9223/json
  - Recommended client: Playwright CLI (npx playwright open --connect-over-cdp http://localhost:9223) or chromium.connectOverCDP(...) for scripts
  - Shared-control caveat: user clicks and agent clicks can race
  - Tier ladder: try AX/DOM first, fall back to xdotool/screenshots only when semantics aren't enough
- Optional: ship a small helper script (scripts/devcontainer-cdp.sh) that wraps the common one-liners (navigate, snapshot, click <ref>, fill <ref> <value>) so agents have a tight, well-bounded interface, analogous to how xdotool is the agent's input verb today.
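The "confirm CDP is up" check from the documentation step could also be scripted. The sketch below assumes the proposed port 9223 and uses illustrative helper names. Chromium's /json endpoint returns a JSON array of debuggable targets, each with a type and (normally) a webSocketDebuggerUrl; the page target is usually the one an agent wants.

```typescript
// Subset of the fields Chromium returns for each entry in GET /json.
interface CdpTarget {
  id: string;
  type: string; // e.g. "page", "webview", "service_worker"
  title: string;
  url: string;
  webSocketDebuggerUrl?: string; // treated as optional here to be defensive
}

/** Parses the body of GET /json, rejecting anything that is not a target list. */
function parseTargets(body: string): CdpTarget[] {
  const parsed: unknown = JSON.parse(body);
  if (!Array.isArray(parsed)) {
    throw new Error("expected /json to return an array of targets");
  }
  return parsed as CdpTarget[];
}

/** Picks the first attachable page target, or null when none is listed. */
function pickPageTarget(targets: CdpTarget[]): CdpTarget | null {
  const page = targets.find(
    (t) => t.type === "page" && Boolean(t.webSocketDebuggerUrl),
  );
  return page ?? null;
}

// Usage (assumes Node 18+ for the global fetch):
//   const res = await fetch("http://localhost:9223/json");
//   const page = pickPageTarget(parseTargets(await res.text()));
//   console.log(page?.webSocketDebuggerUrl ?? "CDP is up, but no page target yet");
```

The webSocketDebuggerUrl can be handed to a raw CDP client, while Playwright's chromium.connectOverCDP accepts the HTTP endpoint (http://localhost:9223) directly.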
Why this matters now
The experimental visual bug-fix agent (experiment/bug-fix-visual, PR #2120) is the immediate consumer. That experiment is currently stalled on an unrelated issue (claude-code-action workflow validation), but once it can run, screenshot-driven repro will be its dominant cost. Adding CDP turns that into AX-tree-driven repro for most of the loop, with screenshots reserved for genuine pixel bugs.
References
- Electron CDP docs: enabling --remote-debugging-port
- Playwright accessibility-tree snapshot output (text refs like [button ref=e12])
- Existing devcontainer skill: .claude/skills/devcontainer-dev/SKILL.md (already mentions this as "Future: CDP access")
Out of scope
- Changing the production agent (_bug-fix-agent.yml) to use CDP. That's a follow-up once CDP is wired.
- Adding CDP to release builds. Dev/devcontainer only for now.