Date: 2026-03-17
Workflow execution via Playwright launches a SEPARATE Chromium browser with no cookies/sessions. This defeats the purpose for sites requiring auth (LinkedIn Recruiter, etc.) and triggers bot detection.
The Chrome extension itself replays workflow steps in the SAME browser where they were recorded. No Playwright needed. Same cookies, same sessions, same everything.
YOUR Chrome (same browser, same cookies, same session)
┌─────────────────────────────────────────────────────────┐
│ background.ts (ExecutionEngine - orchestrator) │
│ - Manages step queue & state machine │
│ - Survives page navigations (service worker) │
│ - Sends one step at a time to content script │
│ - Re-injects content script after navigation │
│ - Captures screenshots for self-healing │
│ │
│ content-executor.ts (NEW - executes steps on page) │
│ - Finds elements: target_text → CSS → XPath │
│ - Executes: click(), fill(), keypress via real DOM │
│ - Reports success/failure back to background │
│ - Highlights element being acted on │
│ │
│ sidepanel (execution progress UI) │
│ - "Run in Browser" button │
│ - Step-by-step progress with status indicators │
└─────────────────────────────────────────────────────────┘
│ HTTP (only when healing needed)
▼
┌─────────────────────────────────────────────────────────┐
│ Python Backend (:8000) │
│ POST /api/ext-execute/heal │
│ - Receives screenshot + failed step │
│ - LLM diagnoses what changed on the page │
│ - Returns corrected selectors │
└─────────────────────────────────────────────────────────┘
The user may have multiple browsers open (2 Chrome, Firefox, Safari).
- The extension ONLY sees its own Chrome instance
- During recording, every event captures tabId and windowId
- At replay: check if original tab exists → use it. Original window? → find matching tab. Otherwise → ask user.
- windowId distinguishes between multiple Chrome windows
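The replay-time lookup described above could be sketched as a pure function over the list of open tabs. The names resolveTargetTab, TabInfo and TabResolution are hypothetical, and treating "matching tab" as "same URL in the original window" is an assumption, since the plan does not define the match.

```typescript
// Minimal tab shape; mirrors the id/windowId/url fields of chrome.tabs.Tab.
interface TabInfo {
  id: number;
  windowId: number;
  url?: string;
}

type TabResolution =
  | { kind: "tab"; tabId: number } // a usable tab was found
  | { kind: "ask-user" };          // nothing matched, prompt in the sidepanel

// Hypothetical helper: choose the replay target from what was recorded.
function resolveTargetTab(
  recorded: { tabId: number; windowId: number; url?: string },
  openTabs: TabInfo[],
): TabResolution {
  // 1. The original tab still exists: use it.
  const sameTab = openTabs.find((t) => t.id === recorded.tabId);
  if (sameTab) return { kind: "tab", tabId: sameTab.id };

  // 2. The original window still exists: look for a matching tab inside it.
  //    ASSUMPTION: "matching" means the same URL as at recording time.
  const sameUrl = openTabs.find(
    (t) =>
      t.windowId === recorded.windowId &&
      recorded.url !== undefined &&
      t.url === recorded.url,
  );
  if (sameUrl) return { kind: "tab", tabId: sameUrl.id };

  // 3. Otherwise defer to the user.
  return { kind: "ask-user" };
}
```

In the extension, openTabs would presumably come from chrome.tabs.query({}), which only returns tabs of the Chrome instance the extension runs in.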
- User clicks "Run in Browser" in sidepanel
- Background resolves target tab (recorded tabId/windowId or active tab)
- For each step in workflow:
  a. navigation: chrome.tabs.update(tabId, {url}) → wait for load → inject executor
  b. click/input/key_press: send step to content-executor → wait for result
  c. scroll: send to content-executor → window.scrollTo()
- After each step that might cause navigation:
  - Monitor chrome.tabs.onUpdated for URL change
  - Wait for status: "complete" → re-inject executor → EXECUTOR_READY handshake
- On step failure: capture screenshot → send to backend heal endpoint → retry with corrected selectors
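A minimal sketch of how the background might route each step, using the EXECUTE_STEP / STEP_RESULT / EXECUTOR_READY message names from the plan. The WorkflowStep fields and the planDispatch helper are illustrative assumptions; the real step definitions live in workflow-types.ts and may differ.

```typescript
type StepType = "navigation" | "click" | "input" | "key_press" | "scroll";

// ASSUMED step shape, for illustration only.
interface WorkflowStep {
  type: StepType;
  url?: string;
  target_text?: string;
  cssSelector?: string;
  xpath?: string;
  value?: string;
}

// Messages exchanged between background and content-executor.
type ExecutorMessage =
  | { type: "EXECUTE_STEP"; step: WorkflowStep }
  | { type: "STEP_RESULT"; ok: boolean; error?: string }
  | { type: "EXECUTOR_READY" };

type Dispatch =
  | { via: "tabs.update"; url: string }                    // handled by background
  | { via: "content-executor"; message: ExecutorMessage }; // handled on the page

// Hypothetical helper: decide who executes a step.
function planDispatch(step: WorkflowStep): Dispatch {
  if (step.type === "navigation") {
    if (!step.url) throw new Error("navigation step is missing a url");
    return { via: "tabs.update", url: step.url };
  }
  // click, input, key_press and scroll all run inside the page.
  return { via: "content-executor", message: { type: "EXECUTE_STEP", step } };
}
```

Keeping the routing decision separate from the chrome.* calls means it can be unit-tested without a browser.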
IDLE → LOADING → EXECUTING → WAITING_FOR_NAV → HEALING → COMPLETED/FAILED
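The state chain above could be encoded as a transition table. The plan only lists the states in a line, so the specific edges below (for example HEALING returning to EXECUTING) are assumptions about the intended behaviour, not a confirmed design.

```typescript
type ExecState =
  | "IDLE" | "LOADING" | "EXECUTING" | "WAITING_FOR_NAV"
  | "HEALING" | "COMPLETED" | "FAILED";

// ASSUMED edges; COMPLETED and FAILED are terminal.
const TRANSITIONS: Record<ExecState, readonly ExecState[]> = {
  IDLE: ["LOADING"],
  LOADING: ["EXECUTING", "FAILED"],
  EXECUTING: ["WAITING_FOR_NAV", "HEALING", "COMPLETED", "FAILED"],
  WAITING_FOR_NAV: ["EXECUTING", "FAILED"],
  HEALING: ["EXECUTING", "FAILED"],
  COMPLETED: [],
  FAILED: [],
};

// True when moving from one state to another is allowed by the table.
function canTransition(from: ExecState, to: ExecState): boolean {
  return TRANSITIONS[from].indexOf(to) !== -1;
}
```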
Priority order (same as recording, but in reverse):
- target_text (semantic): Scan interactive elements, match by textContent/aria-label/placeholder/label
- cssSelector: document.querySelector(cssSelector)
- xpath: document.evaluate(xpath)
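The fallback order above can be sketched as a list of locators tried in sequence. TextCandidate is a structural subset of Element so the matching logic runs without a DOM; matchesTargetText and findFirst are hypothetical names, and label-element matching is left out for brevity.

```typescript
// Structural subset of Element, enough for text matching.
interface TextCandidate {
  textContent: string | null;
  getAttribute(name: string): string | null;
}

// Collapse whitespace and case so "  Send   Message " equals "send message".
const normalize = (s: string | null | undefined): string =>
  (s ?? "").replace(/\s+/g, " ").trim().toLowerCase();

// True when visible text, aria-label, or placeholder equals the target text.
function matchesTargetText(el: TextCandidate, targetText: string): boolean {
  const wanted = normalize(targetText);
  if (wanted === "") return false;
  return [
    el.textContent,
    el.getAttribute("aria-label"),
    el.getAttribute("placeholder"),
  ].some((v) => normalize(v) === wanted);
}

// Try each locator in priority order; a locator that throws (for example
// querySelector on an invalid selector) is treated as a miss.
function findFirst<T>(locators: Array<() => T | null>): T | null {
  for (const locate of locators) {
    try {
      const hit = locate();
      if (hit !== null) return hit;
    } catch (e) {
      // fall through to the next strategy
    }
  }
  return null;
}
```

In the content script, the three locators would presumably be closures over a text scan of interactive elements, document.querySelector(cssSelector), and document.evaluate(xpath, document, null, XPathResult.FIRST_ORDERED_NODE_TYPE, null).singleNodeValue, passed in that order.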
- click: element.focus() → element.click() (or full MouseEvent sequence for React/Vue)
- input: Native value setter → InputEvent → change event
- key_press: KeyboardEvent('keydown') + ('keyup')
- scroll: window.scrollTo(x, y)
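The "native value setter" step for input can be illustrated as follows. Frameworks such as React may shadow the value property on the element instance, so assigning el.value directly can go unnoticed by their change tracking; calling the setter found on the prototype chain sidesteps that. The function names are hypothetical, and event dispatch is passed in as a callback so the sketch runs without a DOM.

```typescript
// Walk the prototype chain to find the original `value` setter, skipping
// any accessor defined directly on the element instance.
function findPrototypeValueSetter(el: object): ((v: string) => void) | undefined {
  let proto: object | null = Object.getPrototypeOf(el);
  while (proto !== null) {
    const desc = Object.getOwnPropertyDescriptor(proto, "value");
    if (desc && typeof desc.set === "function") return desc.set;
    proto = Object.getPrototypeOf(proto);
  }
  return undefined;
}

// Set the value through the prototype setter, then announce the change in
// the order the plan lists: input first, change second.
function setValueAndNotify(
  el: object,
  value: string,
  dispatch: (eventType: string) => void,
): boolean {
  const setter = findPrototypeValueSetter(el);
  if (!setter) return false; // not an input-like element
  setter.call(el, value);
  dispatch("input");
  dispatch("change");
  return true;
}
```

On a real page, dispatch would likely be something like (type) => el.dispatchEvent(new Event(type, { bubbles: true })), with an InputEvent substituted for the input case if the site inspects event details.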
When a click causes page navigation:
- Content script on old page DIES (Chrome destroys it)
- Background service worker SURVIVES and monitors chrome.tabs.onUpdated
- New page finishes loading → background re-injects content-executor.ts
- Content script sends EXECUTOR_READY → background sends next step
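Waiting for the new page could be wrapped in a promise along these lines. TabUpdatedEvent is a structural stand-in for chrome.tabs.onUpdated so the sketch can run against a fake; waitForTabComplete and the 15 s default timeout are assumptions, not part of the plan.

```typescript
// Subset of the chrome.tabs.onUpdated listener signature used here.
type UpdatedListener = (
  tabId: number,
  changeInfo: { status?: string; url?: string },
) => void;

interface TabUpdatedEvent {
  addListener(cb: UpdatedListener): void;
  removeListener(cb: UpdatedListener): void;
}

// Resolve once the given tab reports status "complete"; reject on timeout.
function waitForTabComplete(
  onUpdated: TabUpdatedEvent,
  tabId: number,
  timeoutMs = 15000, // ASSUMED default
): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    const listener: UpdatedListener = (updatedTabId, changeInfo) => {
      // Ignore other tabs and intermediate "loading" updates.
      if (updatedTabId !== tabId || changeInfo.status !== "complete") return;
      clearTimeout(timer);
      onUpdated.removeListener(listener);
      resolve();
    };
    const timer = setTimeout(() => {
      onUpdated.removeListener(listener);
      reject(new Error(`Tab ${tabId} did not finish loading within ${timeoutMs} ms`));
    }, timeoutMs);
    onUpdated.addListener(listener);
  });
}
```

After the promise resolves, the background would re-inject the executor (in MV3, typically via chrome.scripting.executeScript) and wait for EXECUTOR_READY before sending the next step.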
When a step fails:
- Content-executor reports failure with error to background
- Background captures screenshot via chrome.tabs.captureVisibleTab()
- Background POSTs screenshot + step context to POST /api/ext-execute/heal
- Backend runs StepHealer LLM diagnosis → returns corrected selectors
- Background sends corrected step to content-executor for retry
- Max 3 retries before marking step as failed
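The retry loop above might look like the following, with the executor and the healer injected as async functions. runWithHealing, Selectors and StepOutcome are hypothetical names, and "max 3 retries" is read here as up to three healed re-attempts after the first failure, which is an interpretation of the plan.

```typescript
interface Selectors {
  target_text?: string;
  cssSelector?: string;
  xpath?: string;
}

interface StepOutcome {
  ok: boolean;
  error?: string;
}

// Run a step; on failure ask the healer for corrected selectors and retry.
async function runWithHealing(
  selectors: Selectors,
  execute: (s: Selectors) => Promise<StepOutcome>,
  heal: (failed: Selectors, error: string) => Promise<Selectors | null>,
  maxRetries = 3,
): Promise<StepOutcome & { attempts: number }> {
  let current = selectors;
  let outcome = await execute(current);
  let retries = 0;
  while (!outcome.ok && retries < maxRetries) {
    const corrected = await heal(current, outcome.error ?? "unknown error");
    if (corrected === null) break; // healer had no suggestion, give up early
    current = corrected;
    retries += 1;
    outcome = await execute(current);
  }
  // attempts counts the first try plus every healed retry.
  return { ...outcome, attempts: retries + 1 };
}
```

In the extension, heal would presumably wrap chrome.tabs.captureVisibleTab() and a fetch to the heal endpoint; the request and response shapes are not specified in this plan, so they are left out of the sketch.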
- Create content-executor.ts with element finding + step execution
- Register in wxt.config.ts
- Message protocol: EXECUTE_STEP / STEP_RESULT / EXECUTOR_READY
- Add ExecutionEngine class to background.ts
- Step queue, state machine, tab tracking
- Navigation detection + content script re-injection
- Screenshot capture for healing
- Create ext_execution_router.py
- Adapt StepHealer to accept pre-captured screenshots
- Wire up in api.py
- Execution progress view
- "Run in Browser" button in dashboard
- Tab/window selector
- Dynamic content waits (MutationObserver)
- iframes
- New tabs/popups during execution
- Service worker sleep prevention (MV3 30s idle timeout)
- extension/src/entrypoints/content.ts: reusable functions extractSemanticInfo(), getXPath(), getEnhancedCSSSelector()
- extension/src/entrypoints/background.ts: Orchestrator to extend
- extension/src/lib/workflow-types.ts: Step type definitions
- workflows/workflow_use/healing/step_healer.py: Self-healing to adapt
- workflows/backend/recorder_router.py: Pattern for new execution router