| 1 | +# e2e-harness — Headless e2e Control Plane |
| 2 | + |
| 3 | +How an agent (or CI) drives a **real** wizard run end-to-end — the **real TUI**, |
| 4 | +no browser, no keystrokes — and captures what it rendered. Both e2e routes share |
| 5 | +one idea: run the real `startTUI` (the real ink render) and drive its store by |
| 6 | +**state manipulation**, then capture the real rendered screen from a PTY. |
| 7 | + |
| 8 | +> If you're an agent that just wants to run and explore the wizard, use the |
| 9 | +> `exploring-the-wizard` skill |
| 10 | +> ([`.claude/skills/exploring-the-wizard/SKILL.md`](../.claude/skills/exploring-the-wizard/SKILL.md)). |
| 11 | +> This doc is the _how it works_ underneath. |
| 12 | + |
| 13 | +## The pieces |
| 14 | + |
| 15 | +This whole harness lives in `e2e-harness/` at the repo root — deliberately OUT of |
| 16 | +`src/` so none of it is part of the wizard's production source (nothing in `src/` |
| 17 | +imports it; the tsdown bundle never includes it). |
| 18 | + |
| 19 | +``` |
| 20 | +e2e-harness/ |
| 21 | + wizard-ci-driver.ts WizardCiDriver — read_state / perform_action over the store |
| 22 | + action-registry.ts screen → the actions legal on it (+ NO_ACTION_SCREENS) |
| 23 | + e2e-profile.ts WizardE2eProfile + decideE2eAction — the scripted walk policy |
| 24 | + profiles.ts per-program profiles + profileFor(programId) |
| 25 | + tui-capture.ts run a command in a PTY (node-pty) + read its real screen (@xterm/headless) |
| 26 | +scripts/ |
| 27 | + tui-host.no-jest.ts the real-TUI host: startTUI + WizardCiDriver, MODE=fixed | serve |
| 28 | + tui-snapshots.no-jest.ts CI route: host(fixed) in a PTY → per-screen real-TUI snapshots |
| 29 | + wizard-ci-mcp.no-jest.ts agent route: MCP server proxying host(serve) |
| 30 | +``` |
| 31 | + |
| 32 | +The driver reads and mutates the **real** `WizardStore` that the TUI renders from: |
| 33 | +the router resolves the active screen from session state, every action goes |
| 34 | +through a store setter, and the render is a pure projection of that state. So |
| 35 | +manipulating the store makes the real TUI react — the driver and the renderer |
| 36 | +share one store and never conflict; you never touch the TUI's input. |
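
The shared-store idea can be sketched in a few lines. This is a minimal illustration, not the harness code: `MiniStore`, `MiniDriver`, `resolveScreen`, the three screen names, and the two action names are all hypothetical stand-ins for the real `WizardStore`, `WizardCiDriver`, and router, which are richer.

```typescript
// Hypothetical stand-ins for illustration only; not the real harness types.
type Session = { credentials: string | null; frameworkConfirmed: boolean };
type Screen = "auth" | "setup" | "run";

class MiniStore {
  private session: Session = { credentials: null, frameworkConfirmed: false };
  private listeners: Array<() => void> = [];

  getSession(): Session {
    return this.session;
  }

  subscribe(fn: () => void): void {
    this.listeners.push(fn);
  }

  // Every mutation goes through a setter, so subscribers always see it.
  setCredentials(token: string): void {
    this.update({ credentials: token });
  }

  confirmFramework(): void {
    this.update({ frameworkConfirmed: true });
  }

  private update(patch: Partial<Session>): void {
    this.session = { ...this.session, ...patch };
    for (const fn of this.listeners) fn();
  }
}

// The router derives the active screen purely from session state.
function resolveScreen(s: Session): Screen {
  if (s.credentials === null) return "auth";
  if (!s.frameworkConfirmed) return "setup";
  return "run";
}

// The "render" is a pure projection of state (a string here, ink in the real TUI).
function render(store: MiniStore): string {
  return `[screen: ${resolveScreen(store.getSession())}]`;
}

// The driver only reads state and calls setters; it never sends input to the renderer.
class MiniDriver {
  private store: MiniStore;

  constructor(store: MiniStore) {
    this.store = store;
  }

  readState(): { screen: Screen } {
    return { screen: resolveScreen(this.store.getSession()) };
  }

  performAction(action: "set_credentials" | "confirm_framework"): void {
    if (action === "set_credentials") this.store.setCredentials("token");
    else this.store.confirmFramework();
  }
}

const store = new MiniStore();
const frames: string[] = [];
store.subscribe(() => frames.push(render(store)));

const driver = new MiniDriver(store);
driver.performAction("set_credentials");
driver.performAction("confirm_framework");
```

After the two actions, `frames` holds one re-render per store change, and `driver.readState()` reports the same screen the last frame shows, because both are derived from the same session state.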
| 37 | + |
| 38 | +## Auth without a browser |
| 39 | + |
| 40 | +The real TUI runs `ci: true`, and auth is satisfied by **state manipulation**: |
| 41 | +`getOrAskForProjectData({ ci: true, apiKey })` resolves the `phx_` personal API key into |
| 42 | +credentials, and `store.setCredentials(...)` sets them — the same bearer path an |
| 43 | +OAuth token takes, so the auth screen advances with no browser and no keystrokes. |
| 44 | +(`run_agent` does the same bootstrap as part of the real integration.) |
| 45 | + |
| 46 | +## The two routes |
| 47 | + |
| 48 | +- **CI snapshots** — `tui-snapshots.no-jest.ts` spawns `tui-host` (`MODE=fixed`) |
| 49 | + in a PTY. The host self-drives the fixed profile (`decideE2eAction`) through the |
| 50 | + real agent run and signals each key moment; the parent writes the real rendered |
| 51 | + screen to `SNAP_OUT/NN-<screen>.txt` (including the run screen's progression). |
| 52 | +- **Agent** — `wizard-ci-mcp.no-jest.ts` is a stdio MCP server that spawns |
| 53 | + `tui-host` (`MODE=serve`) and proxies: `read_state` / `perform_action` / |
| 54 | + `run_agent` forward over a unix socket; `render_screen` returns the real |
| 55 | + captured frame. The agent decides each screen itself. |
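
As an illustration of the `SNAP_OUT/NN-<screen>.txt` naming in the CI route, here is a small sketch. It assumes `NN` is a zero-padded, two-digit counter that increases with each signalled moment, which this doc does not confirm; `snapshotFileName` and the screen names in `moments` are hypothetical.

```typescript
// Assumption: NN is a zero-padded counter, one per captured moment.
function snapshotFileName(index: number, screen: string): string {
  const nn = String(index).padStart(2, "0");
  return `${nn}-${screen}.txt`;
}

// Hypothetical sequence of signalled moments. The run screen appears twice
// because its progression is captured as well.
const moments = ["intro", "auth", "run", "run", "outro"];
const files = moments.map((screen, i) => snapshotFileName(i + 1, screen));
```

Numbering the files keeps repeated screens distinct and makes a directory listing read in capture order.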
| 56 | + |
| 57 | +## Things that bite |
| 58 | + |
| 59 | +1. **Running inside an agent session.** Host env (`CLAUDECODE`, `ANTHROPIC_*`, |
| 60 | + `CLAUDE_CODE_*`) makes the wizard's spawned agent defer auth to the host → |
| 61 | + `apiKeySource: none` → 401. The harness strips these for the child. A plain CI |
| 62 | + shell never has them. |
| 63 | +2. **A project-scoped key needs its project id.** Pass the team's `--project-id` |
| 64 | + (or `POSTHOG_WIZARD_PROJECT_ID`), or the bootstrap fails with a 403 on the project-data fetch. |
| 65 | +3. **Never run on a real fixture.** Always run against a throwaway copy. |
| 66 | +4. **`run_agent` is minutes long and creates real resources** (a dashboard + |
| 67 | + insights) each run; the agent log is one shared file — never run two at once. |
| 68 | +5. **node-pty's spawn-helper.** When the package is extracted without running its |
| 69 | + build script (pnpm skips it), the prebuilt `spawn-helper` loses its execute |
| 70 | + bit and `pty.spawn` fails with `posix_spawnp failed`. `tui-capture.ts` restores |
| 71 | + it best-effort on each spawn. |
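
The environment stripping in item 1 might look roughly like the sketch below. The variable name and prefixes come from that item; the exact matching rule and the helper name `stripAgentHostEnv` are assumptions, and the harness may filter differently.

```typescript
// Names taken from item 1; treating the starred entries as prefixes is an assumption.
const STRIP_EXACT = ["CLAUDECODE"];
const STRIP_PREFIXES = ["ANTHROPIC_", "CLAUDE_CODE_"];

type Env = Record<string, string | undefined>;

// Returns a copy of env without the agent-host variables; the input is not mutated.
function stripAgentHostEnv(env: Env): Env {
  const out: Env = {};
  for (const [key, value] of Object.entries(env)) {
    const hit =
      STRIP_EXACT.includes(key) ||
      STRIP_PREFIXES.some((prefix) => key.startsWith(prefix));
    if (!hit) out[key] = value;
  }
  return out;
}

// Placeholder values for illustration only.
const childEnv = stripAgentHostEnv({
  PATH: "/usr/bin",
  CLAUDECODE: "1",
  ANTHROPIC_API_KEY: "placeholder",
  CLAUDE_CODE_ENTRYPOINT: "cli",
  POSTHOG_WIZARD_PROJECT_ID: "12345",
});
```

In real use the input would be `process.env`, and the result would be passed as the `env` option when spawning the child, so the host session's own environment stays untouched.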
| 72 | + |
| 73 | +## Changing what the run does |
| 74 | + |
| 75 | +Per-program UI choices live in the harness (`profiles.ts`, keyed by program id) — |
| 76 | +not on the program config — so this machinery stays out of production source. Edit |
| 77 | +the program's entry (typed by `WizardE2eProfile`); the host asks |
| 78 | +`decideE2eAction(state, profile)` what to commit on each screen. The (screen → |
| 79 | +decision) trace is snapshot-tested offline in `__tests__/` (`jest -u` to update). |
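
To make the profile-driven walk concrete, here is a sketch of a policy function in the spirit of `decideE2eAction(state, profile)`. Every shape below is hypothetical: the real `WizardE2eProfile` fields, state type, action types, and screen names are defined in `e2e-harness/` and are not shown in this doc.

```typescript
// Hypothetical shapes for illustration; not the real WizardE2eProfile.
type SketchState = { screen: string };
type SketchAction = { type: string; value?: string };
type SketchProfile = { choices: Record<string, SketchAction> };

// Screens with nothing to commit (compare NO_ACTION_SCREENS in action-registry.ts).
const NO_ACTION: ReadonlySet<string> = new Set(["run"]);

function decideSketchAction(
  state: SketchState,
  profile: SketchProfile,
): SketchAction | null {
  if (NO_ACTION.has(state.screen)) return null;
  const choice = profile.choices[state.screen];
  if (choice === undefined) {
    // Fail loudly so a new screen cannot be skipped silently.
    throw new Error(`no scripted choice for screen "${state.screen}"`);
  }
  return choice;
}

const profile: SketchProfile = {
  choices: {
    intro: { type: "continue" },
    setup: { type: "select", value: "nextjs" },
  },
};

// The (screen -> decision) trace is the kind of value a snapshot test can pin down.
const trace = ["intro", "setup", "run"].map((screen) => ({
  screen,
  action: decideSketchAction({ screen }, profile),
}));
```

Because the policy is a pure function of state and profile, the trace can be checked offline without starting the TUI or an agent run.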
| 80 | + |
| 81 | +## Visual-regression snapshots (the workbench flow) |
| 82 | + |
| 83 | +[wizard-workbench](https://github.com/PostHog/wizard-workbench) runs the CI route |
| 84 | +for real-run visual regression: each test definition runs `tui-snapshots`, the |
| 85 | +real-TUI screens are rasterized to a side-by-side baseline-vs-current review, and |
| 86 | +run-to-run differences are surfaced for a human, not asserted away. See |
| 87 | +`services/wizard-ci/` there. |