Commit f173b2a

e2e: browse/promote authoring loop and cross-OS packaged-desktop targets
Two additions to the e2e setup sharing one idea: the interfaces you develop
the product with are the same ones the generated tests drive, and every run
is watchable.

Authoring loop (src/journey/, scripts/cli.ts):
- `bun run cli browse <target> <step>` drives the live web UI one step at a
  time (real logged-in Chromium), each step replaying the whole flow and
  printing the page's controls plus a screenshot. Steps span the browser, a
  terminal (`run`), and HTTP (`request`), interleaved in one session.
- `bun run cli promote <target> "<name>"` turns the recorded journey into a
  committed scenario() and runs it.

One Step DSL is the source of truth for both live execution and codegen, so
the generated test drives the same surfaces the exploration drove and cannot
quietly diverge.

Cross-OS packaged desktop (setup/desktop-*, desktop-vm/, src/vm/desktop.ts):
- desktop-macos / desktop-linux / desktop-windows run the real
  electron-builder bundle inside a guest, drive it over a CDP tunnel, and
  film the console into runs/<target>/ alongside test.ts and step
  screenshots. One shared scenario and driver; only launch and capture
  differ per OS:
    macOS:   autologin Aqua session, launchctl asuser, screencapture
    linux:   Xvfb + openbox, xdotool window resize, ffmpeg x11grab
    windows: dockur (QEMU) interactive session, QEMU screendump
- macOS and Linux auto-provision a tart guest and build the bundle locally
  (the executor binary cross-compiles via BUN_TARGET); Windows attaches to a
  dockur host configured through E2E_DESKTOP_WIN_* env.

The desktop targets are not in the default test chain and skip honestly
without a guest.

Also:
- Force tart SSH to password-only (PubkeyAuthentication=no,
  IdentitiesOnly=yes) so a loaded SSH agent does not exhaust the guest's
  MaxAuthTries, an intermittent failure the existing cli-{os} lanes also hit.
- build-sidecar keys the executable-bit chmod on the build target, not the
  host, so a windows-target cross-build no longer fails looking for a unix
  executor binary.
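
The password-only SSH change listed under "Also" can be pictured roughly as below. This is a minimal sketch, not the code in the commit: `passwordOnlySshArgs` and the sample address are hypothetical, and only the two `-o` options are taken from the message above. How the password itself is supplied is not shown.

```typescript
// Hypothetical helper (not from the commit): build an ssh argv that never
// offers public keys, so a loaded SSH agent cannot use up the guest's
// MaxAuthTries before password auth is attempted.
const passwordOnlySshArgs = (user: string, host: string): string[] => [
  "-o", "PubkeyAuthentication=no", // named in the commit message
  "-o", "IdentitiesOnly=yes",      // named in the commit message
  `${user}@${host}`,
];

// Example with an assumed guest address; the base images use the `admin` user.
const sshArgs = passwordOnlySshArgs("admin", "192.168.64.2");
```
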
1 parent eb9ed89 commit f173b2a

16 files changed

Lines changed: 1702 additions & 3 deletions

apps/desktop/scripts/build-sidecar.ts

Lines changed: 5 additions & 1 deletion
@@ -45,7 +45,11 @@ await rm(EXECUTOR_OUT_DIR, { recursive: true, force: true });
 await mkdir(EXECUTOR_OUT_DIR, { recursive: true });
 await cp(sourceBinDir, EXECUTOR_OUT_DIR, { recursive: true });
 
-if (process.platform !== "win32") {
+// Restore the unix executable bit — keyed on the TARGET, not the host. A
+// windows-target cross-build (BUN_TARGET=bun-windows-x64 on macOS/linux) stages
+// `executor.exe`, which needs no bit; chmod'ing a non-existent `executor` there
+// would ENOENT.
+if (!targetPackage.includes("windows")) {
   await chmod(join(EXECUTOR_OUT_DIR, "executor"), 0o755);
 }
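
To make the target-versus-host distinction in this hunk concrete, here is a small sketch of the decision as pure functions. The helper names are hypothetical, and it assumes (as the diff's comment suggests) that `targetPackage` contains the substring `windows` for Windows targets; the real value format is not shown in the diff.

```typescript
// Hypothetical helpers (not from the commit): decide from the build TARGET,
// never from the host OS, which file is staged and whether it needs chmod.
const stagedExecutorName = (targetPackage: string): string =>
  targetPackage.includes("windows") ? "executor.exe" : "executor";

const needsExecutableBit = (targetPackage: string): boolean =>
  !targetPackage.includes("windows");

// A windows-target cross-build on a macOS host: the old host-keyed check
// (process.platform !== "win32") would have tried to chmod `executor` here.
const crossBuild = {
  file: stagedExecutorName("bun-windows-x64"),
  chmod: needsExecutableBit("bun-windows-x64"),
};
```

Keying on the target keeps the check correct for any host, since it tests the thing that determines which file was staged.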

e2e/AGENTS.md

Lines changed: 74 additions & 0 deletions
@@ -130,6 +130,80 @@ When handing results to the user, follow the evidence contract in the root
 [AGENTS.md](../AGENTS.md) (direct run links + a live instance + what to try);
 [RUNNING.md](../RUNNING.md) has the current sharing/demo mechanics.
 
+## Authoring from a live browser (`browse` → `promote`)
+
+You don't have to hand-write a browser scenario. Drive a running instance's web
+UI one step at a time, then turn the recorded journey into a committed scenario.
+The generated test drives the same Browser surface the exploration drove, so it
+is the real test, not a transcript of one — develop the flow, then crystallize
+it.
+
+```sh
+cd e2e
+bun run cli up cloud                            # a live instance to develop against
+bun run cli browse cloud goto /                 # each step REPLAYS the whole flow from a
+bun run cli browse cloud click link Policies    # clean browser and prints the page's controls
+bun run cli browse cloud at-url /policies       # (role · name) + a screenshot, so the next
+bun run cli browse cloud see "No policies yet"  # step is written against what's actually there
+bun run cli promote cloud "Policies · a fresh workspace has none"
+```
+
+Each `browse` replays every step so far, so what you are building is, at every
+moment, exactly what `promote` emits — a step that doesn't reproduce fails here,
+not in CI. Steps: `goto <path>`, `click <role> <name>`, `click-text <text>`,
+`fill <field> <value>`, `press <key>`, and the assertions `see <text>` /
+`at-url <substring>`. `--label "…"` names a step (it becomes the `step(...)`
+group); `browse <target> show | undo | reset` manages the journey.
+
+`promote` writes `<target>/<slug>.gen.test.ts` and runs it against the live
+instance, producing the usual run artifacts (session.mp4, step screenshots,
+trace). A journey with no assertion is refused — a scenario must prove
+something. From then on the file is an ordinary scenario: edit it, add API/MCP
+checks, drop the `.gen` once it's yours. The journey itself lives in
+`.dev/<target>.journey.json` (gitignored), not the repo.
+
+## Desktop targets (the app on real OSes, filmed)
+
+The packaged desktop app runs as its own targets, each landing in its own
+`runs/<target>/` bucket with a video. One shared scenario (`desktop-vm/`) and the
+shared driver (`src/vm/desktop.ts`) + setup plumbing (`setup/desktop-vm.ts`); one
+project + globalsetup per guest OS.
+
+- **`desktop-packaged`** — the real electron-builder bundle on THIS machine's
+  display (the supervised-daemon attach path). Needs a logged-in GUI session.
+- **`desktop-macos` / `desktop-linux`** — the same bundle inside a guest VM,
+  driven over CDP from the host and filmed. The globalsetup boots the guest
+  (tart), builds + pushes the bundle, brings the app up with
+  `--remote-debugging-port`, forwards it, and the scenario connects + drives +
+  records. Provisioned automatically — or attach to a running guest with
+  `E2E_DESKTOP_VM_IP=<ip>`:
+
+  ```sh
+  vitest run --project desktop-macos # or desktop-linux
+  ```
+
+  The guests run tart `--no-graphics` (no host window, never steals focus) but
+  still have a usable display:
+
+  - **macOS**: the base image's autologin reaches a real Aqua session
+    (WindowServer/Dock/Finder). Launch the app INTO it with `sudo launchctl asuser
+    <uid> …` (a plain SSH spawn lands in a non-GUI session); the unsigned arm64
+    bundle is ad-hoc `codesign`'d in the guest; `screencapture` films it.
+  - **linux**: no window server, so the app renders into an `Xvfb` display with a
+    minimal WM (`openbox` — without it the electron window never maps); the window
+    maps tiny (10x10) so the globalsetup `xdotool`-resizes it to fill, and ffmpeg
+    `x11grab` films it. `--no-sandbox` (the chrome-sandbox needs setuid root).
+
+Base images (`admin`/`admin`): `executor-macos-base` (cirruslabs sequoia, autologin)
+and `executor-linux-base` (cirruslabs ubuntu + Xvfb/ffmpeg/openbox/xdotool +
+electron runtime libs). The bundle's `executor` binary is cross-compiled for the
+guest (`BUN_TARGET`), and electron-builder's `dir` target assembles the unpacked
+app on macOS — so both bundles build on this Mac.
+
+Note: `desktop-packaged`'s `guiAvailable()` probe (`launchctl managername`) reads
+"Background" over SSH even when Aqua is up, so it's host-only; the VM targets gate
+on a CDP page target instead.
+
 ## Discovering endpoints
 
 - The full OpenAPI spec: `curl http://127.0.0.1:<cloud port>/api/openapi.json`
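
The `browse`/`promote` section above rests on one step list feeding both live replay and code generation, with `promote` refusing a journey that asserts nothing. The sketch below illustrates that idea only. The `Step` shape, the `Driver` interface, and the emitted `browser.*` call strings are all invented for illustration; the real DSL in `src/journey/` and the real generated `scenario()` body may look quite different.

```typescript
// Hypothetical Step shape covering a subset of the documented steps.
type Step =
  | { kind: "goto"; path: string }
  | { kind: "click"; role: string; name: string }
  | { kind: "see"; text: string }
  | { kind: "at-url"; substring: string };

const isAssertion = (step: Step): boolean =>
  step.kind === "see" || step.kind === "at-url";

// Consumer 1: live execution. Replays the WHOLE journey, in order.
interface Driver {
  goto(path: string): void;
  click(role: string, name: string): void;
  hasText(text: string): boolean;
  url(): string;
}

const replay = (journey: readonly Step[], driver: Driver): void => {
  for (const step of journey) {
    switch (step.kind) {
      case "goto":
        driver.goto(step.path);
        break;
      case "click":
        driver.click(step.role, step.name);
        break;
      case "see":
        if (!driver.hasText(step.text)) throw new Error(`text not found: ${step.text}`);
        break;
      case "at-url":
        if (!driver.url().includes(step.substring)) {
          throw new Error(`url does not contain: ${step.substring}`);
        }
        break;
    }
  }
};

// Consumer 2: codegen from the SAME steps (emitted call names are made up).
const emit = (step: Step): string => {
  switch (step.kind) {
    case "goto":
      return `browser.goto(${JSON.stringify(step.path)})`;
    case "click":
      return `browser.click(${JSON.stringify(step.role)}, ${JSON.stringify(step.name)})`;
    case "see":
      return `browser.see(${JSON.stringify(step.text)})`;
    case "at-url":
      return `browser.atUrl(${JSON.stringify(step.substring)})`;
  }
};

// A journey with no assertion is refused: a scenario must prove something.
const promote = (journey: readonly Step[]): string[] => {
  if (!journey.some(isAssertion)) {
    throw new Error("refusing to promote: journey has no assertion");
  }
  return journey.map(emit);
};
```

Because both consumers switch over the same union, adding a step kind forces both the replayer and the generator to handle it, which is one way to keep them from drifting apart.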
Lines changed: 75 additions & 0 deletions
@@ -0,0 +1,75 @@
+// The PACKAGED desktop app, on camera, inside a GUI guest — driven over CDP from
+// the host. ONE scenario shared by every desktop-<os> project (desktop-macos,
+// desktop-linux): the same bundle and CDP driver, proving it renders on a guest
+// OS and filming the actual console. The desktop-<os> globalsetup boots the
+// guest, launches the app, forwards its --remote-debugging-port (E2E_DESKTOP_CDP_PORT)
+// and publishes the guest IP; this scenario connects, drives, and records. The
+// run lands in runs/<target>/ (its own per-OS bucket). Without a guest it skips
+// honestly, like desktop-packaged without a display.
+import { writeFileSync } from "node:fs";
+import { join } from "node:path";
+
+import { expect, it } from "@effect/vitest";
+import { Effect } from "effect";
+
+import { scenario } from "../src/scenario";
+import { RunDir } from "../src/services";
+import { CdpPage, pageWsUrl, recordGuestScreen } from "../src/vm/desktop";
+
+const NAME = "Desktop (packaged, in a VM) · the bundle renders its console";
+const cdpPort = process.env.E2E_DESKTOP_CDP_PORT;
+const guestIp = process.env.E2E_DESKTOP_VM_IP;
+const recSeconds = Number(process.env.E2E_DESKTOP_REC_SECONDS ?? "12");
+const os: "macos" | "linux" | "windows" =
+  process.env.E2E_TARGET === "desktop-windows"
+    ? "windows"
+    : process.env.E2E_TARGET === "desktop-linux"
+      ? "linux"
+      : "macos";
+
+const run = async (runDir: string) => {
+  const cdp = await CdpPage.connect(await pageWsUrl(Number(cdpPort)));
+  try {
+    await cdp.command("Runtime.enable");
+    await cdp.command("Page.enable");
+
+    // Film the console while we drive it (OS-aware capture lands a playable mp4).
+    const recording = recordGuestScreen(
+      guestIp as string,
+      recSeconds,
+      join(runDir, "session.mp4"),
+      os,
+    );
+
+    // Reaching the nav proves the packaged bundle booted and connected to its
+    // daemon on this OS.
+    await cdp.waitForText("Integrations", 60_000).catch(() => cdp.waitForText("Settings", 60_000));
+    writeFileSync(join(runDir, "01-console-rendered.png"), await cdp.screenshot());
+
+    const body = await cdp.command<{ result?: { value?: string } }>("Runtime.evaluate", {
+      expression: "document.body.innerText",
+      returnByValue: true,
+    });
+    expect(body.result?.value ?? "", "the packaged console rendered its nav").toContain(
+      "Integrations",
+    );
+
+    await recording;
+  } finally {
+    cdp.close();
+  }
+};
+
+if (!cdpPort || !guestIp) {
+  it.skip(`${NAME} (needs a desktop guest — set E2E_DESKTOP_VM_IP or run the desktop-<os> project)`, () => {});
+} else {
+  // Literal name (not NAME) so the run's test.ts review artifact captures it.
+  scenario(
+    "Desktop (packaged, in a VM) · the bundle renders its console",
+    { timeout: 180_000 },
+    Effect.gen(function* () {
+      const runDir = yield* RunDir;
+      yield* Effect.promise(() => run(runDir));
+    }),
+  );
+}
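
The scenario above derives the guest OS from `E2E_TARGET` and skips when no guest is published. A minimal sketch of those two decisions as pure functions follows, paired with the per-OS launch and capture summary from the commit message. The names `guestOsFor`, `shouldSkip`, and `perOs` are hypothetical and do not appear in the commit; the environment is passed in as a plain record so the logic can be exercised without a guest.

```typescript
type GuestOs = "macos" | "linux" | "windows";

// Same mapping as the nested ternary in the scenario: anything that is not
// desktop-windows or desktop-linux falls back to macOS.
const guestOsFor = (target: string | undefined): GuestOs =>
  target === "desktop-windows" ? "windows" : target === "desktop-linux" ? "linux" : "macos";

// The scenario runs only when both the CDP port and the guest IP are set.
const shouldSkip = (env: Record<string, string | undefined>): boolean =>
  !env.E2E_DESKTOP_CDP_PORT || !env.E2E_DESKTOP_VM_IP;

// Per-OS differences, copied from the commit message's summary table.
const perOs: Record<GuestOs, { session: string; capture: string }> = {
  macos: { session: "autologin Aqua session, launchctl asuser", capture: "screencapture" },
  linux: { session: "Xvfb + openbox, xdotool window resize", capture: "ffmpeg x11grab" },
  windows: { session: "dockur (QEMU) interactive session", capture: "QEMU screendump" },
};
```
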
