
Commit 4710f49

Shyam Sridhar and claude committed
feat(desktop): UltraQA Cycles 1-3 — Playwright E2E UAT harness + design-creation flow
Add end-to-end UAT testing infrastructure for the Electron desktop app and land the renderer + main-process changes needed to drive the full design-creation flow under Playwright.

UAT harness:
- Playwright 1.52.0 + electron launch fixture (apps/desktop/e2e/**)
- 13 spec files covering smoke, onboarding, sidebar, dialogs, hub, workspace, model-switcher, settings, comments, files, plus the new create-design flow
- Per-test temp userData dir for state isolation (ELECTRON_USER_DATA_DIR hook in main process)
- Cross-platform e2e-run.cjs wrapper for the PW_DISABLE_TS_ESM + Node 25 loader workaround
- 24 UAT screenshots captured at apps/desktop/test-results/screenshots/

Renderer changes:
- Renderer-wide data-testid pass (~20 components) with exported TEST_IDS constants for spec reuse
- Zustand store: new configHydrated boolean signaling first IPC settle
- createNewDesign(workspacePath?, name?) now honors caller-supplied name
- Sidebar gains a per-design list with data-testid for each design.id
- main.tsx exposes the test store via preload isE2E contextBridge

Main-process additions (pre-existing executor drift, kept by user decision):
- default-design-system + design-system-resolver modules
- migration/backfill + schema-version
- stores/comments-store + stores/diagnostics-store
- packages/core agent, context-prune, skills/loader, tools/done refinements

E2E final result: 20 passed / 1 failed (pre-existing hub launch-race flake) / 12 skipped of 32 tests.
create-design.spec.ts: 16/16 steps complete with all 4 screenshots produced.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent fe72fcd commit 4710f49

83 files changed

Lines changed: 4547 additions & 61 deletions


.gitignore

Lines changed: 12 additions & 0 deletions
@@ -92,6 +92,7 @@ website/public/demos/_speedup/
 .omc/logs/
 .omc/notepad.md
 .omc/project-memory.json
+.omc/specs/
 
 # OMX runtime state (per-session, not source). Plans/specs/handoff stay tracked.
 .omx/state/
@@ -103,3 +104,14 @@ website/public/demos/_speedup/
 # Per-session memory buffer (not source)
 .remember/
 .omc/prd.json
+
+# Local agent / tooling scratch directories — not for public repo.
+.gstack/
+.playwright-cli/
+graphify-out/
+output/
+.graphify_python
+
+# Stray nested dirs from misrun tools — defensive.
+apps/desktop/.omc/
+apps/desktop/apps/

AGENTS.md

Lines changed: 1 addition & 0 deletions
@@ -57,6 +57,7 @@ The v0.2 tool surface is pi's seven built-ins plus Open CoDesign design tools:
 - `skill(name)` lazy-loads skill text from a manifest.
 - `preview(path)` renders artifacts and returns console errors, asset errors, DOM outline, metrics, and screenshots for vision models.
 - `gen_image(prompt, path)` writes generated images to disk when capability and provider config allow it.
+- `read_brand(source)` ingests brand identity from a live URL, Git repo, or screenshot image and writes/updates the workspace DESIGN.md with extracted color, font, and spacing tokens.
 - `tweaks(blocks)` declares editable controls across files.
 - `todos(items)` shows task state for complex turns.
 - `done(path)` ends a turn after preview self-check.

CLAUDE.md

Lines changed: 1 addition & 0 deletions
@@ -57,6 +57,7 @@ The v0.2 tool surface is pi's seven built-ins plus Open CoDesign design tools:
 - `skill(name)` lazy-loads skill text from a manifest.
 - `preview(path)` renders artifacts and returns console errors, asset errors, DOM outline, metrics, and screenshots for vision models.
 - `gen_image(prompt, path)` writes generated images to disk when capability and provider config allow it.
+- `read_brand(source)` ingests brand identity from a live URL, Git repo, or screenshot image and writes/updates the workspace DESIGN.md with extracted color, font, and spacing tokens.
 - `tweaks(blocks)` declares editable controls across files.
 - `todos(items)` shows task state for complex turns.
 - `done(path)` ends a turn after preview self-check.

apps/desktop/.gitignore

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+playwright-report/
+test-results/
+e2e/.auth/

apps/desktop/e2e/README.md

Lines changed: 121 additions & 0 deletions
@@ -0,0 +1,121 @@
+# E2E Test Suite — Open CoDesign Desktop
+
+Playwright + Electron end-to-end tests for the `@atv-design/desktop` app.
+
+## Running Tests
+
+```bash
+# From the repo root — builds the app first, then runs all E2E specs
+pnpm test:e2e
+
+# From the desktop package directly
+pnpm --filter @atv-design/desktop test:e2e
+
+# Skip the build (use if out/main/index.js already exists)
+pnpm --filter @atv-design/desktop test:e2e:norebuild
+```
+
+The `test:e2e` script calls `pnpm build && node scripts/e2e-run.cjs`.
+
+`scripts/e2e-run.cjs` is a thin cross-platform wrapper that sets
+`PW_DISABLE_TS_ESM=1` **before** spawning the Playwright process via
+`spawnSync`. This is necessary because:
+
+> **Why `PW_DISABLE_TS_ESM=1`?**
+> Playwright 1.52 + Node 25's ESM loader conflict causes the Playwright
+> process to hang silently unless `PW_DISABLE_TS_ESM=1` is in the environment
+> at startup. Setting it inside `playwright.config.ts` is too late — the
+> loader runs before the config is evaluated. The `e2e-run.cjs` wrapper
+> injects it at the right moment and works on both Windows cmd.exe and bash
+> (the Unix `VAR=value cmd` syntax does not work on Windows).
+
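Reviewer sketch: the wrapper described in the README section above could look roughly like the following TypeScript. This is not the content of `scripts/e2e-run.cjs`, which this part of the diff does not show. `buildPlaywrightEnv` and `runWithPlaywrightEnv` are hypothetical names, and the command the real wrapper spawns is an assumption. The sketch only illustrates placing the variable in the child's environment before the process starts.

```typescript
import { spawnSync } from 'node:child_process';

type Env = Record<string, string | undefined>;

/** Returns a copy of `base` with PW_DISABLE_TS_ESM forced to '1'. */
function buildPlaywrightEnv(base: Env): Env {
  return { ...base, PW_DISABLE_TS_ESM: '1' };
}

/**
 * Spawns `command` with the patched environment and returns its exit code.
 * Hypothetical helper: the real wrapper's command line is not shown here.
 */
function runWithPlaywrightEnv(command: string, args: string[]): number {
  const result = spawnSync(command, args, {
    stdio: 'inherit',
    env: buildPlaywrightEnv(process.env),
    // On Windows, launchers such as npx are .cmd shims and need a shell.
    shell: process.platform === 'win32',
  });
  // status is null when the child was killed by a signal; treat as failure.
  return result.status ?? 1;
}
```

A caller might finish with something like `process.exit(runWithPlaywrightEnv('npx', ['playwright', 'test']))`. Because the variable travels in the `env` option rather than in shell syntax, the same call works under cmd.exe and bash.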
+## Viewing the HTML Report
+
+After a run, open the interactive HTML report:
+
+```bash
+pnpm --filter @atv-design/desktop exec playwright show-report
+```
+
+Or directly:
+```bash
+npx playwright show-report apps/desktop/playwright-report
+```
+
+## Screenshots
+
+Every spec takes at least one screenshot during the test. Screenshots land in:
+
+```
+apps/desktop/test-results/screenshots/<spec-name>-<scenario>.png
+```
+
+They are committed when meaningful (e.g. smoke-launch.png) and gitignored for
+CI-generated runs (see `.gitignore`).
+
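Reviewer sketch: the `<spec-name>-<scenario>.png` pattern above could be centralized in one helper so that specs do not hand-assemble file names. `screenshotPath` is a hypothetical function, not something this commit adds; the specs shown in this diff build their paths inline with `path.join`.

```typescript
import * as path from 'node:path';

/** Lowercases a label and collapses anything non-alphanumeric into single hyphens. */
function slug(label: string): string {
  return label
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')
    .replace(/^-+|-+$/g, '');
}

/** Builds `<dir>/<spec-name>-<scenario>.png` (hypothetical helper). */
function screenshotPath(dir: string, specName: string, scenario: string): string {
  return path.join(dir, `${slug(specName)}-${slug(scenario)}.png`);
}
```

With this in place, a spec could pass the result straight to Playwright's `page.screenshot({ path })`.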
+## State Isolation
+
+Each test gets a **fresh temp directory** as its Electron `userData` path
+(`ELECTRON_USER_DATA_DIR`). The directory is deleted after the test completes.
+This means:
+
+- No test can read another test's config, sessions, or logs.
+- No test modifies the developer's real `~/.config/atv-design/` directory.
+
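Reviewer sketch: the isolation scheme above has two halves, a test-side helper that creates and removes the temp directory, and a main-process hook that honors `ELECTRON_USER_DATA_DIR`. The code below is a minimal illustration under assumptions, not the fixture or hook from this commit. All three function names and the `codesign-e2e-` prefix are hypothetical.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

/** Test side: create a unique, empty directory for one test's userData. */
function createTempUserDataDir(prefix = 'codesign-e2e-'): string {
  return fs.mkdtempSync(path.join(os.tmpdir(), prefix));
}

/** Test side: remove the directory after the test; safe if it is already gone. */
function removeTempUserDataDir(dir: string): void {
  fs.rmSync(dir, { recursive: true, force: true });
}

/**
 * Main-process side: choose the override when it is set and non-blank,
 * otherwise fall back to the default path.
 */
function resolveUserDataDir(
  env: Record<string, string | undefined>,
  defaultDir: string,
): string {
  const override = env.ELECTRON_USER_DATA_DIR;
  return override !== undefined && override.trim() !== '' ? override : defaultDir;
}
```

In an Electron main process, the resolved value would presumably be applied with `app.setPath('userData', ...)` early in startup; how the real hook is wired is not visible in this part of the diff.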
+## Fixtures
+
+### `test` (base fixture — `e2e/fixtures/electron-app.ts`)
+
+Launches Electron with an empty temp dir. The app boots into the
+**unauthenticated / first-launch state** (login cards visible).
+
+Used by: `smoke.spec.ts`, `onboarding.spec.ts`
+
+### `testOnboarded` (onboarded fixture)
+
+Before Electron launches, calls `seedOnboardedPreferences(tempDir)` which
+writes:
+
+- `storage-settings.json` — redirects `configDir` into the temp dir
+- `config.toml` — v3 config with the built-in `ollama` keyless provider
+
+The keyless provider satisfies `isKeylessProviderAllowed()` so
+`getOnboardingState()` returns `{ hasKey: true }`. The renderer skips the
+login-card gate and renders the full app shell.
+
+**No real network calls are made** — the test suite never sends a prompt.
+
+Used by: `main-window.spec.ts`, `sidebar.spec.ts`, `dialogs.spec.ts`,
+`hub.spec.ts`, `workspace.spec.ts`, `model-switcher.spec.ts`,
+`settings.spec.ts`
+
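Reviewer sketch: the seeding step described above amounts to writing two files before launch. The version below shows only that shape. The function name matches the README, but the body is a guess: the JSON key layout and every TOML key are hypothetical placeholders, since the real schemas are not visible in this part of the diff.

```typescript
import * as fs from 'node:fs';
import * as path from 'node:path';

/**
 * Writes the two files the README lists into `tempDir`.
 * File contents are HYPOTHETICAL; only the file names come from the README.
 */
function seedOnboardedPreferences(tempDir: string): void {
  fs.mkdirSync(tempDir, { recursive: true });

  // Assumed shape: a single `configDir` field pointing back at the temp dir.
  const storageSettings = { configDir: tempDir };
  fs.writeFileSync(
    path.join(tempDir, 'storage-settings.json'),
    JSON.stringify(storageSettings, null, 2),
  );

  // Assumed keys: the actual v3 config schema is not shown in this commit page.
  const configToml = ['version = 3', '', '[provider]', 'id = "ollama"', ''].join('\n');
  fs.writeFileSync(path.join(tempDir, 'config.toml'), configToml);
}
```

Seeding on disk before launch, rather than driving the onboarding UI, lets the app read an already-onboarded state on first boot.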
+## Spec Files
+
+| File | Fixture | Description |
+|------|---------|-------------|
+| `smoke.spec.ts` | `test` | App launches, no console errors, version exposed |
+| `onboarding.spec.ts` | `test` | First-launch login cards visible |
+| `main-window.spec.ts` | `testOnboarded` | TopBar, hub view default |
+| `sidebar.spec.ts` | `testOnboarded` | Collapse, resize, new-design button |
+| `dialogs.spec.ts` | `testOnboarded` | NewDesignDialog, Settings panel open/close |
+| `hub.spec.ts` | `testOnboarded` | HubView renders, design grid or empty state |
+| `workspace.spec.ts` | `testOnboarded` | PreviewPane in workspace view |
+| `model-switcher.spec.ts` | `testOnboarded` | ModelSwitcher renders, opens list |
+| `settings.spec.ts` | `testOnboarded` | Settings panel tabs, provider section, language toggle |
+| `comments.spec.ts` | `testOnboarded` | **All skipped** — requires design session |
+| `files.spec.ts` | `testOnboarded` | **All skipped** — requires workspace + design |
+
+## Skip Policy
+
+Tests that cannot find a selector without `data-testid` attributes are
+marked `.skip()` with an explanatory comment rather than adding testids to
+renderer source. Testids are planned for Cycle 3.
+
+## Cycle 3 TODO
+
+- Add `data-testid` / `aria-label` to: sidebar collapse button, model
+  switcher, hub tab bar, language toggle, new-design button
+- Add DB-seeding fixture to bootstrap a real design session for comments,
+  files, and workspace tests
+- Extend `workspace.spec.ts` once `snapshots.createDesign` is exposed on
+  the preload bridge

apps/desktop/e2e/comments.spec.ts

Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
+/**
+ * UAT: CommentsPanel.
+ *
+ * Comments require an active design session with at least one snapshot.
+ * All tests in this file are skipped in Cycle 2 — they are placeholders
+ * for Cycle 3 once we have a workspace bootstrapping helper that can
+ * create a real design session end-to-end.
+ */
+
+import { testOnboarded as test } from './fixtures/electron-app';
+
+test.skip('comments panel renders', async () => {
+  // Skipped: CommentsPanel (CommentsPanel.tsx) is only visible in workspace
+  // view when a design is selected AND the comments IPC is registered.
+  // Bootstrapping a full design session requires either:
+  // a) A real generation run (touches network — not suitable for UAT)
+  // b) Direct DB seeding of the snapshots SQLite DB (out of scope Cycle 2)
+  // Will revisit in Cycle 3 with a DB-seeding fixture.
+});
+
+test.skip('comments panel shows empty state with no comments', async () => {
+  // Same constraint as above.
+});
Lines changed: 119 additions & 0 deletions
@@ -0,0 +1,119 @@
+/**
+ * UAT: Create new design end-to-end flow.
+ *
+ * Uses `testOnboarded` fixture (config pre-seeded with a valid provider).
+ * Tests the full create flow: button → dialog → fill name → submit → workspace.
+ *
+ * If any step fails, we capture a screenshot and report verbatim — we do NOT
+ * mask failures with fallbacks. The brief says "surface failures, don't paper
+ * over them."
+ */
+
+import * as fs from 'node:fs';
+import * as path from 'node:path';
+import { expect, testOnboarded as test } from './fixtures/electron-app';
+
+const SCREENSHOT_DIR = path.join(__dirname, '../test-results/screenshots');
+
+function ensureDir(dir: string): void {
+  fs.mkdirSync(dir, { recursive: true });
+}
+
+test('create new design flow', async ({ firstWindow }) => {
+  ensureDir(SCREENSHOT_DIR);
+
+  // ── Step 1: wait for window.codesign bridge ────────────────────────────────
+  await firstWindow.waitForFunction(
+    () => typeof (window as Window & { codesign?: unknown }).codesign !== 'undefined',
+    { timeout: 10_000 },
+  );
+
+  // ── Step 2: wait for store to be exposed ──────────────────────────────────
+  await firstWindow.waitForFunction(
+    () =>
+      (window as Window & { __codesign_test_store__?: unknown }).__codesign_test_store__ != null,
+    { timeout: 10_000 },
+  );
+
+  // ── Step 3: wait for configHydrated ───────────────────────────────────────
+  await firstWindow.waitForFunction(
+    () => {
+      const w = window as Window & {
+        __codesign_test_store__?: { getState: () => { configHydrated?: boolean } };
+      };
+      return w.__codesign_test_store__?.getState().configHydrated === true;
+    },
+    { timeout: 10_000 },
+  );
+
+  // ── Step 4: click the new design button ───────────────────────────────────
+  const newDesignBtn = firstWindow.getByTestId('sidebar-button-new-design');
+  await expect(newDesignBtn).toBeVisible({ timeout: 8_000 });
+  await newDesignBtn.click();
+
+  // ── Step 5: wait for dialog ───────────────────────────────────────────────
+  const dialog = firstWindow.getByTestId('new-design-dialog');
+  await expect(dialog).toBeVisible({ timeout: 8_000 });
+
+  // ── Step 6: screenshot dialog open ───────────────────────────────────────
+  await firstWindow.screenshot({
+    path: path.join(SCREENSHOT_DIR, 'create-design-dialog-open.png'),
+  });
+
+  // ── Step 7: fill name ─────────────────────────────────────────────────────
+  const nameInput = firstWindow.getByTestId('new-design-dialog-input-name');
+  await nameInput.fill('E2E Smoke Design');
+
+  // ── Step 8: screenshot filled ─────────────────────────────────────────────
+  await firstWindow.screenshot({
+    path: path.join(SCREENSHOT_DIR, 'create-design-dialog-filled.png'),
+  });
+
+  // ── Step 9: click submit ──────────────────────────────────────────────────
+  const submitBtn = firstWindow.getByTestId('new-design-dialog-button-submit');
+  await submitBtn.click();
+
+  // ── Step 10: wait for dialog to be hidden ────────────────────────────────
+  await expect(dialog).toBeHidden({ timeout: 10_000 });
+
+  // ── Step 11: wait for workspace view ──────────────────────────────────────
+  // workspace-view div is hidden={view !== 'workspace'}, so wait for it to
+  // become visible (the hidden attr gets removed when view switches).
+  const workspaceView = firstWindow.getByTestId('workspace-view');
+  await expect(workspaceView).toBeVisible({ timeout: 15_000 });
+
+  // ── Step 12: assert currentDesignId ───────────────────────────────────────
+  const currentDesignId = await firstWindow.evaluate(() => {
+    const w = window as Window & {
+      __codesign_test_store__?: { getState: () => { currentDesignId?: string } };
+    };
+    return w.__codesign_test_store__?.getState().currentDesignId ?? null;
+  });
+  expect(typeof currentDesignId).toBe('string');
+  expect(currentDesignId).not.toBe('');
+  expect(currentDesignId).not.toBeNull();
+
+  // ── Step 13: assert design name in store ──────────────────────────────────
+  const foundInStore = await firstWindow.evaluate(() => {
+    const w = window as Window & {
+      __codesign_test_store__?: { getState: () => { designs?: Array<{ name: string }> } };
+    };
+    const designs = w.__codesign_test_store__?.getState().designs ?? [];
+    return designs.some((d) => d.name === 'E2E Smoke Design');
+  });
+  expect(foundInStore).toBe(true);
+
+  // ── Step 14: screenshot workspace ─────────────────────────────────────────
+  await firstWindow.screenshot({
+    path: path.join(SCREENSHOT_DIR, 'create-design-workspace.png'),
+  });
+
+  // ── Step 15: assert design visible in sidebar list ────────────────────────
+  const sidebarItem = firstWindow.getByTestId(`design-list-item-${currentDesignId}`);
+  await expect(sidebarItem).toBeVisible({ timeout: 5_000 });
+
+  // ── Step 16: screenshot sidebar ───────────────────────────────────────────
+  await firstWindow.screenshot({
+    path: path.join(SCREENSHOT_DIR, 'create-design-in-sidebar.png'),
+  });
+});
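Reviewer sketch: Steps 12 and 13 of the spec above each repeat the same `window` cast to reach `__codesign_test_store__`. One possible way to factor that out is a small reader that takes the field name as an argument. `PageLike` and `getStoreField` are hypothetical and not part of this commit. `PageLike` is a structural stand-in so the sketch runs without Playwright; whether Playwright's own `Page` type is accepted without a cast has not been verified here.

```typescript
/** Minimal stand-in for the one page method the helper needs (assumption). */
interface PageLike {
  evaluate<R, A>(pageFunction: (arg: A) => R, arg: A): Promise<R>;
}

type TestStore = { getState: () => Record<string, unknown> };

/**
 * Reads one field from the exposed test store, or null when the store or the
 * field is absent. The callback is self-contained because it is meant to run
 * inside the page, where outer variables are not available.
 */
async function getStoreField<T>(page: PageLike, key: string): Promise<T | null> {
  return page.evaluate((k: string) => {
    const w = globalThis as unknown as { __codesign_test_store__?: TestStore };
    const state = w.__codesign_test_store__?.getState();
    return (state?.[k] ?? null) as T | null;
  }, key);
}
```

With such a helper, Step 12 would reduce to `await getStoreField<string>(firstWindow, 'currentDesignId')` followed by the existing assertions.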
