Commit 01c1e0d

fix(decompose): address 4 Major review findings on PR OpenCoworkAI#241
1. Decompose loop success now triggers on first clean pass.

   The prompt previously required BOTH verifiers to return {verified, needs_review}, but the deterministic verifier only emits {ok, needs_iteration}. Forced an unnecessary extra iteration even when deterministic parity already passed.

   Fix: prompt now uses each verifier's actual vocabulary:
     success := deterministic.status === 'ok' && visual.status ∈ {verified, needs_review, unavailable}
   Updated both EN and ZH prompts in decomposePrompt.ts.

2. Visual verifier now actually has a source image at runtime.

   `verify_ui_kit_visual_parity({slug})` defaults to `source.png`, but `createRuntimeTextEditorFs` only seeded `index.html` + frames + skills from FRAME_TEMPLATES + DESIGN_SKILLS. Image attachments lived in `promptContext.attachments` but were never persisted to the agent's virtual FS. The visual judge silently degraded to `unavailable` on every normal run.

   Fix: `createRuntimeTextEditorFs` now accepts `sourceAttachments` and seeds `source.png` from the first image attachment's `imageDataUrl`. The runtime call site at runGenerate threads `input.attachments` through.

3. Judge/render failures now fall back to structured `unavailable`.

   `renderUiKit()` (Playwright) and `judgeVisualParity()` (vision LLM) were awaited without try/catch. Empty/non-JSON judge replies threw, text-only models threw, headless render crashes threw; all bubbled up and broke the agent loop instead of returning the documented `status: 'unavailable'` path.

   Fix: wrap both awaits in try/catch returning `unavailableReport()` with the underlying error message. Logged at info level for trace visibility.

4. Changeset no longer claims `Closes OpenCoworkAI#225`.

   PR template says use `Closes` only for fully resolved issues. This diff stops at emitting a `ui_kits/<slug>/` handoff bundle and explicitly tells the agent NOT to continue into the prototype flow. Phase 2 (cross-page flows, state machines, prototype orchestration) is separate work.

   Fix: changeset now says `Refs OpenCoworkAI#225 (Phase 1 of …)` and notes Phase 2 is tracked separately.

Verification:
- npx tsc --noEmit -p packages/core
- npx tsc --noEmit -p apps/desktop
- both clean (0 errors)
1 parent c1861fd commit 01c1e0d

4 files changed

Lines changed: 48 additions & 12 deletions

.changeset/decompose-to-ui-kit.md

Lines changed: 1 addition & 1 deletion
@@ -6,4 +6,4 @@

 Add **Decompose to UI Kit** — one-click in the chat sidebar emits a `ui_kits/<slug>/` folder shaped for coding-agent handoff (`index.html` + `components/*.tsx` + `tokens.css` + `manifest.json` + `README.md`). Built-in deterministic + vision verifiers self-check parity using a 12-question boolean rubric (`parityScore = passCount / totalChecks`, no LLM-fabricated floats) and re-iterate on gaps. Per-decompose cost surfaces inline as a toast.

-Closes Phase 1 of #225.
+Refs #225 (Phase 1 of the requested image → componentization → prototype workflow). Phase 2 (cross-page flows, state machines, prototype orchestration) is tracked separately.
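
The changeset states that `parityScore = passCount / totalChecks` is derived from boolean answers and never taken from a model-supplied float. A minimal sketch of that derivation follows. The `RubricCheck` shape and the zero score for an empty rubric are assumptions for illustration, not the repository's actual types or behavior.

```typescript
// Hypothetical shape for one rubric answer; the real type in the repo may differ.
interface RubricCheck {
  id: string;
  passed: boolean;
  reason: string;
}

// Derive the score from booleans only, so no model-reported float is trusted.
function parityScore(checks: readonly RubricCheck[]): number {
  const totalChecks = checks.length;
  if (totalChecks === 0) return 0; // assumption: an empty rubric scores 0
  const passCount = checks.filter((c) => c.passed).length;
  return passCount / totalChecks;
}

// Twelve illustrative checks with exactly one failure (index 3).
const sampleChecks: RubricCheck[] = Array.from({ length: 12 }, (_, i) => ({
  id: `check-${i + 1}`,
  passed: i !== 3,
  reason: i !== 3 ? 'matches source' : 'button radius differs',
}));

const sampleScore = parityScore(sampleChecks);
```

Because the score is a pure function of the `passed` flags, two runs that return the same yes/no answers always yield the same score, whatever the judge model says about its own confidence.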

apps/desktop/src/main/index.ts

Lines changed: 20 additions & 0 deletions
@@ -280,6 +280,15 @@ interface CreateRuntimeTextEditorFsOptions {
   previousHtml: string | null;
   sendEvent: (event: AgentStreamEvent) => void;
   logger: Pick<CoreLogger, 'error'>;
+  /**
+   * Image attachments from `preparePromptContext`. The first image (if any) is
+   * persisted into the agent's virtual FS as `source.png` so that
+   * `verify_ui_kit_visual_parity({slug})` can read it via its default
+   * `sourceImagePath`. Without this, the visual judge silently degrades to
+   * `status: 'unavailable'` even when the host has wired up the judge
+   * callback (review finding #2 on PR #241).
+   */
+  sourceAttachments?: ReadonlyArray<{ imageDataUrl?: string }>;
 }

 export function createRuntimeTextEditorFs({
@@ -289,6 +298,7 @@ export function createRuntimeTextEditorFs({
   previousHtml,
   sendEvent,
   logger,
+  sourceAttachments,
 }: CreateRuntimeTextEditorFsOptions) {
   const baseCtx = { designId: designId ?? '', generationId } as const;
   const fsMap = new Map<string, string>();
@@ -301,6 +311,13 @@ export function createRuntimeTextEditorFs({
   for (const [name, content] of DESIGN_SKILLS) {
     fsMap.set(`skills/${name}`, content);
   }
+  // Seed source.png from the first image attachment so the visual verifier
+  // can read it via its default `sourceImagePath: 'source.png'`. Stored as a
+  // data URL to match `verify_ui_kit_visual_parity`'s expected format.
+  const firstSourceImage = sourceAttachments?.find((a) => Boolean(a.imageDataUrl));
+  if (firstSourceImage?.imageDataUrl) {
+    fsMap.set('source.png', firstSourceImage.imageDataUrl);
+  }

   function emitFsUpdated(filePath: string, content: string): void {
     if (designId === null) return;
@@ -510,6 +527,9 @@ function registerIpcHandlers(db: Database | null): void {
       logger: logIpc,
       previousHtml,
       sendEvent,
+      // Pipe image attachments through so `source.png` is seeded for
+      // verify_ui_kit_visual_parity (PR #241 review fix #2).
+      sourceAttachments: input.attachments,
     });
     const cfg = getCachedConfig();
     const imageConfig = cfg ? resolveImageGenerationConfig(cfg) : null;
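
The seeding step in this file can be exercised in isolation. The sketch below pulls the same logic into a standalone helper so the "first attachment with an image wins" behavior is easy to check. `seedSourceImage` and `SourceAttachment` are hypothetical names introduced here; in the commit the logic lives inline in `createRuntimeTextEditorFs`.

```typescript
// Assumed attachment shape, mirroring the `sourceAttachments` option in the diff.
type SourceAttachment = { imageDataUrl?: string };

// Seed `source.png` from the first attachment that carries an image data URL.
// Returns true when a file was written. Hypothetical helper, not part of the commit.
function seedSourceImage(
  fsMap: Map<string, string>,
  sourceAttachments?: ReadonlyArray<SourceAttachment>,
): boolean {
  const first = sourceAttachments?.find((a) => Boolean(a.imageDataUrl));
  if (!first?.imageDataUrl) return false;
  fsMap.set('source.png', first.imageDataUrl);
  return true;
}

// Usage: a text-only attachment is skipped, and the first image is stored.
const demoFs = new Map<string, string>();
const demoSeeded = seedSourceImage(demoFs, [
  {},
  { imageDataUrl: 'data:image/png;base64,AAAA' },
  { imageDataUrl: 'data:image/jpeg;base64,BBBB' },
]);
```

Note that the key is always `source.png`, even if the attachment is a JPEG. The verifier hunk calls `parseMediaType(sourceFile.content)`, which suggests the media type is read from the data URL itself and not from the file name.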

apps/desktop/src/renderer/src/hooks/decomposePrompt.ts

Lines changed: 6 additions & 6 deletions
@@ -43,9 +43,9 @@ export const DECOMPOSE_PROMPT_ZH = `把刚才那个设计拆成一个 ui_kits/<s
 5. 调 verify_ui_kit_visual_parity({slug}) 拿视觉判定 (vision LLM judge, 12 个 boolean check)
    - 如果返回 status="unavailable", host 没接 judge callback, 跳过这一步用 step 4 的结果做决定
    - 如果返回了, 看 checks[].passed + reason, 失败的 check 就是要修的点
-6. 综合两份 report:
-   - 两个都 status ∈ {verified, needs_review} (12/12 或 11/12 个 check 过): 直接调 done
-   - 任一为 needs_iteration / failed: 把两边的 gaps 合并去重 + 失败 check 的 reason 一起作为反馈, 重新调一次 decompose_to_ui_kit
+6. 综合两份 report (注意: 两个 verifier 的 status 词汇不同):
+   - 成功条件: deterministic.status === 'ok' 且 visual.status ∈ {verified, needs_review, unavailable} → 直接调 done
+   - 任一失败: deterministic.status === 'needs_iteration' 或 visual.status ∈ {needs_iteration, failed} → 把两边的 gaps 合并去重 + 失败 check 的 reason 一起作为反馈, 重新调一次 decompose_to_ui_kit
 7. 最多迭代两轮. 第二轮验证完不管 score 多少都调 done.
 8. done 的 summary 必须诚实写出:
    - 结构化 verifier 的 passCount/totalChecks + status
@@ -68,9 +68,9 @@ export const DECOMPOSE_PROMPT_EN = `Decompose the design you just produced into
 5. Call verify_ui_kit_visual_parity({slug}) — vision-LLM judge with the 12 standard boolean checks (layout / color / typography / content / components dimensions). Each check is yes/no with a reason. parityScore = passCount/12 (derived deterministically).
    - If it returns status="unavailable", the host hasn't injected the judge callback. Proceed with step 4's deterministic report alone.
    - If it returns successfully, read each checks[].passed + reason. Failed checks are the things to fix.
-6. Reconcile both reports:
-   - Both status ∈ {verified, needs_review} (12/12 or 11/12 checks passed): call done
-   - Either status === 'needs_iteration' or 'failed': merge + dedup gaps from both reports + the failed checks' reasons, re-call decompose_to_ui_kit addressing them
+6. Reconcile both reports (NOTE: the two verifiers use DIFFERENT status vocabularies):
+   - Success: deterministic.status === 'ok' AND visual.status ∈ {verified, needs_review, unavailable} → call done
+   - Iterate: deterministic.status === 'needs_iteration' OR visual.status ∈ {needs_iteration, failed} → merge + dedup gaps from both reports + the failed checks' reasons, re-call decompose_to_ui_kit addressing them
 7. Iterate at most TWICE. After the second verify, call done regardless of score.
 8. The done summary MUST honestly report:
    - deterministic verifier passCount/totalChecks + status
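
The reconcile rule in step 6, together with the iteration cap in step 7, can be written as a small pure function. This is only a sketch of the decision the prompt asks the agent to make: the status strings come from the prompt text, while `nextAction`, `MAX_VERIFY_ROUNDS`, and the 1-based `verifyRound` counter are names assumed here for illustration.

```typescript
// Status vocabularies as named in the prompt; the two verifiers do not share one.
type DeterministicStatus = 'ok' | 'needs_iteration';
type VisualStatus = 'verified' | 'needs_review' | 'unavailable' | 'needs_iteration' | 'failed';

// Assumed reading of step 7: after the second verify, finish regardless of score.
const MAX_VERIFY_ROUNDS = 2;

function nextAction(
  deterministic: DeterministicStatus,
  visual: VisualStatus,
  verifyRound: number, // 1 for the first verify pass, 2 for the second
): 'done' | 'iterate' {
  if (verifyRound >= MAX_VERIFY_ROUNDS) return 'done';
  const visualAcceptable =
    visual === 'verified' || visual === 'needs_review' || visual === 'unavailable';
  return deterministic === 'ok' && visualAcceptable ? 'done' : 'iterate';
}
```

Typing the two vocabularies separately makes the original bug hard to reintroduce: comparing `deterministic` against `'verified'` would be flagged by the compiler because `'verified'` is not a member of `DeterministicStatus`.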

packages/core/src/tools/verify-ui-kit-visual-parity.ts

Lines changed: 21 additions & 5 deletions
@@ -308,11 +308,27 @@ export function makeVerifyUiKitVisualParityTool(
         mediaType: parseMediaType(sourceFile.content),
       };

-      logger.info('[verify_ui_kit_visual_parity] step=render', { slug: params.slug });
-      const candidateImg = await renderUiKit(decomposed.content, signal);
-
-      logger.info('[verify_ui_kit_visual_parity] step=judge', { slug: params.slug });
-      const judgeResult = await judgeVisualParity(sourceImg, candidateImg, signal);
+      // Render + judge are external best-effort calls (Playwright headless +
+      // vision-LLM). If either throws (text-only model, malformed JSON,
+      // headless render crash, abort), we degrade to `unavailable` instead
+      // of bubbling the error and breaking the agent loop. This matches the
+      // tool's documented contract — review fix #3 on PR #241.
+      let candidateImg: VisualParityImageRef;
+      let judgeResult: Awaited<ReturnType<typeof judgeVisualParity>>;
+      try {
+        logger.info('[verify_ui_kit_visual_parity] step=render', { slug: params.slug });
+        candidateImg = await renderUiKit(decomposed.content, signal);
+        logger.info('[verify_ui_kit_visual_parity] step=judge', { slug: params.slug });
+        judgeResult = await judgeVisualParity(sourceImg, candidateImg, signal);
+      } catch (error) {
+        const message = error instanceof Error ? error.message : String(error);
+        logger.info('[verify_ui_kit_visual_parity] step=unavailable', {
+          slug: params.slug,
+          reason: message,
+        });
+        const report = unavailableReport(`render or judge failed: ${message}`);
+        return { content: [{ type: 'text', text: report.summary }], details: report };
+      }

       const checks = normalizeChecks(judgeResult.checks ?? []);
       const passCount = checks.filter((c) => c.passed).length;
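
The degrade-to-`unavailable` pattern in this hunk generalizes to any best-effort external call. The sketch below shows the same idea as a reusable wrapper. The `unavailableReport` body and the `UnavailableReport` shape are stand-ins written for this example (the real helper exists in the repository but its fields are not shown in the diff), and `withUnavailableFallback` is a hypothetical name.

```typescript
// Stand-in report shape; only `summary` is confirmed by the diff above.
interface UnavailableReport {
  status: 'unavailable';
  summary: string;
}

// Stand-in for the repository's helper of the same name.
function unavailableReport(reason: string): UnavailableReport {
  return { status: 'unavailable', summary: `visual parity unavailable: ${reason}` };
}

// Run a best-effort async step; convert any thrown value into a structured report.
async function withUnavailableFallback<T>(
  run: () => Promise<T>,
): Promise<T | UnavailableReport> {
  try {
    return await run();
  } catch (error) {
    // Thrown values are not guaranteed to be Error instances, so normalize first.
    const message = error instanceof Error ? error.message : String(error);
    return unavailableReport(`render or judge failed: ${message}`);
  }
}
```

The `return await` inside `try` matters: returning the promise without awaiting it would let a rejection escape the `catch` block and reach the caller.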
