Skip to content

Commit e10ef3b

Browse files
committed
fix(decompose): address bot follow-up #2 (abort leak, dedup checks, maxTokens)
Bot review on commit fe42888 flagged 3 items, all real, all fixed: 1. [Minor] AbortSignal listener leak in render-ui-kit.ts. `signal.removeEventListener('abort', onAbort)` ran only on the did-finish-load success path. did-fail-load, the loadURL catch, and the hard timeout all left the listener registered until the abort actually fired (or never). Fix: centralized cleanup inside `finish()` itself — clears the timeout AND removes the abort listener on every exit path (success, fail, catch, timeout, abort). One source of truth, no path can forget. 2. [Minor] Duplicate STANDARD_CHECKS in judge-visual-parity.ts vs verify-ui-kit-visual-parity.ts. Both held identical 12-check sets; if one was edited the other would silently drift. Fix: imported STANDARD_VISUAL_PARITY_CHECKS from @open-codesign/core (already exported at packages/core/src/index.ts:59). The host prompt only needs {id, question}, so we project the core's {id, dimension, question} down via map. Single source of truth. 3. [Nit] Hardcoded `maxTokens: 8000` in the vision judge call. Worst-case-safety buffer that wastes budget on cheaper models (Gemini Flash, GPT-4o-mini) and doesn't help correctness — actual judge output (12 short {passed, reason} entries + 1-2 sentence summary) lands around 1.5k tokens. Fix: dropped to 4096 with an inline comment explaining the envelope. Fits under every routed model's output cap (Gemini Flash ~8k, Sonnet ~8k, GPT-4o ~16k) with comfortable headroom. Verification: - apps/desktop: pnpm typecheck (node + web tsconfig) clean - packages/core: pnpm typecheck clean - pnpm lint (biome) clean Bot also confirmed all previous review items resolved on this commit: ✅ fork image URLs → main branch ✅ changesets consolidated to one ✅ case-insensitive base64 check ✅ scope creep (.github/* changes) no longer in diff
1 parent fe42888 commit e10ef3b

2 files changed

Lines changed: 29 additions & 73 deletions

File tree

apps/desktop/src/main/judge-visual-parity.ts

Lines changed: 19 additions & 61 deletions
Original file line numberDiff line numberDiff line change
@@ -10,66 +10,18 @@
1010
* plumbing.
1111
*/
1212

13-
import type { JudgeVisualParityFn, VisualParityImageRef } from '@open-codesign/core';
14-
15-
const STANDARD_CHECKS: Array<{ id: string; question: string }> = [
16-
{
17-
id: 'layout.column_count_match',
18-
question: 'Does the candidate have the same number of major columns / regions as the source?',
19-
},
20-
{
21-
id: 'layout.region_positions_match',
22-
question:
23-
'Are major regions (header / sidebar / main / right rail / footer) in the same positions as the source?',
24-
},
25-
{
26-
id: 'layout.hierarchy_preserved',
27-
question: 'Is the visual hierarchy (heading > subhead > body > footer) preserved?',
28-
},
29-
{
30-
id: 'color.accent_color_match',
31-
question:
32-
'Is the primary accent color visually equivalent to the source (same hue family, similar saturation)?',
33-
},
34-
{
35-
id: 'color.palette_consistency_match',
36-
question:
37-
'Does the overall palette feel match the source (warm/cool, saturated/muted, contrast level)?',
38-
},
39-
{
40-
id: 'typography.font_family_match',
41-
question:
42-
'Does the font family character (serif / sans / mono) match the source for each text role?',
43-
},
44-
{
45-
id: 'typography.heading_hierarchy_match',
46-
question: 'Are heading weights and sizes stepped similarly (H1 vs body vs caption)?',
47-
},
48-
{
49-
id: 'content.text_labels_present',
50-
question:
51-
'Are all visible text labels from the source present in the candidate (nav items, headings, button text)?',
52-
},
53-
{
54-
id: 'content.all_sections_present',
55-
question:
56-
'Are all distinct sections from the source present in the candidate (not just one missing region)?',
57-
},
58-
{
59-
id: 'components.repeated_pattern_count_match',
60-
question:
61-
'Does the candidate have approximately the same count of repeated patterns (cards / list items / nav links) as the source?',
62-
},
63-
{
64-
id: 'components.component_structure_match',
65-
question:
66-
'Do repeated components have the same internal anatomy (header + body + footer pieces)?',
67-
},
68-
{
69-
id: 'components.icon_motif_match',
70-
question: 'Are icons / glyphs in the same style (line vs filled, monochrome vs colored)?',
71-
},
72-
];
13+
import {
14+
type JudgeVisualParityFn,
15+
STANDARD_VISUAL_PARITY_CHECKS,
16+
type VisualParityImageRef,
17+
} from '@open-codesign/core';
18+
19+
// Use the canonical check list from core. Previously this file kept its own
20+
// duplicate copy; that risked drift if one was updated without the other
21+
// (review finding on PR #241). The host prompt only needs id + question, so
22+
// we project the core's `{id, dimension, question}` down to `{id, question}`.
23+
const STANDARD_CHECKS: ReadonlyArray<{ id: string; question: string }> =
24+
STANDARD_VISUAL_PARITY_CHECKS.map((c) => ({ id: c.id, question: c.question }));
7325

7426
export const SYSTEM_PROMPT = `You are a meticulous visual QA judge comparing two UI screenshots.
7527
@@ -191,11 +143,17 @@ export function makeJudgeVisualParity(runVisionPrompt: RunVisionPromptFn): Judge
191143
{ data: dataUrlToBase64(candidate.dataUrl), mimeType: candidate.mediaType },
192144
];
193145

146+
// 4096 fits comfortably under the output cap of every vision model we
147+
// currently route to (Gemini Flash ~8k, Sonnet ~8k, GPT-4o ~16k) and
148+
// matches the judge's actual envelope: 12 short {passed, reason} entries
149+
// + a 1-2 sentence summary lands at ~1.5k tokens in practice. 8000 was a
150+
// worst-case safety buffer that caused waste on cheap models without
151+
// helping correctness — review nit on PR #241.
194152
const result = await runVisionPrompt({
195153
systemPrompt: SYSTEM_PROMPT,
196154
userText: USER_PROMPT,
197155
userImages,
198-
maxTokens: 8000,
156+
maxTokens: 4096,
199157
...(signal ? { signal } : {}),
200158
});
201159

apps/desktop/src/main/render-ui-kit.ts

Lines changed: 10 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -47,40 +47,38 @@ export function makeUiKitRenderer(): RenderUiKitFn {
4747
capturePage: () => Promise<{ toPNG: () => Buffer }>;
4848
};
4949

50-
// Race: load + settle window vs hard timeout vs abort signal
50+
// Race: load + settle window vs hard timeout vs abort signal.
51+
// Centralize cleanup in `finish()` so EVERY exit path
52+
// (success, fail, catch, timeout, abort) drops the timeout +
53+
// unregisters the abort listener. Earlier this lived only in the
54+
// success branch, which leaked the listener on fail/catch/timeout
55+
// paths (review finding on PR #241).
5156
await new Promise<void>((resolve, reject) => {
5257
let settled = false;
5358
const finish = (err?: Error) => {
5459
if (settled) return;
5560
settled = true;
61+
clearTimeout(hardTimeout);
62+
signal?.removeEventListener('abort', onAbort);
5663
if (err) reject(err);
5764
else resolve();
5865
};
66+
const onAbort = () => finish(new Error('renderUiKit aborted by signal'));
5967
const hardTimeout = setTimeout(
6068
() => finish(new Error(`renderUiKit hard timeout after ${HARD_TIMEOUT_MS}ms`)),
6169
HARD_TIMEOUT_MS,
6270
);
63-
const onAbort = () => {
64-
clearTimeout(hardTimeout);
65-
finish(new Error('renderUiKit aborted by signal'));
66-
};
6771
signal?.addEventListener('abort', onAbort, { once: true });
6872

6973
wc.once('did-finish-load', () => {
7074
// Give fonts + CSS animations a moment to settle for visual parity
71-
setTimeout(() => {
72-
clearTimeout(hardTimeout);
73-
signal?.removeEventListener('abort', onAbort);
74-
finish();
75-
}, SETTLE_AFTER_LOAD_MS);
75+
setTimeout(() => finish(), SETTLE_AFTER_LOAD_MS);
7676
});
7777
wc.once('did-fail-load', () => {
78-
clearTimeout(hardTimeout);
7978
finish(new Error('renderUiKit did-fail-load'));
8079
});
8180

8281
void win.loadURL(dataUrl).catch((err: unknown) => {
83-
clearTimeout(hardTimeout);
8482
finish(err instanceof Error ? err : new Error(String(err)));
8583
});
8684
});

0 commit comments

Comments
 (0)