Commit 45f9025

feat(capture): pipeline improvements — contact sheets, design styles, snapshot
Capture pipeline work that came out of the 11-round website-to-video eval branch. The wins that actually moved quality were the artifacts agents read (contact sheets, design-styles) and the snapshot tool visual-verification fixes; the rest are smaller follow-ons.

**Contact sheets (`contactSheet.ts`, new)**

- Replaces the embedded one-image-per-asset listing with paginated labeled grids (3-col screenshots / 4-col raster / 5-col SVG). Each page contains 9–15 cells with filename labels baked in via SVG text overlay (`escapeXml` covers `&<>"'`).
- `fit: "contain"` keeps every asset visible at its real aspect ratio; the old `fit: "cover"` cropped to the first image's box.
- Returns `string[]` (page paths): single-page captures get one file, multi-page captures produce `contact-sheet-1.jpg`, `contact-sheet-2.jpg`, etc.
- `createSvgContactSheet` scans both `assets/svgs/` (inline-extracted SVGs) and `assets/` root (external SVGs from `<img src="*.svg">`) and de-dupes by filename. Sites with all-external SVGs (huly.io) now get coverage they previously didn't.

**Design styles extractor (`designStyleExtractor.ts`, new)**

- Walks the live DOM and reads computed styles to produce `extracted/design-styles.json`: typography hierarchy (every text role with exact font-size / weight / line-height / letter-spacing), button variants (background / padding / radius / shadow), card / container / nav styles, spacing scale with base unit, border-radius scale, box-shadow values with usage counts.
- Primary data source for DESIGN.md authoring at Step 1. Replaces the prior "guess from screenshots" workflow.

**Snapshot tool (`snapshot.ts`)**

- HyperShader pre-rendering used to swallow the entire snapshot capture window (every frame after the first showed the loading overlay or final-opacity-zero exit fades). Wait signal is now `window.__hf.shaderTransitions[].ready` (set after both warm and cold cache paths complete); local-time seek for sub-comps means exit fades read at their own t=0..duration, not global time.
- Gemini vision per-frame analysis runs by default (`descriptions.md` next to the contact sheet). `--describe "custom Q"` overrides the prompt; `--describe false` opts out.
- 3-column contact sheet generation for snapshot frames so reviewers see all beats at a glance.

**Screenshot capture (`screenshotCapture.ts`)**

- Replaces `querySelectorAll('*') + getComputedStyle` overlay scan with a TreeWalker that early-exits on cheap rect checks before reaching the expensive style read. Caps at 5000 elements per page.
- Cookie/consent dismissal selectors are scoped under cookie / consent / gdpr ancestors so we don't click "Accept invitation" or similar unrelated buttons.

**Agent prompt (`agentPromptGenerator.ts`)**

- Auto-discovers contact-sheet page count (matches base name plus paginated `-NNN` variants only, with regex escaping on the base name and numeric sort for 10+ pages).
- `inferColorRole`: classifies extracted hex colors as bg-dark / bg-light / accent / surface / neutral via luminance + saturation, so the agent prompt shows `#533AFD (accent)` instead of bare hex.
- `design-styles.json` row is gated on `existsSync`: the upstream write is wrapped in try/catch and may skip on failure, so the prompt only points to files actually on disk.

**Other CLI ergonomics**

- `cli.ts`: auto-load `.env` from CWD on startup so subcommands like `snapshot` don't need explicit `export GEMINI_API_KEY=…`. Handles `export FOO=bar`, quoted values, inline `# comments`.
- `commands/transcribe.ts`: default output dir is the input file's directory, not CWD. Stops the "wrote transcript.json somewhere unexpected" footgun.
- `assetDownloader.ts`: improved asset naming uses catalog context; de-duplicates inline SVG filenames.
- `contentExtractor.ts`: captions SVGs via Gemini (code-as-text) and integrates them into asset descriptions.
- `tokenExtractor.ts` + `types.ts`: SVG bounding box dimensions and new DesignStyles schema added.
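The `.env` handling in `cli.ts` is not among the files shown on this page. As an illustration only, a parser covering the three cases named above (`export FOO=bar`, quoted values, inline `# comments`) could be sketched as follows. `parseDotEnv` is a hypothetical name and the handling of unterminated quotes is an assumption; the committed implementation may differ.

```typescript
// Hypothetical helper, NOT the code from cli.ts: a minimal .env parser sketch.
function parseDotEnv(text: string): Record<string, string> {
  const out: Record<string, string> = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    // Skip blank lines and full-line comments.
    if (line === "" || line.startsWith("#")) continue;
    // Optional `export ` prefix, then KEY=VALUE.
    const m = line.match(/^(?:export\s+)?([A-Za-z_][A-Za-z0-9_]*)\s*=\s*(.*)$/);
    if (!m) continue;
    const key = m[1]!;
    let value = m[2]!;
    const quote = value[0];
    if (quote === '"' || quote === "'") {
      // Quoted value: keep everything up to the matching closing quote,
      // so a `#` inside the quotes is preserved. Assumption: an unterminated
      // quote takes the rest of the line.
      const end = value.indexOf(quote, 1);
      value = end === -1 ? value.slice(1) : value.slice(1, end);
    } else {
      // Unquoted value: an inline comment starts at whitespace followed by `#`.
      value = value.replace(/\s+#.*$/, "").trim();
    }
    out[key] = value;
  }
  return out;
}

const sample = [
  'export GEMINI_API_KEY="abc 123" # key',
  "FOO=bar # note",
  "# full-line comment",
  "BAZ='x#y'",
].join("\n");
console.log(parseDotEnv(sample));
```

A loader built on this would usually skip keys that are already set in `process.env`, so an explicit shell `export` still wins; the commit message does not say whether `cli.ts` does this.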
1 parent 5e7a7a8 commit 45f9025

12 files changed

Lines changed: 1387 additions & 84 deletions

packages/cli/src/capture/agentPromptGenerator.ts

Lines changed: 112 additions & 16 deletions
@@ -10,12 +10,35 @@
  * website-to-hyperframes skill — this file points agents there.
  */
 
-import { writeFileSync } from "node:fs";
+import { writeFileSync, readdirSync, existsSync } from "node:fs";
 import { join } from "node:path";
 import type { DesignTokens } from "./types.js";
 import type { AnimationCatalog } from "./animationCataloger.js";
 import type { CatalogedAsset } from "./assetCataloger.js";
 
+/**
+ * Infer a human-readable role hint from a hex color based on luminance and saturation.
+ * Not a substitute for DESIGN.md — just helps orient agents scanning the brand summary.
+ */
+function inferColorRole(hex: string): string {
+  const r = parseInt(hex.slice(1, 3), 16) / 255;
+  const g = parseInt(hex.slice(3, 5), 16) / 255;
+  const b = parseInt(hex.slice(5, 7), 16) / 255;
+  if (isNaN(r) || isNaN(g) || isNaN(b)) return "color";
+
+  const max = Math.max(r, g, b);
+  const min = Math.min(r, g, b);
+  const luminance = 0.2126 * r + 0.7152 * g + 0.0722 * b;
+  const saturation = max === 0 ? 0 : (max - min) / max;
+
+  if (luminance < 0.04) return "bg-dark";
+  if (luminance > 0.9) return "bg-light";
+  if (saturation > 0.4 && luminance > 0.05 && luminance < 0.7) return "accent";
+  if (luminance < 0.2) return "surface-dark";
+  if (luminance > 0.7) return "surface-light";
+  return "neutral";
+}
+
 export function generateAgentPrompt(
   outputDir: string,
   url: string,
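To make the thresholds in `inferColorRole` easier to review, the sketch below repeats the function from the hunk above so that it runs on its own, then classifies a handful of colors. Only `#533AFD` comes from the commit message; the other hex values are illustrative assumptions, not colors from any captured site.

```typescript
// Copy of inferColorRole from the diff, reproduced so the example is self-contained.
function inferColorRole(hex: string): string {
  const r = parseInt(hex.slice(1, 3), 16) / 255;
  const g = parseInt(hex.slice(3, 5), 16) / 255;
  const b = parseInt(hex.slice(5, 7), 16) / 255;
  if (isNaN(r) || isNaN(g) || isNaN(b)) return "color";

  const max = Math.max(r, g, b);
  const min = Math.min(r, g, b);
  const luminance = 0.2126 * r + 0.7152 * g + 0.0722 * b;
  const saturation = max === 0 ? 0 : (max - min) / max;

  if (luminance < 0.04) return "bg-dark";
  if (luminance > 0.9) return "bg-light";
  if (saturation > 0.4 && luminance > 0.05 && luminance < 0.7) return "accent";
  if (luminance < 0.2) return "surface-dark";
  if (luminance > 0.7) return "surface-light";
  return "neutral";
}

// Illustrative inputs (assumed), with the role each one falls into.
const samples: string[] = [
  "#533AFD", // accent: luminance ~0.30, saturation ~0.77
  "#000000", // bg-dark
  "#FFFFFF", // bg-light
  "#1A1A1A", // surface-dark: grey with luminance ~0.10
  "#CCCCCC", // surface-light: grey with luminance 0.80
  "#808080", // neutral
  "#000", //    color: 3-digit shorthand leaves the blue channel unparsable
];
for (const hex of samples) {
  console.log(`${hex} (${inferColorRole(hex)})`);
}
```

Two details are visible from reading the function: the weighted sum is taken over gamma-encoded channel values without linearizing them first, so it approximates relative luminance rather than computing it exactly, and the function returns `surface-dark` / `surface-light` where the commit message lists a single `surface` role.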
@@ -25,25 +48,28 @@ export function generateAgentPrompt(
   hasLottie?: boolean,
   hasShaders?: boolean,
   _catalogedAssets?: CatalogedAsset[], // reserved for future asset inventory
-  detectedLibraries?: string[],
+  _detectedLibraries?: string[],
 ): void {
-  const prompt = buildPrompt(url, tokens, hasScreenshot, hasLottie, hasShaders, detectedLibraries);
+  const prompt = buildPrompt(outputDir, url, tokens, hasScreenshot, hasLottie, hasShaders);
   writeFileSync(join(outputDir, "AGENTS.md"), prompt, "utf-8");
   writeFileSync(join(outputDir, "CLAUDE.md"), prompt, "utf-8");
   writeFileSync(join(outputDir, ".cursorrules"), prompt, "utf-8");
 }
 
 function buildPrompt(
+  outputDir: string,
   url: string,
   tokens: DesignTokens,
   hasScreenshot: boolean,
   hasLottie?: boolean,
   hasShaders?: boolean,
-  detectedLibraries?: string[],
 ): string {
   const title = tokens.title || new URL(url).hostname.replace(/^www\./, "");
 
-  const colorSummary = tokens.colors.slice(0, 10).join(", ");
+  const colorSummary = tokens.colors
+    .slice(0, 10)
+    .map((hex) => `${hex} (${inferColorRole(hex)})`)
+    .join(", ");
   const fontSummary =
     tokens.fonts
       .map(
@@ -58,17 +84,66 @@ function buildPrompt(
       .join(", ") || "none detected";
 
   // Build the data inventory table rows
+  // Helper: find all contact sheet pages for a given base name. Matches the
+  // exact base file plus paginated variants only (e.g. `contact-sheet.jpg`,
+  // `contact-sheet-2.jpg`, `contact-sheet-3.jpg`). The "-NNN" suffix is digits
+  // only, so unrelated files that happen to share the prefix (notably the
+  // `contact-sheet-svgs.jpg` SVG fallback sheet in assets/) don't get mixed in.
+  function contactSheetRows(dir: string, baseFile: string, label: string): string[] {
+    const fullDir = join(outputDir, dir);
+    if (!existsSync(fullDir)) return [];
+    const baseName = baseFile.replace(/\.jpg$/, "");
+    // Escape regex metacharacters in baseName so future callers can pass
+    // filenames containing `.`, `+`, `(`, etc. without the regex breaking.
+    const escapedBase = baseName.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
+    const paginatedRe = new RegExp(`^${escapedBase}(?:-(\\d+))?\\.jpg$`);
+    // Sort by the numeric page suffix so `contact-sheet-10.jpg` lands after
+    // `contact-sheet-2.jpg`, not before (default string sort orders them
+    // lexicographically and breaks at 10+ pages). Unpaginated `contact-sheet.jpg`
+    // gets page 0 so it sorts first if it co-exists with paginated files.
+    const all = readdirSync(fullDir)
+      .filter((f) => paginatedRe.test(f))
+      .map((f) => ({ name: f, page: parseInt(f.match(paginatedRe)?.[1] ?? "0", 10) }))
+      .sort((a, b) => a.page - b.page)
+      .map((entry) => entry.name);
+    if (all.length === 0) return [];
+    if (all.length === 1) {
+      return [`| \`${dir}/${all[0]}\` | ${label} |`];
+    }
+    return all.map((f, i) => `| \`${dir}/${f}\` | ${label} — page ${i + 1} of ${all.length} |`);
+  }
+
   const tableRows: string[] = [];
   if (hasScreenshot) {
+    const screenshotRows = contactSheetRows(
+      "screenshots",
+      "contact-sheet.jpg",
+      "**View this first.** All scroll screenshots in labeled grid — see the entire page at a glance",
+    );
+    if (screenshotRows.length > 0) {
+      tableRows.push(...screenshotRows);
+    } else {
+      tableRows.push(
+        "| `screenshots/contact-sheet.jpg` | **View this first.** All scroll screenshots in one labeled grid. |",
+      );
+    }
     tableRows.push(
-      "| `screenshots/scroll-*.png` | Viewport screenshots of the full page. Start with `scroll-000.png` (hero). |",
+      "| `screenshots/scroll-*.png` | Individual viewport screenshots if you need detail on a specific section. |",
     );
   }
   tableRows.push(
-    "| `extracted/asset-descriptions.md` | One-line description of every downloaded asset. **Read this first.** |",
+    `| \`extracted/tokens.json\` | Design tokens: ${tokens.colors.length} colors, ${tokens.fonts.length} fonts, ${tokens.headings?.length ?? 0} headings, ${tokens.ctas?.length ?? 0} CTAs |`,
   );
+  // design-styles.json is written from a try/catch in capture/index.ts and
+  // gets skipped when the live-DOM style extraction fails. Only list it in the
+  // agent prompt when it actually exists, so the agent isn't pointed at a 404.
+  if (existsSync(join(outputDir, "extracted", "design-styles.json"))) {
+    tableRows.push(
+      "| `extracted/design-styles.json` | Computed styles from live DOM: typography hierarchy, button/card/nav styles, spacing scale, border-radius, box shadows. Primary data source for DESIGN.md. |",
+    );
+  }
   tableRows.push(
-    `| \`extracted/tokens.json\` | Design tokens: ${tokens.colors.length} colors, ${tokens.fonts.length} fonts, ${tokens.headings?.length ?? 0} headings, ${tokens.ctas?.length ?? 0} CTAs |`,
+    "| `extracted/asset-descriptions.md` | One-line description of every downloaded asset. Read this for asset selection — only open individual files for safe-zone checking. |",
   );
   tableRows.push(
     "| `extracted/visible-text.txt` | Page text in DOM order, prefixed with HTML tag (`[h1]`, `[p]`, `[a]`). Use as context — rephrase freely. |",
@@ -81,20 +156,41 @@ function buildPrompt(
   if (hasShaders) {
     tableRows.push("| `extracted/shaders.json` | WebGL shader source (GLSL). |");
   }
-  if (detectedLibraries && detectedLibraries.length > 0) {
-    tableRows.push(
-      `| \`extracted/detected-libraries.json\` | Libraries: ${detectedLibraries.join(", ")} |`,
-    );
+
+  // Asset contact sheets — dynamically list all pages
+  const assetSheetRows = contactSheetRows(
+    "assets",
+    "contact-sheet.jpg",
+    "Downloaded images in labeled grid — view before opening individual files",
+  );
+  if (assetSheetRows.length > 0) {
+    tableRows.push(...assetSheetRows);
+  } else {
+    tableRows.push("| `assets/contact-sheet.jpg` | All downloaded images in one labeled grid. |");
   }
-  tableRows.push("| `assets/` | Downloaded images, SVGs, and font files. |");
+
+  // SVG contact sheets — check both assets/svgs/ and assets/ root fallback
+  const svgSubdirRows = contactSheetRows(
+    "assets/svgs",
+    "contact-sheet.jpg",
+    "SVGs rendered as thumbnails in labeled grid",
+  );
+  const svgRootRows = contactSheetRows(
+    "assets",
+    "contact-sheet-svgs.jpg",
+    "SVGs rendered as thumbnails in labeled grid",
+  );
+  const svgRows = svgSubdirRows.length > 0 ? svgSubdirRows : svgRootRows;
+  if (svgRows.length > 0) {
+    tableRows.push(...svgRows);
+  }
+
+  tableRows.push("| `assets/` | Individual downloaded images, SVGs, and font files. |");
 
   // Brand summary — just the essentials
   const brandLines: string[] = [];
   brandLines.push(`- **Colors**: ${colorSummary || "see tokens.json"}`);
   brandLines.push(`- **Fonts**: ${fontSummary}`);
-  if (detectedLibraries && detectedLibraries.length > 0) {
-    brandLines.push(`- **Built with**: ${detectedLibraries.join(", ")}`);
-  }
 
   return `# ${title}
 
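The page matching and ordering inside `contactSheetRows` in this file's diff can be exercised without touching the filesystem. The sketch below lifts the regex and sort into a standalone function that takes a plain list of filenames; `sortContactSheetPages` is a hypothetical name introduced for illustration, and the filenames are made-up inputs.

```typescript
// Hypothetical standalone version of the filter + numeric sort used by
// contactSheetRows, taking filenames as an array instead of reading a directory.
function sortContactSheetPages(files: string[], baseFile: string): string[] {
  const baseName = baseFile.replace(/\.jpg$/, "");
  // Escape regex metacharacters so the base name is matched literally.
  const escapedBase = baseName.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // Base file, optionally followed by "-<digits>", then ".jpg".
  const paginatedRe = new RegExp(`^${escapedBase}(?:-(\\d+))?\\.jpg$`);
  return files
    .filter((f) => paginatedRe.test(f))
    .map((f) => ({ name: f, page: parseInt(f.match(paginatedRe)?.[1] ?? "0", 10) }))
    .sort((a, b) => a.page - b.page)
    .map((entry) => entry.name);
}

const listing = [
  "contact-sheet-10.jpg",
  "contact-sheet-2.jpg",
  "contact-sheet-svgs.jpg", // shares the prefix but has a non-numeric suffix
  "contact-sheet.jpg",
  "contact-sheet-3.jpg",
  "scroll-000.png",
];
console.log(sortContactSheetPages(listing, "contact-sheet.jpg"));
// → ["contact-sheet.jpg", "contact-sheet-2.jpg", "contact-sheet-3.jpg", "contact-sheet-10.jpg"]
```

Passing `"contact-sheet-svgs.jpg"` as the base file selects only the SVG fallback sheet and its paginated variants, which is how the two lookups in the `assets` directory stay separate.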

packages/cli/src/capture/assetDownloader.ts

Lines changed: 102 additions & 24 deletions
@@ -24,11 +24,21 @@ export async function downloadAssets(
 
   // 1. ALL inline SVGs — save as files (logos get priority naming)
   mkdirSync(join(outputDir, "assets", "svgs"), { recursive: true });
+  const usedSvgNames = new Set<string>();
   for (let i = 0; i < tokens.svgs.length && i < 30; i++) {
     const svg = tokens.svgs[i]!;
     if (!svg.outerHTML || svg.outerHTML.length < 50) continue;
     const label = svg.label?.replace(/[^a-zA-Z0-9-_ ]/g, "").trim();
-    const name = label ? slugify(label) + ".svg" : svg.isLogo ? `logo-${i}.svg` : `icon-${i}.svg`;
+    let slug = label ? slugify(label) : svg.isLogo ? `logo-${i}` : `icon-${i}`;
+    // Deduplicate — two SVGs with same aria-label get suffixed
+    let finalSlug = slug;
+    let suffix = 2;
+    while (usedSvgNames.has(finalSlug)) {
+      finalSlug = `${slug}-${suffix}`;
+      suffix++;
+    }
+    usedSvgNames.add(finalSlug);
+    const name = `${finalSlug}.svg`;
     const localPath = `assets/svgs/${name}`;
     try {
       writeFileSync(join(outputDir, localPath), svg.outerHTML, "utf-8");
@@ -86,55 +96,49 @@ export async function downloadAssets(
     }
   }
 
-  // Download all images (no arbitrary cap) — Claude Code needs to see every asset to use them creatively.
-  // The 10KB minimum size filter handles tracking pixels and tiny icons.
+  // Download all images — use catalog context for human-readable filenames.
   // Pre-filter to deduplicate before downloading.
-  const toDownload: { url: string; isPoster: boolean; normalized: string }[] = [];
+  const toDownload: {
+    url: string;
+    isPoster: boolean;
+    normalized: string;
+    catalog?: CatalogedAsset;
+  }[] = [];
   for (const { url, isPoster } of imageUrls) {
     const normalized = normalizeUrl(url);
     if (downloadedUrls.has(normalized)) continue;
-    downloadedUrls.add(normalized); // Reserve to prevent duplicates in parallel batches
-    toDownload.push({ url, isPoster, normalized });
+    downloadedUrls.add(normalized);
+    const catalog = catalogedAssets?.find((a) => normalizeUrl(a.url) === normalized);
+    toDownload.push({ url, isPoster, normalized, catalog });
   }
 
   // Download in parallel batches of 5
   const BATCH_SIZE = 5;
   let imgIdx = 0;
+  const usedNames = new Set<string>();
   for (let i = 0; i < toDownload.length; i += BATCH_SIZE) {
     const batch = toDownload.slice(i, i + BATCH_SIZE);
     const results = await Promise.allSettled(
-      batch.map(async ({ url, isPoster }) => {
+      batch.map(async ({ url, isPoster, catalog }) => {
         const parsedUrl = new URL(url);
         const pathExt = extname(parsedUrl.pathname);
         const ext = pathExt && pathExt.length <= 5 ? pathExt : ".jpg";
         const buffer = await fetchBuffer(url);
         if (!buffer) return null;
-        // SVGs are inherently small — don't apply the 10KB minimum to them
         const isSvg = ext === ".svg" || url.includes(".svg");
         const minSize = isSvg ? 200 : 10000;
         if (buffer.length < minSize) return null;
-        return { url, isPoster, parsedUrl, ext, buffer };
+        return { url, isPoster, parsedUrl, ext, buffer, catalog };
       }),
     );
     for (const result of results) {
       if (result.status !== "fulfilled" || !result.value) continue;
-      const { url, isPoster, parsedUrl, ext, buffer } = result.value;
+      const { url, isPoster, parsedUrl, ext, buffer, catalog } = result.value;
       try {
-        const prefix = isPoster ? "poster" : "image";
-        const rawName =
-          parsedUrl.pathname
-            .split("/")
-            .pop()
-            ?.replace(/\.[^.]+$/, "") || "";
-        const isMeaningful =
-          rawName.length > 2 &&
-          rawName.length < 50 &&
-          !/^[a-f0-9]{8,}$/i.test(rawName) &&
-          !/^\d+$/.test(rawName) &&
-          !rawName.includes("_next") &&
-          !rawName.includes("?");
-        const slug = isMeaningful ? slugify(rawName) : `${prefix}-${imgIdx}`;
+        // Generate human-readable name from catalog context
+        const slug = deriveAssetName(parsedUrl, catalog, isPoster, imgIdx, usedNames);
         const name = `${slug}${ext}`;
+        usedNames.add(slug);
         const localPath = `assets/${name}`;
         writeFileSync(join(outputDir, localPath), buffer);
         assets.push({ url, localPath, type: "image" });
@@ -303,3 +307,77 @@ function slugify(text: string): string {
     .replace(/^-|-$/g, "")
     .slice(0, 40);
 }
+
+/**
+ * Derive a human-readable filename from catalog context.
+ * Priority: alt text > nearest heading > meaningful URL path > fallback index.
+ */
+function deriveAssetName(
+  parsedUrl: URL,
+  catalog: CatalogedAsset | undefined,
+  isPoster: boolean,
+  idx: number,
+  usedNames: Set<string>,
+): string {
+  const candidates: string[] = [];
+
+  // 1. Alt text / description from catalog
+  if (catalog?.description) {
+    const desc = catalog.description.replace(/[^a-zA-Z0-9 -]/g, "").trim();
+    if (desc.length > 3 && desc.length < 80) candidates.push(desc);
+  }
+
+  // 2. Nearest heading context
+  if (catalog?.nearestHeading) {
+    const heading = catalog.nearestHeading.replace(/[^a-zA-Z0-9 -]/g, "").trim();
+    if (heading.length > 3 && heading.length < 60) candidates.push(heading);
+  }
+
+  // 3. Meaningful URL path segment
+  const rawName =
+    parsedUrl.pathname
+      .split("/")
+      .pop()
+      ?.replace(/\.[^.]+$/, "") || "";
+  const isMeaningful =
+    rawName.length > 2 &&
+    rawName.length < 50 &&
+    !/^[a-f0-9]{8,}$/i.test(rawName) &&
+    !/^\d+$/.test(rawName) &&
+    !rawName.includes("_next") &&
+    !rawName.includes("?");
+  if (isMeaningful) candidates.push(rawName);
+
+  // 4. Section classes as context
+  if (catalog?.sectionClasses) {
+    const classes = catalog.sectionClasses
+      .split(/\s+/)
+      .filter((c) => c.length > 3 && c.length < 30 && !/^(w-|h-|p-|m-|flex|grid|block)/.test(c))
+      .slice(0, 2)
+      .join("-");
+    if (classes.length > 3) candidates.push(classes);
+  }
+
+  // Pick the best candidate
+  const prefix = isPoster ? "poster" : catalog?.aboveFold ? "hero" : "image";
+  let slug = "";
+
+  for (const c of candidates) {
+    slug = slugify(c);
+    if (slug.length > 3 && !usedNames.has(slug)) break;
+  }
+
+  if (!slug || slug.length <= 3 || usedNames.has(slug)) {
+    slug = `${prefix}-${idx}`;
+  }
+
+  // Deduplicate
+  let final = slug;
+  let suffix = 2;
+  while (usedNames.has(final)) {
+    final = `${slug}-${suffix}`;
+    suffix++;
+  }
+
+  return final;
+}
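The candidate selection and suffixing at the end of `deriveAssetName` can be tried in isolation. This is a reduced sketch, not the committed function: `pickName` is a hypothetical name, the candidate list is passed in directly instead of being built from a `CatalogedAsset`, the name is reserved inside the helper (the diff's caller does `usedNames.add(slug)` itself), and the first two steps of `slugify` are assumed because only its last two steps are visible in the hunk above.

```typescript
// Assumed shape of slugify: only `.replace(/^-|-$/g, "").slice(0, 40)` is
// visible in the diff; the lowercase and non-alphanumeric steps are guesses.
function slugify(text: string): string {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-|-$/g, "")
    .slice(0, 40);
}

// Hypothetical reduction of the tail of deriveAssetName.
function pickName(candidates: string[], fallback: string, used: Set<string>): string {
  let slug = "";
  // First candidate whose slug is long enough and not taken wins.
  for (const c of candidates) {
    slug = slugify(c);
    if (slug.length > 3 && !used.has(slug)) break;
  }
  // No usable candidate: fall back to the indexed name (e.g. "image-3").
  if (!slug || slug.length <= 3 || used.has(slug)) slug = fallback;
  // Suffix with -2, -3, ... until the name is unique.
  let final = slug;
  let suffix = 2;
  while (used.has(final)) {
    final = `${slug}-${suffix}`;
    suffix++;
  }
  used.add(final); // simplification: the real caller adds the name itself
  return final;
}

const used = new Set<string>();
const context = ["Dashboard screenshot", "Features"];
console.log(pickName(context, "image-0", used)); // dashboard-screenshot
console.log(pickName(context, "image-1", used)); // features
console.log(pickName(context, "image-2", used)); // image-2
console.log(pickName([], "image-2", used)); //      image-2-2
```

When every candidate is already taken, the function drops to the indexed fallback rather than suffixing the descriptive name, so `-2` style suffixes appear only on fallback names that collide.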
