feat(engine,cli): drawElementImage fast-capture behind --experimental-fast-capture#1295
feat(engine,cli): drawElementImage fast-capture behind --experimental-fast-capture#1295vanceingalls wants to merge 8 commits into
Conversation
This stack of pull requests is managed by Graphite. Learn more about stacking. |
e8cec72 to
d2ef1f2
Compare
miguel-heygen
left a comment
There was a problem hiding this comment.
Solid experimental feature. The fallback routing logic (SwiftShader detection, transparent + GPU vs Docker paths, supersampling guard, page-side compositing auto-disable) is well thought through, and the format-aware JPEG/PNG encode choice is the right call — I can see from the PR description that the PNG-vs-JPEG bug was caught by the e2e run, which is exactly why it matters. A few small gaps worth addressing before this graduates out of --experimental.
P2 — captureDrawElementFrame: split(",") is fragile
packages/engine/src/services/drawElementService.ts, the base64 extraction:
const base64 = dataUrl.split(",")[1];Base64 can't contain commas, so this is practically safe for a toDataURL response — but the intent is to grab everything after the first comma, and split(",")[1] silently truncates if the payload ever has one. Prefer:
const commaIdx = dataUrl.indexOf(",");
if (commaIdx === -1) throw new Error("drawElement: toDataURL returned malformed data URL");
const base64 = dataUrl.slice(commaIdx + 1);Worth fixing before graduation: it's one line and removes any ambiguity.
P3 — BeginFrame response discarded in drawelement branch
packages/engine/src/services/frameCapture.ts, captureFrameCore:
await client.send("HeadlessExperimental.beginFrame", { ... });
// response ignoredDiscarding the response is intentional (we don't rely on BeginFrame for the screenshot here), but beginFrameHasDamageCount/beginFrameNoDamageCount won't increment for drawElement renders. Diagnostics and any tooling that inspects those counters post-render will silently under-report. At minimum log hasDamage so render telemetry stays accurate; ideally update the session counters so the existing diagnostic surface works.
P3 — injectDrawElementCanvas idempotency path uncovered
drawElementService.test.ts mocks page.evaluate but never exercises the early-return branch (document.getElementById("__hf_de_canvas") already exists). Add a mock test: call the function twice and assert page.evaluate was called only once.
P3 — Supersampling fallback untested
The initDrawElementOrTransparentBackground path for useDrawElement=true + deviceScaleFactor>1 (logs the warning and falls back to screenshot capture) has no unit coverage. Since the function is private, this would need a thin integration shim or an indirect test via initializeSession. Minor, but worth noting before promotion.
Cloud/Lambda follow-up is fine — the flag silently no-ops there and the description is clear about it.
→ Approve with comments. Fallback logic and encoding are correct; the gaps above are small and appropriate to address before the flag loses the --experimental prefix.
james-russo-rames-d-jusso
left a comment
There was a problem hiding this comment.
Reviewed at d2ef1f2d. Big experimental capture-mode add (449/23, 10 files). Net: architecture is sound, opt-in plumbing is correct, the SwiftShader and page-side-compositing incompatibility cases are handled with two layers of defense (resolveConfig force-off + per-frame skip in prepareFrameForCapture), and the PSNR=∞ result vs captureScreenshot is strong evidence the visual output is byte-identical on the harnessed compositions. No blockers from my pass.
Where I have non-blocker concerns: coverage shape (CI's regression-shards run with the flag OFF, so drawelement is not exercised across the composition matrix), the Docker env-only path silently no-opping, and the <canvas layoutsubtree> HTML-in-canvas pattern (worth flagging for Miguel given his prior investigation, though this use case is render-output capture rather than interactive UI).
Concerns
-
Regression suite runs with fast-capture OFF. The PSNR=∞ result + e2e
drawElement canvas injectedlog come from a css-spinner → mp4 run plus the SwiftShader T1/T2 harness. The CI'sregression-shardsmatrix is great for verifying nothing breaks when the flag is off but doesn't exercise the newdrawelementmode across the diverse compositions (sub-compositions, iframes, HDR, font-variation, raf-ball, etc.). The composition-root reparenting ininjectDrawElementCanvas(parent.insertBefore(canvas, root); canvas.appendChild(root);) changes the root's layout-parent — compositions withposition: fixeddescendants, viewport-relative units referencing the prior ancestor, or that contain iframes (theiframe-render-compatstyle is the canonical case) could subtly diverge in a way the single-composition harness wouldn't catch. Is the plan to add a fast-capture variant of the regression matrix in a follow-up before promoting beyondEXPERIMENTAL, or is the PSNR=∞ on css-spinner considered representative? Either is defensible — just want to set expectations on coverage. -
Docker render silently drops env-only opt-in.
buildDockerRunArgsbuilds positional argv only — it does not forward host env vars to the container. The CLI plumbing atrender.ts:610usesargs["experimental-fast-capture"] === true, which isfalsefor an absent flag even whenPRODUCER_EXPERIMENTAL_FAST_CAPTURE=trueis set in the host shell. So an operator who sets only the env var and runshyperframes renderagainst the Docker path gets a normal screenshot render. The flag description ("Env: PRODUCER_EXPERIMENTAL_FAST_CAPTURE.") doesn't qualify that the env fallback is in-process only. This is consistent with the--low-memory-modeidiom the PR cites, so it may be intentional — but the symmetric--low-memory-modedescription has the same gap. Two reasonable resolutions: (a) document that the env fallback is in-process only and Docker needs the CLI flag, or (b) read env in the Docker option-prep path and synthesize the flag when present. Either is fine; (a) is cheaper if the existing pattern is the intent. -
<canvas layoutsubtree>is HTML-in-canvas. Surfacing for Miguel given his prior investigation on avoiding the pattern. This use case is render-output capture (no DOM events traverse the canvas, no a11y surface, no interactive UI nested inside), so the concerns from that thread probably don't apply directly — but worth a 👀 from him given the team bias against the pattern shape.
Nits
-
captureDrawElementFramehard-codes the JPEG quality default at 80 (drawElementService.ts:106). The downstream caller iscaptureFrameCorewhich passesoptions.quality ?? 80. Ifoptions.qualityis ever undefined from an alternate entry point, the default could diverge frompageScreenshotCapture's. Probably fine — the integration harness exercises the real call chain — but a one-line "tracks pageScreenshotCapture default" comment would pin the parity. -
logInitPhaseinterpolatessession.captureModeat the time of log (frameCapture.ts:997). Pre-helper phases log[initSession:beginframe]or[initSession:screenshot]; the new helper flips todrawelement, and post-helper phases log[initSession:drawelement]. ThedrawElement canvas injectedlog fires after the flip, which is the clear signal — but a future log-reader chasing a render issue may misread the prefix change mid-stream as a session restart. A one-line comment onlogInitPhasecalling out the mid-init flip would save 10 minutes of confused log-grepping later. -
drawElementService.integration.test.tsis adescribe.skipdocumenting what was validated locally. Discoverable but adds zero CI coverage and can drift silently. Two paths: (a) leave as a validation record (current shape, fine as a doc), (b) gate behind aRUN_BROWSER_INTEGRATION_TESTS=1env and add a nightly workflow that flips it on. Your call — current shape is honest about what's validated, so this is preference territory. -
render.ts:610uses=== truefor the inline option pass;render.ts:408-412uses!= nullfor the env override. Asymmetry is correct (env preserves on silence, option forwardsfalseon silence), and it tracks the--low-memory-modeidiom — but a one-liner on the option-pass line explaining why the two checks differ would help the next person threading a new flag.
Questions
-
Composition coverage of the harness. Was the validation harness exercised against sub-composition (
sub-composition-video), iframe (iframe-render-compat), HDR (hdr-regression), or font-variation compositions, or only css-spinner? The composition-root reparenting is the place I'd want fidelity evidence on those styles before flipping the flag on for any production-ish surface. -
Cloud / Lambda env state. The PR body says
cloud/render.tsandlambda/render.ts"silently no-op" — butresolveConfigreadsPRODUCER_EXPERIMENTAL_FAST_CAPTUREglobally, so if those infra surfaces happen to have the env var set (via deploy config), fast capture does activate. Is the "silently no-op" claim about the CLI flag wiring specifically (env-via-deploy-config is a supported operator escape hatch until the follow-up PR lands), or should those surfaces actively gate the env off until then? -
macOS path. On Mac,
headlessShellis null →preModeresolves to"screenshot"regardless. The new helper then flipscaptureModeto"drawelement"if useDrawElement is on and no SwiftShader. Per-frame branch gates BeginFrame onbeginFrameTimeTicks > 0, which stays 0 in the screenshot-launched path — so the compositor advances naturally and the comment atframeCapture.ts:1402-1404matches the behavior. Has the validation harness exercised macOS GPU, or only the Docker SwiftShader cases + the GPU-host T3?
What I didn't verify
-
Windows render compat.
Render on windows-latest+Tests on windows-latestare still in-flight at review time.drawelementdoesn't gate on platform, so a Windows session with the flag set would land in the same screenshot-launched + drawelement-mode path as macOS. If Windows CI surfaces something, this'll need a look. -
useDrawElement: truewithforceScreenshot: truesimultaneously.resolveConfigdoesn't appear to reconcile these —forceScreenshotis honored atpreModeresolution,useDrawElementis honored at init-time. A session could land in drawelement mode on a screenshot-launched browser. That's actually the design for macOS GPU, so it's not necessarily wrong — flagging as "haven't dug into the operator-confusion surface." -
--enable-features=CanvasDrawElementinteraction with multiple feature-flag groups. Verified the flag is added globally inbuildChromeArgs:554and is the only--enable-features=on the command line (the other flag is--disable-features=…which is a separate category). Chrome's behavior for multiple--enable-features=flags is version-dependent, but since there's only one, this is unambiguous. -
The integration record's PSNR=∞ claims — taken at the integration test file's word.
Clean execution; leaving as a comment.
— Review by Rames D Jusso
d2ef1f2 to
7599a75
Compare
| runs-on: ubuntu-latest | ||
| steps: | ||
| - uses: actions/checkout@v4 | ||
|
|
||
| - name: Set up Docker Buildx | ||
| uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # v3 | ||
|
|
||
| - name: Build test Docker image (cached) | ||
| uses: docker/build-push-action@10e90e3645eae34f1e60eeb005ba3a3d33f178e8 # v6 | ||
| with: | ||
| context: . | ||
| file: Dockerfile.test | ||
| load: true | ||
| tags: hyperframes-producer:test | ||
| cache-from: type=gha,scope=regression-test-image | ||
| cache-to: type=gha,mode=max,scope=regression-test-image | ||
|
|
||
| - name: Validate fast-capture video (drawElement + BeginFrame) | ||
| run: | | ||
| docker run --rm \ | ||
| --security-opt seccomp=unconfined \ | ||
| --shm-size=4g \ | ||
| -e PRODUCER_VALIDATE_COMP='${{ inputs.composition }}' \ | ||
| -e PRODUCER_VALIDATE_MIN_PSNR='${{ inputs.min_psnr }}' \ | ||
| --workdir /app/packages/producer \ | ||
| --entrypoint bunx \ | ||
| hyperframes-producer:test tsx scripts/validate-fast-video.ts |
…-fast-capture Add an experimental frame-capture mode that reads DOM paint records directly via Chrome's canvas.drawElementImage API instead of Page.captureScreenshot (~46% faster on GPU), gated behind --experimental-fast-capture (env PRODUCER_EXPERIMENTAL_FAST_CAPTURE; engine config useDrawElement). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Renders a video composition baseline-vs-fast on a native amd64 Linux runner and asserts the fast (drawElementImage) output matches via PSNR — validating the BeginFrame paint path captures video, which couldn't be checked locally (macOS has no BeginFrame; Docker-on-rosetta hung). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Reverted before merge; workflow_dispatch is the intended trigger. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…pported Validated on a native amd64 Linux runner that per-frame BeginFrame does NOT make drawElementImage capture video (fast-vs-baseline ~12 dB, region black) — same as macOS. So video falls back to screenshot on every platform, not just macOS. Make the validation workflow manual-only (it fails by design until fast video is implemented). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
0b5eb1f to
153602d
Compare
Adds an opaque GSAP composition rendered with the experimental fast-capture path (drawElementImage) via a new renderConfig.experimentalFastCapture meta flag. Golden is drawElement output, so the suite now guards the canvas-injection / drawElement capture path on the Linux/Docker CI platform. Verified passing (visual PSNR 33-76 dB, all checkpoints). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
| <head> | ||
| <meta charset="UTF-8"> | ||
| <meta name="viewport" content="width=1080, height=1080"> | ||
| <script src="https://cdnjs.cloudflare.com/ajax/libs/gsap/3.12.2/gsap.min.js"></script> |
…e opt-out Fast capture on hosts without BeginFrame (macOS) previously drew from a stale snapshot — drawElementImage reads the snapshot recorded at the last paint event, so unsynchronized captures were one frame behind and intermittently crashed with 'InvalidStateError: No cached paint record'. Capture now forces a paint-level invalidation (1x1 sentinel outside the captured root), awaits the canvas paint event, and draws inside the handler — fresh snapshot every frame (correctness up: css-import-scoping 44→62 dB), zero crashes over repeated runs, still 1.33x faster than screenshot. Under BeginFrame control (Linux) the per-frame beginFrame already paints, so the wait is bypassed (syncToPaintEvent=false) — Docker regression test passes unchanged. Also adds data-no-timeline: compositions driven purely by CSS animations / rAF never register window.__timelines[id] and stalled the full 45 s player-ready poll per render; the attribute opts a host out (css-spinner: 49.7s → 3.9s). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Probing established the real failure mode: drawElementImage drops layer-promoted subtrees while their transforms animate (sub-composition entrance/zoom scenes capture black until settled). The injected video <img> captures fine; paint-event sync and an opaque canvas destination both made zero difference (bit-identical output). Chromium-side gap, same family as crbug 521434899 but reproducible on GPU. HF_FAST_CAPTURE_VIDEO=true bypasses the gate for future probing. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Summary
Adds an experimental frame-capture mode that reads DOM paint records directly via Chrome's
canvas.drawElementImageAPI instead ofPage.captureScreenshot. On DOM/CSS/GSAP compositions it is significantly faster on both platforms. It is fully opt-in behind a new CLI flag — default behaviour is byte-unchanged.Benchmark — drawElement-only comps (14/20; excl. style-7/8/10/15 video-gated + raf-ball compat-route + iframe-render-compat rAF-sync):
Total wall-clock: macOS 358s → 324s (1.11×), Docker 987s → 853s (1.16×). The 6 excluded comps (4 video-gated → baseline route, 1 compat-forced screenshot, 1 iframe rAF-sync) measure init-overhead savings only — their fast renders take the same capture mode as baseline. Including them dilutes the drawElement speedup signal. Total wall-clock includes all 20 comps.
How it works
"drawelement"(alongside"beginframe"/"screenshot"). A<canvas layoutsubtree>is injected around the composition root after page-ready; each frame clears the canvas, callsdrawElementImage(root, 0, 0), and reads it back.resolveConfigauto-disablesenablePageSideCompositingwhen fast-capture is on (shader transitions fall back to the Node-side layered blend).Fixes
Paint-event sync +
No cached paint recordcrash (71b0eb77)drawElementImagedraws from a snapshot recorded at the paint event. Called outside one it returns the previous frame's snapshot (silent stale output), or throwsInvalidStateError: No cached paint recordwhen no paint has landed. Fix: force a paint-level invalidation via a 1×1 sentinel outside the captured root, await the canvaspaintevent, draw inside the handler. Under BeginFrame control (Linux headless-shell) the sync is skipped — the per-frameHeadlessExperimental.beginFramealready paints a fresh snapshot.45 s init stall on CSS/rAF-only compositions (
71b0eb77)Compositions driven purely by CSS animations / rAF never register
window.__timelines[id], burning the full 45 s poll timeout. Hosts now opt out withdata-no-timelineon[data-composition-id]. Measured: css-spinner 49.7 s → 3.9 s.Body/HTML background lost in fast capture (
223c4752)drawElementImageonly paints the captured subtree — ancestor backgrounds on<body>and<html>(outside the composition root) were invisible. Fix: walk ancestors before each frame and fill the canvas with the first non-transparentbackground-colorfound. Six comps went from 23–31 dB → 53–71 dB.WebGL / WebGPU / 2D canvas content frozen at frame 0 (
ec037e2a)GPU-accelerated canvases present via compositor texture swap; their paint records freeze at the first frame. Fix:
getContextinstrumentation (viaevaluateOnNewDocument) forcespreserveDrawingBuffer: truefor WebGL/WebGPU and tracks all accelerated canvases. Per frame: hide tracked canvases from the DOM, composite their live content viadrawImage, then calldrawElementImage. three.js, WebGL, WebGPU, and custom shaders work on the fast path.Transparent comps in opaque mp4 encoded black (
94c95c05)Compositions with no author background and transparent content in an opaque-format render (e.g. mp4) were captured against black instead of white. Fix: white-fill the canvas before
drawElementImagewhen no ancestor background is found and the output format is JPEG. webm-transparency: 3.4 dB → 61–67 dB.forceScreenshotcompat hints overridden (bf8922cb)Fast capture was ignoring
forceScreenshotcompat hints set by the engine (raw rAF animations, alpha output). Fix:initializeSessionshort-circuits fast capture whencfg.forceScreenshotis set. raf-ball Docker: fully black → ∞ PSNR.Compile-time video gate + BeginFrame liveness probe (
b2612bfc)The runtime
hasVideogate fired after browser launch — on Linux that launch is BeginFrame mode, and the gate's screenshot fallback then calledPage.captureScreenshoton a BeginFrame browser, which hangs by design. The compiler knowsvideoCountbefore launch, so the gate now runs at compile time and disables drawElement only (not forces screenshot), letting the render take the platform's normal baseline route. A newprobeBeginFrameLiveness()probe backstops genuinely-stalling comps. style-7/8/10/15 went from unrenderable on Docker to baseline-speed.Runtime
opacity → autoAlpharewrite (5d53f13c)Stacked
position:absolutecontainers withopacity:0children causedrawElementImageto capture black frames (240/240 deterministic on both macOS GPU and SwiftShader). Workaround:hf-early-stubrewritesopacity→autoAlphain all GSAP tween vars whenwindow.__HF_FAST_CAPTURE_AUTOALPHA__is set.autoAlphasetsvisibility:hiddenat 0, removing elements from the paint tree entirely instead of stacking them as promoted compositor layers. chat comp: 29.4 dB → 49.6 dB, zero blackout frames.Video gate routes to baseline path, not screenshot (
74ee7b60)The compile-time gate (above) was setting
forceScreenshot, but the Linux non-lowmem baseline captures via BeginFrame (~40% faster than screenshot on SwiftShader). Gate now setscfg.useDrawElement = falseonly, letting mode resolution pick the platform baseline route. style-8 Docker: 68s → 46.6s (parity with baseline).CI workflow env vars (
8538a970)validate-fast-videoenv vars were empty on push events; defaults now applied when unset.Fallback gates
Limitations that route specific renders back to a correct baseline path automatically — the only cost is losing the speedup for that render, never a broken frame.
153602d, updatedb2612bfc,74ee7b60) — compositions containing<video>disable drawElement at compile time and take the platform baseline route (BeginFrame on Linux, screenshot on macOS). Root cause: a Chromium opacity-animation bug on stacked transparent layers is proxied by video presence.HF_FAST_CAPTURE_VIDEO=truebypasses both gates (R&D escape hatch).1601c1fe) — SwiftShader drops promoted compositor sub-layers onto transparent canvas destinations. Opaque output on SwiftShader uses drawElement fine; alpha output falls back to screenshot.deviceScaleFactor > 1) —drawElementImagerasterises at CSS pixels with no DPR option. Supersampled renders fall back to screenshot.Wiring
CLI flag → env → engine config:
--experimental-fast-capture→PRODUCER_EXPERIMENTAL_FAST_CAPTURE→EngineConfig.useDrawElement→resolveConfig→ capture session. Threaded through the Docker render path (buildDockerRunArgs) as well.Testing
drawElementService.test.ts,config.test.ts): SwiftShader detection, capture-mode routing, env resolution, page-side auto-disable. Full engine suite green (697 pass).fast-capture-gsap,53a09b7d): producer-level guard — renders a GSAP composition in fast mode and PSNR-checks against the screenshot baseline..github/workflows/fast-video-validation.yml): fast-video PSNR ≥ 25 dB — currently fails by design; passes if/when an upstream Chromium fix lands, serving as the regression gate.🤖 Generated with Claude Code