feat(browse): add record command for video evidence of interactive bug repros #1483
Open
itstimwhite wants to merge 2 commits into
added 2 commits
May 13, 2026 20:10
…bug repros

Wraps Playwright's `BrowserContext.recordVideo` option behind a `record start|stop|status` subcommand. Useful when an interactive bug needs a timing-faithful repro that a screenshot can't capture: form submission order, async loading state flicker, drag/drop, scroll-triggered behavior, focus management, dialog timing.

`record start [path] [--size WxH]`
  Saves session state, recreates the context with recordVideo enabled (path defaults to a timestamped dir under TEMP_DIR), restores state. Calling `start` while already recording auto-stops the prior recording (single-recording invariant).

`record stop`
  Collects each live page's `video()` path, clears the recordVideo flag, rebuilds the context (which flushes the .webm to disk), and returns the paths. No-op when not recording.

`record status`
  Prints the active recording directory or `Not recording.`

Implementation hooks into the existing `recreateContext()` save/restore path, so cookies, URLs, and open pages survive both `start` and `stop`. Headed mode rejects with a clear error (the user can use their OS's screen recorder for headed sessions). Output paths pass through the standard `validateOutputPath` policy so the command can't write outside SAFE_DIRECTORIES.

9 integration tests cover: status before start, no-op stop, start → activity → stop produces a non-empty WebM (verified by magic-byte check), browser remains functional after stop (state preserved across context recreate), auto-stop on double-start, malformed flag rejection, malformed `--size` rejection, missing action error, unknown subaction error.
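For readers who want the command semantics in one place, here is a minimal sketch of the argument handling and the single-recording invariant described in this commit message. It is not the PR's code: `parseStartArgs`, `RecordingState`, `StartArgs`, `VideoSize`, the usage string, and the `Recording to …` status text are all hypothetical names chosen for illustration, and the real command also recreates the browser context, which this sketch leaves out.

```typescript
// Hypothetical sketch: none of these names are taken from the PR's source files.

interface VideoSize {
  width: number;
  height: number;
}

interface StartArgs {
  dir?: string; // optional positional output path
  size?: VideoSize; // optional --size WxH
}

/** Parse `[path] [--size WxH]`, throwing a usage error on malformed input. */
function parseStartArgs(args: string[]): StartArgs {
  const usage = "Usage: record start [path] [--size WxH]";
  const parsed: StartArgs = {};
  const rest = [...args];
  while (rest.length > 0) {
    const arg = rest.shift() as string;
    if (arg === "--size") {
      // Require two positive integers separated by a lowercase "x".
      const match = /^([1-9]\d*)x([1-9]\d*)$/.exec(rest.shift() ?? "");
      if (match === null) throw new Error(usage);
      parsed.size = { width: Number(match[1]), height: Number(match[2]) };
    } else if (arg.startsWith("--")) {
      throw new Error(usage); // unknown flag
    } else if (parsed.dir === undefined) {
      parsed.dir = arg;
    } else {
      throw new Error(usage); // more than one positional argument
    }
  }
  return parsed;
}

/** Tracks at most one active recording directory. */
class RecordingState {
  private activeDir: string | null = null;

  /** Start recording into `dir`; returns the directory of any auto-stopped prior recording. */
  start(dir: string): string | null {
    const previous = this.stop();
    this.activeDir = dir;
    return previous;
  }

  /** Stop recording; returns the stopped directory, or null (a no-op) when idle. */
  stop(): string | null {
    const stopped = this.activeDir;
    this.activeDir = null;
    return stopped;
  }

  /** Report the active directory, or the idle message quoted in the commit message. */
  status(): string {
    return this.activeDir === null ? "Not recording." : `Recording to ${this.activeDir}`;
  }
}
```

Routing `start` through `stop` keeps the invariant in one place: there is never a code path that overwrites an active recording without first going through the stop logic.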
Adds a 'Record video evidence for interactive bug repros' section to browse/SKILL.md.tmpl right after the retina-screenshot section, explaining the per-context recording model, the ref-invalidation caveat across start and stop, and when video is the right evidence shape vs an annotated screenshot. Regenerates browse/SKILL.md, SKILL.md, and gstack/llms.txt via `bun run gen:skill-docs --host all`. No host-specific outputs change beyond the new command row in the COMMAND_REFERENCE tables.
Why
Static screenshots can't capture the timing of a UI bug. Form-submission order, async loading flicker, drag/drop, scroll-triggered behavior, focus management, dialog timing: these all benefit from a few seconds of video. Annotated screenshots stay the right answer for static bugs (typo, clipped UI, missing element), but interactive bugs deserve a `.webm` so the developer fixing them can see the exact repro shape.

`agent-browser` (Vercel Labs) ships a `dogfood` skill that records video evidence as part of its structured QA workflow. I wanted the same evidence shape available behind `$B`, including from inside `/qa` and `/qa-only`, so this PR adds the primitive at the browse layer and leaves the QA-side flag for a follow-up.

What
A new `record` meta-command with three subactions: `start`, `stop`, and `status`.

Wraps Playwright's `BrowserContext.recordVideo` option. Implementation hooks into the existing `recreateContext()` save/restore path, so cookies, URLs, and open pages survive both start and stop. `@e` refs are invalidated by the recreate (same caveat as `viewport --scale`); `record stop` prints a hint reminding you to re-snapshot.

Defaults output to a timestamped subdirectory under `TEMP_DIR` so concurrent recordings never collide. Custom paths pass through the standard `validateOutputPath` policy.

Single-recording invariant: calling `record start` while already recording auto-stops the prior recording. Headed mode rejects with a clear error (the user can use their OS screen recorder in headed sessions).

Tests
9 integration tests in `browse/test/record.test.ts`:
- `record status` reports not-recording before start
- `record stop` is a no-op when not recording
- `record start` → activity → `record stop` produces a non-empty `.webm` (verified by reading the EBML magic bytes `0x1A 0x45 0xDF 0xA3`)
- `record start` while already recording auto-stops the prior recording
- Malformed `--size`, missing action, and unknown subaction all reject with usage errors

All passing locally.
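The magic-byte check mentioned in the test list can be sketched as follows. This is an illustration under assumptions, not the code in `record.test.ts`: `startsWithEbmlMagic` and `looksLikeWebm` are hypothetical helper names. Only the four byte values come from the PR description.

```typescript
import { closeSync, openSync, readSync } from "node:fs";

// EBML header ID, as quoted in the PR description: 0x1A 0x45 0xDF 0xA3.
const EBML_MAGIC = Uint8Array.of(0x1a, 0x45, 0xdf, 0xa3);

/** True when `bytes` begins with the four EBML magic bytes. */
function startsWithEbmlMagic(bytes: Uint8Array): boolean {
  return (
    bytes.length >= EBML_MAGIC.length &&
    EBML_MAGIC.every((expected, index) => bytes[index] === expected)
  );
}

/** Read only the first four bytes of a file and test them (hypothetical helper). */
function looksLikeWebm(filePath: string): boolean {
  const fd = openSync(filePath, "r");
  try {
    const header = new Uint8Array(EBML_MAGIC.length);
    const bytesRead = readSync(fd, header, 0, header.length, 0);
    return startsWithEbmlMagic(header.subarray(0, bytesRead));
  } finally {
    closeSync(fd);
  }
}
```

Note that the EBML magic identifies any EBML container (Matroska as well as WebM), so a check like this confirms the recorder wrote a real container header rather than proving the file is specifically WebM.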
Existing `browse/test/commands.test.ts` (223 tests) still green.

The pre-existing `snapshot.test.ts` flake (closetab last tab auto-creates new) reproduces on `main` without these changes; not related.

Commits (bisect-friendly)
1. `feat(browse): add record command …`: implementation only (`browser-manager.ts`, `commands.ts`, `meta-commands.ts`, new `record.test.ts`). 352 insertions.
2. `docs(browse): document record command in SKILL.md.tmpl + regenerate`: `SKILL.md.tmpl` change + regenerated `browse/SKILL.md`, top-level `SKILL.md`, `gstack/llms.txt`. 67 insertions.

Template change and generated-doc regeneration intentionally split, per the gstack contributor guidance.

Out of scope
- `NEW_IN_VERSION: 'record': '1.35.0.0'` so the unknown-command hint works the moment this lands; adjust if you cut a different version.
- `/qa --evidence-per-finding` flag (the QA-side complement): separate PR coming, since it's a template-only change to two SKILL.md.tmpl files and is easier to review on its own.
- `.webm` files. They're evidence; outliving the session is the point.