39 changes: 39 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,44 @@
# Changelog

## [1.18.0.0] - 2026-04-26

## **`browse screenshot` no longer bricks your session by capturing 3000px-tall PNGs.**

The default flipped from full-page to viewport, matching Playwright's own API. Long pages used to silently produce images that exceeded the Anthropic vision API's per-side limit, and the rejected image stayed in conversation history, so every subsequent turn failed again for the same reason. A new `--full-page` flag opts back into the old behavior and warns when the result is too large to send safely.

### What this means for you

If you were taking screenshots of any documentation page, blog post, or feed-style UI, you've probably hit this and not realized why your session went sideways. Now `browse screenshot path.png` captures what's visible. If you genuinely want the whole scroll, `--full-page` restores the old behavior and warns when either side of the resulting PNG exceeds 1800px (headroom under the ~2000px limit), so you can switch to `--clip` or `--selector` instead. `--viewport` still works as a back-compat alias.

### The numbers that matter

Captured against a fixture with ~3000px scroll height, viewport 1280×720:

| Mode | PNG height | Vision API verdict |
|---|---|---|
| Before (default `fullPage:true`) | ~3700px | Rejected (>2000px) |
| After (default viewport) | ~720px | Accepted |
| After (`--full-page`) | ~3700px | Rejected, but `[browse] warning` line tells the agent why |

Reported by @raffoz in #1214 with a clean reproducer and three proposed options. This release goes with the issue author's vote: option 1 (post-capture warn) plus option 2 (flip the default). Annotated and heatmap snapshot modes still default to full-page, since those are debug UI for humans, not images fed to agents via `Read`.
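
The verdicts in the table above can be sketched as a small classifier. This is only an illustration: `classifyCapture`, `WARN_PX`, and `LIMIT_PX` are hypothetical names, and the 2000px figure is the approximate limit quoted in this entry rather than an exact API constant. Note that the actual warning fires for anything above 1800px, so both `warn` and `over-limit` below would produce the `[browse] warning` line.

```typescript
// Thresholds taken from this changelog entry; the names are hypothetical.
const WARN_PX = 1800;  // warning threshold used for --full-page captures
const LIMIT_PX = 2000; // approximate per-side limit cited for the vision API

type Verdict = 'ok' | 'warn' | 'over-limit';

/** Classify a capture by its longest side. */
function classifyCapture(width: number, height: number): Verdict {
  const longest = Math.max(width, height);
  if (longest > LIMIT_PX) return 'over-limit';
  if (longest > WARN_PX) return 'warn';
  return 'ok';
}

console.log(classifyCapture(1280, 720));  // default viewport capture
console.log(classifyCapture(1280, 1900)); // inside the headroom band
console.log(classifyCapture(1280, 3700)); // full-page capture of the tall fixture
```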

### Itemized changes

#### Changed
- `browse screenshot` defaults to viewport-only capture, matching Playwright's `page.screenshot()` API. Pass `--full-page` to opt into the previous behavior. (#1214)
- Output message no longer carries `(viewport)` suffix in the default case; only `--full-page` gets a `(full-page)` suffix.

#### Added
- `--full-page` flag for explicit full-scroll-height captures.
- Oversize warning: any `--full-page` capture whose PNG width or height exceeds ~1800px emits `[browse] warning: …exceeds ~2000px Anthropic vision API limit…` on stderr. Capture still succeeds; the warning is informational.

#### Fixed
- `--viewport` is preserved as a no-op back-compat alias (now the default) so existing scripts continue to work.

#### For contributors
- New `browse/test/fixtures/tall.html` (~3000px scroll) feeds the regression coverage.
- Five new test cases in `browse/test/commands.test.ts`: default-viewport regression for #1214, `--full-page` capture height verification, oversize warning emission, and the two mutually-exclusive flag combinations (`--full-page + --clip`, `--full-page + selector`).
- PNG dimensions are read from the IHDR chunk (zero deps, ~10 lines in `meta-commands.ts`).

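
For exercising dimension checks without a browser, one option is to synthesize just the first 24 bytes of a PNG. The sketch below is an assumption-laden illustration, not code from this PR: `makePngHeader` and `readPngSize` are hypothetical helpers, and the output is a header only, not a decodable image.

```typescript
// Hypothetical helpers: build and parse only the first 24 bytes of a PNG
// (8-byte signature, IHDR length, "IHDR" type, width, height).
const PNG_SIGNATURE = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];

function makePngHeader(width: number, height: number): Uint8Array {
  const bytes = new Uint8Array(24);
  bytes.set(PNG_SIGNATURE, 0);
  const view = new DataView(bytes.buffer);
  view.setUint32(8, 13, false);            // IHDR data length is always 13
  bytes.set([0x49, 0x48, 0x44, 0x52], 12); // ASCII "IHDR"
  view.setUint32(16, width, false);        // big-endian width at offset 16
  view.setUint32(20, height, false);       // big-endian height at offset 20
  return bytes;
}

function readPngSize(bytes: Uint8Array): { width: number; height: number } | null {
  if (bytes.length < 24) return null;
  for (let i = 0; i < PNG_SIGNATURE.length; i++) {
    if (bytes[i] !== PNG_SIGNATURE[i]) return null;
  }
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  return { width: view.getUint32(16, false), height: view.getUint32(20, false) };
}

const size = readPngSize(makePngHeader(1280, 3700));
console.log(size);
```

Using `Uint8Array` and `DataView` keeps the sketch free of Node-specific types; the real helper works on a Node `Buffer` instead.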
## [1.15.0.0] - 2026-04-26

## **Real-PTY test harness ships. 11 plan-mode E2E tests, 23 unit tests, and 50K fewer tokens per invocation.**
2 changes: 1 addition & 1 deletion SKILL.md
@@ -859,7 +859,7 @@ Refs are invalidated on navigation — run `snapshot` again after `goto`.
| `pdf [path] [--format letter|a4|legal] [--width <dim> --height <dim>] [--margins <dim>] [--margin-top <dim> --margin-right <dim> --margin-bottom <dim> --margin-left <dim>] [--header-template <html>] [--footer-template <html>] [--page-numbers] [--tagged] [--outline] [--print-background] [--prefer-css-page-size] [--toc] [--tab-id <N>] | pdf --from-file <payload.json> [--tab-id <N>]` | Save the current page as PDF. Supports page layout (--format, --width, --height, --margins, --margin-*), structure (--toc waits for Paged.js), branding (--header-template, --footer-template, --page-numbers), accessibility (--tagged, --outline), and --from-file <payload.json> for large payloads. Use --tab-id <N> to target a specific tab. |
| `prettyscreenshot [--scroll-to sel|text] [--cleanup] [--hide sel...] [--width px] [path]` | Clean screenshot with optional cleanup, scroll positioning, and element hiding |
| `responsive [prefix]` | Screenshots at mobile (375x812), tablet (768x1024), desktop (1280x720). Saves as {prefix}-mobile.png etc. |
| `screenshot [--selector <css>] [--viewport] [--clip x,y,w,h] [--base64] [selector|@ref] [path]` | Save screenshot. --selector targets a specific element (explicit flag form). Positional selectors starting with ./#/@/[ still work. |
| `screenshot [--selector <css>] [--viewport] [--full-page] [--clip x,y,w,h] [--base64] [selector|@ref] [path]` | Save screenshot. Defaults to viewport-only (matches Playwright). --full-page captures the whole scroll height; warns when either side exceeds 1800px (headroom under the ~2000px Anthropic vision API limit). --selector targets an element (explicit flag); positional selectors starting with ./#/@/[ still work. --viewport is accepted for back-compat (now the default). |

### Snapshot
| Command | Description |
2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
1.15.0.0
1.18.0.0
2 changes: 1 addition & 1 deletion browse/SKILL.md
@@ -783,7 +783,7 @@ $B prettyscreenshot --cleanup --scroll-to ".pricing" --width 1440 ~/Desktop/hero
| `pdf [path] [--format letter|a4|legal] [--width <dim> --height <dim>] [--margins <dim>] [--margin-top <dim> --margin-right <dim> --margin-bottom <dim> --margin-left <dim>] [--header-template <html>] [--footer-template <html>] [--page-numbers] [--tagged] [--outline] [--print-background] [--prefer-css-page-size] [--toc] [--tab-id <N>] | pdf --from-file <payload.json> [--tab-id <N>]` | Save the current page as PDF. Supports page layout (--format, --width, --height, --margins, --margin-*), structure (--toc waits for Paged.js), branding (--header-template, --footer-template, --page-numbers), accessibility (--tagged, --outline), and --from-file <payload.json> for large payloads. Use --tab-id <N> to target a specific tab. |
| `prettyscreenshot [--scroll-to sel|text] [--cleanup] [--hide sel...] [--width px] [path]` | Clean screenshot with optional cleanup, scroll positioning, and element hiding |
| `responsive [prefix]` | Screenshots at mobile (375x812), tablet (768x1024), desktop (1280x720). Saves as {prefix}-mobile.png etc. |
| `screenshot [--selector <css>] [--viewport] [--clip x,y,w,h] [--base64] [selector|@ref] [path]` | Save screenshot. --selector targets a specific element (explicit flag form). Positional selectors starting with ./#/@/[ still work. |
| `screenshot [--selector <css>] [--viewport] [--full-page] [--clip x,y,w,h] [--base64] [selector|@ref] [path]` | Save screenshot. Defaults to viewport-only (matches Playwright). --full-page captures the whole scroll height; warns when either side exceeds 1800px (headroom under the ~2000px Anthropic vision API limit). --selector targets an element (explicit flag); positional selectors starting with ./#/@/[ still work. --viewport is accepted for back-compat (now the default). |

### Snapshot
| Command | Description |
2 changes: 1 addition & 1 deletion browse/src/commands.ts
@@ -135,7 +135,7 @@ export const COMMAND_DESCRIPTIONS: Record<string, { category: string; descriptio
'scrape': { category: 'Extraction', description: 'Bulk download all media from page. Writes manifest.json', usage: 'scrape <images|videos|media> [--selector sel] [--dir path] [--limit N]' },
'archive': { category: 'Extraction', description: 'Save complete page as MHTML via CDP', usage: 'archive [path]' },
// Visual
'screenshot': { category: 'Visual', description: 'Save screenshot. --selector targets a specific element (explicit flag form). Positional selectors starting with ./#/@/[ still work.', usage: 'screenshot [--selector <css>] [--viewport] [--clip x,y,w,h] [--base64] [selector|@ref] [path]' },
'screenshot': { category: 'Visual', description: 'Save screenshot. Defaults to viewport-only (matches Playwright). --full-page captures the whole scroll height; warns when either side exceeds 1800px (headroom under the ~2000px Anthropic vision API limit). --selector targets an element (explicit flag); positional selectors starting with ./#/@/[ still work. --viewport is accepted for back-compat (now the default).', usage: 'screenshot [--selector <css>] [--viewport] [--full-page] [--clip x,y,w,h] [--base64] [selector|@ref] [path]' },
'pdf': { category: 'Visual', description: 'Save the current page as PDF. Supports page layout (--format, --width, --height, --margins, --margin-*), structure (--toc waits for Paged.js), branding (--header-template, --footer-template, --page-numbers), accessibility (--tagged, --outline), and --from-file <payload.json> for large payloads. Use --tab-id <N> to target a specific tab.', usage: 'pdf [path] [--format letter|a4|legal] [--width <dim> --height <dim>] [--margins <dim>] [--margin-top <dim> --margin-right <dim> --margin-bottom <dim> --margin-left <dim>] [--header-template <html>] [--footer-template <html>] [--page-numbers] [--tagged] [--outline] [--print-background] [--prefer-css-page-size] [--toc] [--tab-id <N>] | pdf --from-file <payload.json> [--tab-id <N>]' },
'responsive': { category: 'Visual', description: 'Screenshots at mobile (375x812), tablet (768x1024), desktop (1280x720). Saves as {prefix}-mobile.png etc.', usage: 'responsive [prefix]' },
'diff': { category: 'Visual', description: 'Text diff between pages', usage: 'diff <url1> <url2>' },
73 changes: 65 additions & 8 deletions browse/src/meta-commands.ts
@@ -18,6 +18,49 @@ import { TEMP_DIR } from './platform';
import { resolveConfig } from './config';
import type { Frame } from 'playwright';

/**
* Read PNG dimensions from the IHDR chunk. Returns null if the input is not
* a valid PNG (signature mismatch). The IHDR chunk follows the 8-byte PNG
* signature: 4-byte length, 4-byte "IHDR" type, 4-byte width, 4-byte height,
* each big-endian. We only need width/height, at offsets 16 and 20.
*/
function readPngDimensions(input: Buffer | string): { width: number; height: number } | null {
  let buf: Buffer;
  if (typeof input === 'string') {
    try {
      const fd = fs.openSync(input, 'r');
      try {
        buf = Buffer.alloc(24);
        // A short read means the file cannot hold a full IHDR header.
        if (fs.readSync(fd, buf, 0, 24, 0) < 24) return null;
      } finally {
        // Close the descriptor even if the read throws.
        fs.closeSync(fd);
      }
    } catch {
      return null;
    }
  } else {
    buf = input;
  }
  if (buf.length < 24) return null;
  // PNG signature: 89 50 4E 47 0D 0A 1A 0A (all eight bytes are checked)
  if (buf[0] !== 0x89 || buf[1] !== 0x50 || buf[2] !== 0x4E || buf[3] !== 0x47 ||
      buf[4] !== 0x0D || buf[5] !== 0x0A || buf[6] !== 0x1A || buf[7] !== 0x0A) return null;
  return { width: buf.readUInt32BE(16), height: buf.readUInt32BE(20) };
}

/** Anthropic vision API rejects images with any side > 2000px in many-image
* requests. We warn at 1800 to leave headroom. The screenshot still saves —
* the warning just tells the agent to expect breakage downstream. */
const SCREENSHOT_DIMENSION_WARN_PX = 1800;

function warnIfOversize(input: Buffer | string) {
const dims = readPngDimensions(input);
if (!dims) return;
if (dims.width > SCREENSHOT_DIMENSION_WARN_PX || dims.height > SCREENSHOT_DIMENSION_WARN_PX) {
console.warn(
`[browse] warning: full-page screenshot is ${dims.width}x${dims.height}px — ` +
`approaches or exceeds ~2000px Anthropic vision API limit. ` +
`Consider --viewport (default), --clip x,y,w,h, or --selector for a smaller capture.`
);
}
}

/** Tokenize a pipe segment respecting double-quoted strings. */
function tokenizePipeSegment(segment: string): string[] {
const tokens: string[] = [];
@@ -419,19 +462,28 @@ export async function handleMetaCommand(

// ─── Visual ────────────────────────────────────────
case 'screenshot': {
// Parse priority: flags (--viewport, --clip, --base64) → selector (@ref, CSS) → output path
// Parse priority: flags (--viewport, --full-page, --clip, --base64) → selector (@ref, CSS) → output path
//
// Default is viewport-only. `--full-page` opts in. The previous default
// was fullPage:true, which silently produced PNGs taller than 2000px on
// long pages, exceeding the Anthropic vision API limit and bricking
// sessions when the agent later tried to Read the file (#1214). Matches
// Playwright's own page.screenshot() default of fullPage:false.
// `--viewport` is kept as a back-compat no-op alias for the new default.
const page = bm.getPage();
let outputPath = `${TEMP_DIR}/browse-screenshot.png`;
let clipRect: { x: number; y: number; width: number; height: number } | undefined;
let targetSelector: string | undefined;
let viewportOnly = false;
let fullPage = false;
let base64Mode = false;

const remaining: string[] = [];
let flagSelector: string | undefined;
for (let i = 0; i < args.length; i++) {
if (args[i] === '--viewport') {
viewportOnly = true;
// Back-compat: --viewport is the new default. Accept silently.
} else if (args[i] === '--full-page') {
fullPage = true;
} else if (args[i] === '--base64') {
base64Mode = true;
} else if (args[i] === '--selector') {
@@ -477,8 +529,11 @@ export async function handleMetaCommand(
if (clipRect && targetSelector) {
throw new Error('Cannot use --clip with a selector/ref — choose one');
}
if (viewportOnly && clipRect) {
throw new Error('Cannot use --viewport with --clip — choose one');
if (fullPage && clipRect) {
throw new Error('Cannot use --full-page with --clip — choose one');
}
if (fullPage && targetSelector) {
throw new Error('Cannot use --full-page with a selector/ref — choose one');
}

// --base64 mode: capture to buffer instead of disk
@@ -491,7 +546,8 @@ export async function handleMetaCommand(
} else if (clipRect) {
buffer = await page.screenshot({ clip: clipRect });
} else {
buffer = await page.screenshot({ fullPage: !viewportOnly });
buffer = await page.screenshot({ fullPage });
if (fullPage) warnIfOversize(buffer);
}
if (buffer.length > 10 * 1024 * 1024) {
throw new Error('Screenshot too large for --base64 (>10MB). Use disk path instead.');
@@ -511,8 +567,9 @@ export async function handleMetaCommand(
return `Screenshot saved (clip ${clipRect.x},${clipRect.y},${clipRect.width},${clipRect.height}): ${outputPath}`;
}

await page.screenshot({ path: outputPath, fullPage: !viewportOnly });
return `Screenshot saved${viewportOnly ? ' (viewport)' : ''}: ${outputPath}`;
await page.screenshot({ path: outputPath, fullPage });
if (fullPage) warnIfOversize(outputPath);
return `Screenshot saved${fullPage ? ' (full-page)' : ''}: ${outputPath}`;
}

case 'pdf': {
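
The flag rules in the `screenshot` case above can be summarized in a simplified, standalone model. This is a sketch under stated assumptions, not the handler itself: `parseScreenshotFlags` and `ScreenshotFlags` are hypothetical names, and the model ignores `--base64`, output paths, positional selectors, and `--clip` coordinate parsing.

```typescript
// Hypothetical, simplified model of the flag rules in the `screenshot` case.
interface ScreenshotFlags {
  fullPage: boolean;
  hasClip: boolean;
  hasSelector: boolean;
}

function parseScreenshotFlags(args: string[]): ScreenshotFlags {
  const flags: ScreenshotFlags = { fullPage: false, hasClip: false, hasSelector: false };
  for (let i = 0; i < args.length; i++) {
    switch (args[i]) {
      case '--viewport':
        break; // back-compat no-op: viewport is already the default
      case '--full-page':
        flags.fullPage = true;
        break;
      case '--clip':
        flags.hasClip = true;
        i++; // skip the x,y,w,h value
        break;
      case '--selector':
        flags.hasSelector = true;
        i++; // skip the CSS selector value
        break;
    }
  }
  // Mutually exclusive combinations, mirroring the errors thrown in the diff.
  if (flags.hasClip && flags.hasSelector) {
    throw new Error('Cannot use --clip with a selector/ref');
  }
  if (flags.fullPage && flags.hasClip) {
    throw new Error('Cannot use --full-page with --clip');
  }
  if (flags.fullPage && flags.hasSelector) {
    throw new Error('Cannot use --full-page with a selector/ref');
  }
  return flags;
}

console.log(parseScreenshotFlags(['--viewport', '--clip', '0,0,100,100']));
```

Because `--viewport` no longer sets any state, combining it with `--clip` is accepted here, which matches the updated test expectation in this PR.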
100 changes: 91 additions & 9 deletions browse/test/commands.test.ts
@@ -458,16 +458,100 @@ describe('Visual', () => {
fs.unlinkSync(screenshotPath);
});

test('screenshot --viewport saves viewport-only', async () => {
test('screenshot --viewport (back-compat) saves viewport-only', async () => {
await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm);
const p = '/tmp/browse-test-viewport.png';
const result = await handleMetaCommand('screenshot', ['--viewport', p], bm, async () => {});
expect(result).toContain('Screenshot saved (viewport)');
// --viewport is now the default; flag is a no-op alias kept for back-compat.
expect(result).toContain('Screenshot saved');
expect(result).not.toContain('(full-page)');
expect(fs.existsSync(p)).toBe(true);
expect(fs.statSync(p).size).toBeGreaterThan(1000);
fs.unlinkSync(p);
});

test('screenshot defaults to viewport (regression for #1214)', async () => {
// Regression for #1214: long pages used to default to fullPage:true and
// produce PNGs > 2000px tall, which the Anthropic vision API rejects,
// poisoning every following turn. Default is now viewport-only.
await handleWriteCommand('viewport', ['1280x720'], bm);
await handleWriteCommand('goto', [baseUrl + '/tall.html'], bm);
const p = '/tmp/browse-test-default-viewport.png';
const result = await handleMetaCommand('screenshot', [p], bm, async () => {});
expect(result).toContain('Screenshot saved');
expect(result).not.toContain('(full-page)');
expect(fs.existsSync(p)).toBe(true);
// Read PNG IHDR for dimensions: width @ byte 16, height @ byte 20, big-endian uint32
const fd = fs.openSync(p, 'r');
const buf = Buffer.alloc(24);
fs.readSync(fd, buf, 0, 24, 0);
fs.closeSync(fd);
const height = buf.readUInt32BE(20);
// Viewport is 720px tall; scroll height of tall.html is ~3000px.
// A viewport capture must NOT exceed the viewport.
expect(height).toBeLessThanOrEqual(800);
fs.unlinkSync(p);
});

test('screenshot --full-page captures full scroll height', async () => {
await handleWriteCommand('viewport', ['1280x720'], bm);
await handleWriteCommand('goto', [baseUrl + '/tall.html'], bm);
const p = '/tmp/browse-test-fullpage.png';
const result = await handleMetaCommand('screenshot', ['--full-page', p], bm, async () => {});
expect(result).toContain('Screenshot saved (full-page)');
expect(fs.existsSync(p)).toBe(true);
const fd = fs.openSync(p, 'r');
const buf = Buffer.alloc(24);
fs.readSync(fd, buf, 0, 24, 0);
fs.closeSync(fd);
const height = buf.readUInt32BE(20);
// tall.html has ~15 rows × 200px + h1 ≈ 3000+px scroll height
expect(height).toBeGreaterThan(2000);
fs.unlinkSync(p);
});

test('screenshot --full-page on tall page emits oversize warning', async () => {
// The warning is informational; capture still succeeds. Verifies the
// agent gets a signal that the resulting image will likely be rejected
// downstream by the Anthropic vision API (~2000px limit; warning fires above 1800px).
await handleWriteCommand('viewport', ['1280x720'], bm);
await handleWriteCommand('goto', [baseUrl + '/tall.html'], bm);
const p = '/tmp/browse-test-fullpage-warn.png';
const captured: string[] = [];
const origWarn = console.warn;
console.warn = (...a: any[]) => { captured.push(a.map(String).join(' ')); };
try {
await handleMetaCommand('screenshot', ['--full-page', p], bm, async () => {});
} finally {
console.warn = origWarn;
}
const warning = captured.find(line => line.includes('[browse] warning'));
expect(warning).toBeDefined();
expect(warning).toContain('exceeds');
expect(warning).toContain('2000px');
fs.unlinkSync(p);
});

test('screenshot --full-page + --clip throws', async () => {
await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm);
try {
await handleMetaCommand('screenshot', ['--full-page', '--clip', '0,0,100,100'], bm, async () => {});
expect(true).toBe(false);
} catch (err: any) {
expect(err.message).toContain('Cannot use --full-page with --clip');
}
});

test('screenshot --full-page + selector throws', async () => {
await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm);
try {
await handleMetaCommand('screenshot', ['--full-page', '#title'], bm, async () => {});
expect(true).toBe(false);
} catch (err: any) {
expect(err.message).toContain('Cannot use --full-page with a selector');
}
});

test('screenshot with CSS selector crops to element', async () => {
await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm);
const p = '/tmp/browse-test-element-css.png';
@@ -509,14 +593,12 @@ describe('Visual', () => {
}
});

test('screenshot --viewport + --clip throws', async () => {
test('screenshot --viewport + --clip is allowed (--viewport is no-op default)', async () => {
await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm);
try {
await handleMetaCommand('screenshot', ['--viewport', '--clip', '0,0,100,100'], bm, async () => {});
expect(true).toBe(false);
} catch (err: any) {
expect(err.message).toContain('Cannot use --viewport with --clip');
}
const p = '/tmp/browse-test-viewport-clip.png';
const result = await handleMetaCommand('screenshot', ['--viewport', '--clip', '0,0,100,100', p], bm, async () => {});
expect(result).toContain('Screenshot saved (clip 0,0,100,100)');
fs.unlinkSync(p);
});

test('screenshot --clip with invalid coords throws', async () => {
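
The `console.warn` swap used in the oversize-warning test could be factored into a reusable helper if more tests need it. The following is a minimal sketch; `captureWarnings` is a hypothetical name and is not part of this PR's test suite.

```typescript
// Hypothetical helper: run a callback while collecting console.warn output.
async function captureWarnings(fn: () => Promise<void> | void): Promise<string[]> {
  const captured: string[] = [];
  const original = console.warn;
  console.warn = (...args: unknown[]) => {
    captured.push(args.map(String).join(' '));
  };
  try {
    await fn();
  } finally {
    console.warn = original; // always restore, even if fn throws
  }
  return captured;
}

captureWarnings(() => {
  console.warn('[browse] warning: example');
}).then((lines) => console.log(lines));
```

Restoring the original function in `finally` keeps a failing capture from silencing warnings in later tests.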