quarto-dev · cderv · Mar 27, 2026 · Mar 27, 2026
diff --git a/.claude/skills/screenshot/SKILL.md b/.claude/skills/screenshot/SKILL.md
@@ -0,0 +1,143 @@
+---
+name: capturing-screenshots
+description: Capture or update documentation screenshots for the Quarto website using Playwright. Use when screenshots need refreshing, new screenshots are needed for docs pages, or the user mentions screenshots, screen captures, or visual documentation.
+allowed-tools: Bash(node _tools/screenshots/*), Bash(npm.cmd *), Bash(cat *), Bash(playwright-cli *), Bash(oxipng *), Agent
+---
+
+## Setup
+
+When this skill loads, run these commands to gather context:
+
+1. **List registered screenshots:** `bash "${CLAUDE_SKILL_DIR}/scripts/list-screenshots.sh"`
+2. **Read visual rules:** `cat _tools/screenshots/CLAUDE.md`
+3. **Read capture agent reference:** `cat "${CLAUDE_SKILL_DIR}/capture-agent.md"`
+4. **Read manifest schema:** `cat "${CLAUDE_SKILL_DIR}/manifest-schema.md"`
+
+**Working directory:** `npm run` commands (`render`, `capture`, `compress`) work from
+any directory — they resolve paths from `_tools/screenshots/package.json`. Direct
+`node scripts/...` calls and `playwright-cli` must run from `_tools/screenshots/` or
+use absolute paths. Be careful not to double up path segments if you've already `cd`'d
+into `_tools/screenshots/`.
+
+## Instructions
+
+You are the screenshot orchestrator. The list output shows all registered screenshots, the visual rules define quality standards, and the capture agent reference describes how browser operations work.
+
+### If the user wants to UPDATE existing screenshots:
+
+1. Ask which screenshots to update (or "all")
+2. Process screenshots **one at a time** — never batch-capture without confirmation:
+   a. Render: `node _tools/screenshots/scripts/render.js <project-path>` (can batch-render all profiles upfront)
+   b. Capture: `npm run capture -- --name <name>` (handles serve, capture, dark variant, compress)
+   c. Show the user the output image(s) using the Read tool
+   d. **STOP and wait for explicit confirmation** before proceeding to the next screenshot
+   e. If the user requests adjustments, update manifest and re-capture
+   f. Only after confirmation, move to the next screenshot
+3. Show results summary
+
+**Critical:** Each screenshot requires user visual review and explicit approval. Do not proceed to the next screenshot until the user confirms the current one is acceptable. This applies to both new captures and re-captures of existing screenshots.
+
+### If the user wants to CREATE a new screenshot:
+
+Gather these parameters (ask about unknowns, infer from context when obvious):
+
+| Parameter | Values / Notes |
+|-----------|---------------|
+| Source type | `url` (live site), `example` (Quarto project — render then serve) |
+| Source detail | URL or example project path (create minimal project if needed) |
+| Viewport | navbar=1440x400, sidebar=992x600, about=1200x900, full page=1440x900 |
+| Zoom | Default 1.0; use 1.15 for about pages or excess internal padding |
+| Element | CSS selector if capturing a specific element; omit for full viewport |
+| Interactions | Clicks, hovers, etc. needed before capture |
+| Trim / Crop | `trim: true` for uniform background edges; `cropBottom`/`maxHeight` when vertical rules prevent trim |
+| Output path | Suggest based on doc location |
+| Doc file | Which .qmd references this image (for manifest `doc.file`) |
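+
+As a rough illustration, the viewport presets in the table above could be kept as a small lookup that produces the matching `resize` command. The names `VIEWPORTS` and `resizeCommand` are hypothetical and not part of the screenshot tooling; only the sizes and the `resize` subcommand come from this document.
+
+```javascript
+// Hypothetical lookup of the viewport presets listed in the parameter table.
+const VIEWPORTS = {
+  navbar: { width: 1440, height: 400 },
+  sidebar: { width: 992, height: 600 },
+  about: { width: 1200, height: 900 },
+  "full page": { width: 1440, height: 900 },
+};
+
+// Build the playwright-cli resize command string for a category.
+function resizeCommand(category) {
+  const vp = VIEWPORTS[category];
+  if (!vp) throw new Error(`Unknown viewport category: ${category}`);
+  return `playwright-cli -s=screenshot resize ${vp.width} ${vp.height}`;
+}
+```
+
+For example, `resizeCommand("sidebar")` returns `playwright-cli -s=screenshot resize 992 600`.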
+
+Then work through two phases:
+
+#### Phase A: Visual design (what to capture)
+
+Use playwright-cli to explore the page interactively and nail down the visual.
+Phase A ends when the user approves the screenshot visual.
+
+1. Create example project if needed
+2. Render: `node _tools/screenshots/scripts/render.js <project-path>` (add `--profile <name>` if needed)
+3. Serve the **rendered output directory**: `node _tools/screenshots/scripts/serve.js <output-dir>`
+   The serve script takes a directory path — it does not understand `--profile`.
+   For default renders, the output is `_site/` inside the project. For profiled renders,
+   it's `docs-<profile>/` (e.g., `examples/navbar-basic/docs-reader-mode`). Check the
+   render output to confirm the actual path.
+4. Open in headed mode: `playwright-cli -s=screenshot open --headed <url>`
+   (headed mode shows the browser window so you can see the page)
+5. Discover what to capture:
+   a. Take a snapshot (`playwright-cli -s=screenshot snapshot`) to see page structure
+   b. If replacing an existing screenshot, download and read the current image to
+      understand what it looks like (e.g., `curl -sL -o "$TMPDIR/existing.png" <url>`
+      then Read tool). Note what's included, cropped, and framed — the new
+      screenshot should match unless the doc content has changed.
+   c. Read the .qmd doc file to understand what the image should illustrate — check
+      the YAML example above the image, the fig-alt text, and surrounding prose
+   d. Determine initial viewport from the category table (navbar=1440x400,
+      sidebar=992x600, about=1200x900, full page=1440x900)
+6. Test and iterate in headed mode:
+   a. Resize: `playwright-cli -s=screenshot resize <w> <h>`
+   b. Test cleanup evals if needed (hiding elements, removing banners)
+   c. Test interactions (click/hover) — take snapshot, find ref, click, verify state
+   d. Take a test screenshot:
+      `playwright-cli -s=screenshot screenshot --filename="$TMPDIR/test.png"`
+   e. Show the screenshot to the user: `npm run open -- "$TMPDIR/test.png"`
+      (cross-platform; do NOT use `open` or `start` directly)
+   f. Provide review context so the user can judge the screenshot:
+      - Which .qmd file and section (line number, heading)
+      - The fig-alt text (what the image is supposed to show)
+      - The code example shown alongside it in the doc (if any)
+      - A link to the live doc page if available (e.g., quarto.org URL)
+      - What to specifically check (does navbar match the YAML? Are the
+        right items visible? etc.)
+   g. Ask: "Does this capture what the doc needs? Anything to adjust?"
+   h. Repeat until the user approves the visual
+7. Encode findings into manifest:
+   a. Read manifest-schema.md for the complete field reference
+   b. Create the manifest entry based on what was validated interactively
+   c. Every field value should come from tested exploration, not guesswork
+
+Use `playwright-cli --help` to discover available commands.
+See capture-agent.md for `eval` vs `run-code` guidance — use `run-code` for complex JS.
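+
+The output-directory convention from step 3 can be sketched as a small helper. This is an assumption-laden illustration: `outputDir` is a hypothetical name, and it only mirrors the `_site/` and `docs-<profile>/` convention described above, so the render output remains the source of truth.
+
+```javascript
+const path = require("path");
+
+// Guess the rendered output directory for a project.
+// Assumption: default renders go to _site/, profiled renders to docs-<profile>/.
+function outputDir(projectPath, profile) {
+  const dirName = profile ? `docs-${profile}` : "_site";
+  return path.posix.join(projectPath, dirName);
+}
+```
+
+With the example from step 3, `outputDir("examples/navbar-basic", "reader-mode")` gives `examples/navbar-basic/docs-reader-mode`.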
+
+#### When stuck: Chrome DevTools MCP (only if available)
+
+If playwright-cli's shell escaping fights you on complex JS (template literals,
+nested quotes, `getComputedStyle`), Chrome DevTools MCP can help — but ONLY if
+it's available in the current session, and ALWAYS ask the user before switching.
+
+- `evaluate_script` — proper JS function, no shell escaping layer
+- `take_screenshot` — inline visual feedback in conversation
+- Best for: iterative CSS/DOM debugging (e.g., spotlight stacking contexts)
+- Trade-off: more verbose output per call = higher token usage
+
+Never switch to Chrome DevTools MCP proactively. Suggest it as an option and
+let the user decide.
+
+#### Phase B: Image processing (how to post-process)
+
+Phase B starts after the user approves the visual in Phase A and a manifest entry
+exists. Now run the automated capture pipeline and tune post-processing.
+
+1. Add the manifest entry to `_tools/screenshots/manifest.json`
+2. Run `npm run validate` to check the manifest entry
+3. Run `npm run capture -- --name <name>` to produce the screenshot
+4. Show the user the output and ask them to verify visually
+5. If blank space remains, decide with the user:
+   - **Uniform background edges?** → add `"trim": true`
+   - **Vertical rules or multi-color edges?** → add `"cropBottom": N` or `"maxHeight": N`
+   - **Both?** → trim runs first, then crop
+6. Re-capture and verify until the user is satisfied
+
+### Launching the capture agent:
+
+Use the Agent tool with `subagent_type="general-purpose"` and `model="sonnet"`. Pass:
+- The base URL where the site is being served
+- The capture agent reference (from `${CLAUDE_SKILL_DIR}/capture-agent.md`)
+- Specific screenshot details: viewport, cleanup, interactions, element, output path
+- Note: zoom and post-processing (trim, crop) are handled by capture.js, not the agent. If the agent captures manually, it should apply zoom via `page.evaluate(z => document.body.style.zoom = z, String(zoom))`
+- Instruct it to follow the capture workflow and use `-s=screenshot` session flag
diff --git a/.claude/skills/screenshot/capture-agent.md b/.claude/skills/screenshot/capture-agent.md
@@ -0,0 +1,178 @@
+# Screenshot Capture Agent
+
+## Contents
+- Capture Workflow: navigate, resize, zoom, wait, cleanup, interactions, screenshot
+- Dark mode variants
+- Visual validation
+- Rules
+
+You capture documentation screenshots using playwright-cli. You receive:
+- A base URL where the page is already being served
+- One or more screenshot specifications from the manifest
+
+Always use the session flag: `-s=screenshot` for all playwright-cli commands.
+
+## eval vs run-code
+
+playwright-cli has two ways to execute JavaScript:
+
+**`eval`** — evaluates a string expression. Good for simple one-liners:
+```bash
+playwright-cli -s=screenshot eval "document.body.style.zoom = '1.25'"
+playwright-cli -s=screenshot eval "document.getElementById('header').style.display = 'none'"
+```
+
+**`run-code`** — executes an async function with access to the Playwright `page` object.
+Use for anything complex: multi-line logic, Playwright API calls, template literals,
+or any JS that would need shell escaping in `eval`:
+```bash
+playwright-cli -s=screenshot run-code "async page => {
+  await page.waitForTimeout(200);
+}"
+playwright-cli -s=screenshot run-code "async page => {
+  const bg = await page.evaluate(() => getComputedStyle(document.body).backgroundColor);
+  console.log(bg);
+}"
+```
+
+**Rule of thumb:** If your JS has nested quotes, template literals, `getComputedStyle`, or
+is more than one statement, use `run-code`. Shell escaping in `eval` breaks easily
+with complex expressions.
+
+## Debugging with Chrome DevTools MCP (only if available)
+
+If stuck on CSS/DOM issues and playwright-cli's shell escaping is making complex
+JS evaluation difficult, Chrome DevTools MCP can provide a faster feedback loop.
+**Only suggest this if it's available in the current session, and always ask the
+user before switching.**
+
+- `evaluate_script` — proper JS function, no shell escaping
+- `take_screenshot` — inline visual feedback in conversation
+- Best for: iterative CSS debugging (e.g., spotlight stacking contexts)
+- Trade-off: more verbose output per call (higher token usage)
+
+## Capture Workflow
+
+For each screenshot:
+
+### 1. Navigate and resize
+
+```bash
+playwright-cli -s=screenshot open <url>/<page>
+playwright-cli -s=screenshot resize <width> <height>
+```
+
+### 2. Apply zoom (if specified)
+
+If the manifest entry has `capture.zoom`, apply it immediately after resizing and before any later steps:
+```bash
+playwright-cli -s=screenshot eval "document.body.style.zoom = '1.15'"
+playwright-cli -s=screenshot eval "new Promise(r => setTimeout(r, 200))"
+```
+
+Note: `capture.js` handles zoom automatically. This step is only needed for manual/interactive captures.
+
+### 3. Wait for full load
+
+```bash
+playwright-cli -s=screenshot snapshot
+```
+
+Check the snapshot for:
+- Bootstrap Icons rendered (not blank boxes or missing glyphs)
+- Fonts loaded (text not in fallback font)
+- Content fully rendered (not loading spinners)
+
+If not ready, wait and re-snapshot:
+```bash
+playwright-cli -s=screenshot eval "new Promise(r => setTimeout(r, 2000))"
+playwright-cli -s=screenshot snapshot
+```
+
+### 4. Run cleanup steps
+
+Read cleanup steps from `manifest.json` `defaults.cleanup` array. For each step with `"action": "eval"`, run:
+```bash
+playwright-cli -s=screenshot eval "<script from manifest>"
+```
+
+Then run any additional cleanup from the per-screenshot `capture.cleanup` array.
+
+### 5. Run interaction steps
+
+For clicks (e.g., opening a dropdown):
+1. Take a snapshot to get refs
+2. Find the ref matching the target (e.g., the GitHub icon)
+3. Click it: `playwright-cli -s=screenshot click <ref>`
+4. **Verify the interaction worked**: snapshot again and check for the expected state (e.g., `[expanded]` attribute, `.dropdown-menu.show`)
+5. If a dropdown toggled closed instead of opening, click again (click toggles)
+
+**Stateful toggle caution:** Some toggles (reader mode, sidebar collapse) persist state
+in localStorage. When `dark: true`, interactions run twice (light + dark). On the dark
+pass the page reloads but localStorage persists, so the toggle may already be active —
+clicking it again deactivates it. In the manifest, use an `eval` with a guard condition
+instead of a plain `click` for these. See `manifest-schema.md` for the pattern.
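+
+The idea behind a guarded toggle can be sketched as follows. This is not the pattern from `manifest-schema.md` (which is not reproduced here); `ensureActive`, `isActive`, and `toggle` are hypothetical names standing in for a DOM state check and a click.
+
+```javascript
+// Activate a stateful toggle only when it is not already active.
+// `isActive` and `toggle` are hypothetical callbacks (a DOM check and a click).
+function ensureActive(isActive, toggle) {
+  if (isActive()) return false; // already on, e.g. state restored from localStorage
+  toggle();
+  return true;
+}
+```
+
+Calling it on both the light and dark passes is then safe: the second call sees the persisted state and leaves the toggle alone instead of deactivating it.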
+
+For element screenshots where content overflows (like dropdowns):
+1. Use `eval` to get bounding boxes of the element + overflow content
+2. Calculate a clip region that contains everything
+3. Use `run-code` with Playwright's clip option:
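+```bash
+playwright-cli -s=screenshot run-code "async page => {
+  // Placeholder clip: substitute the region computed in steps 1-2 (values are illustrative)
+  const clip = { x: 0, y: 0, width: 800, height: 400 };
+  await page.screenshot({ path: 'out.png', clip });
+}"
+```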
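+
+The clip calculation in step 2 amounts to taking the union of the bounding boxes. A minimal sketch, assuming each box has the `{ x, y, width, height }` shape returned by `getBoundingClientRect()`; `unionClip` is a hypothetical helper, not part of `capture.js`.
+
+```javascript
+// Compute the smallest rectangle that contains every box.
+// Each box is { x, y, width, height }.
+function unionClip(boxes) {
+  if (boxes.length === 0) throw new Error("unionClip needs at least one box");
+  const left = Math.min(...boxes.map((b) => b.x));
+  const top = Math.min(...boxes.map((b) => b.y));
+  const right = Math.max(...boxes.map((b) => b.x + b.width));
+  const bottom = Math.max(...boxes.map((b) => b.y + b.height));
+  return { x: left, y: top, width: right - left, height: bottom - top };
+}
+```
+
+For a navbar at `{ x: 0, y: 0, width: 1440, height: 60 }` and a dropdown at `{ x: 1200, y: 60, width: 200, height: 180 }`, the result is `{ x: 0, y: 0, width: 1440, height: 240 }`.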
+
+### 6. Take screenshot
+
+**Element screenshot** (when capture.element is specified):
+1. Snapshot to find the ref for the element
+2. `playwright-cli -s=screenshot screenshot <ref> --filename=<output-path>`
+
+**Full page/viewport screenshot:**
+```bash
+playwright-cli -s=screenshot screenshot --filename=<output-path>
+```
+
+### 6b. Post-capture processing (automatic)
+
+`capture.js` handles these automatically — the agent does not need to replicate them:
+- **trim** (`capture.trim`) — content-aware whitespace removal via sharp
+- **cropBottom** (`capture.cropBottom`) — removes N pixels from bottom edge
+- **maxHeight** (`capture.maxHeight`) — caps image height, crops from bottom
+- **compress** — oxipng compression
+
+These run in order: trim → crop → compress.
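+
+The crop arithmetic can be sketched as below. This models only the height after trimming; it does not call sharp or oxipng. It assumes `cropBottom` is applied before `maxHeight`, which this document does not specify, and `croppedHeight` is a hypothetical name.
+
+```javascript
+// Sketch of the crop arithmetic only; trimming and compression are not modelled.
+// Assumption: cropBottom is applied before maxHeight.
+function croppedHeight(height, { cropBottom = 0, maxHeight = Infinity } = {}) {
+  const afterCrop = Math.max(0, height - cropBottom); // remove N pixels from the bottom edge
+  return Math.min(afterCrop, maxHeight); // cap the height, cropping from the bottom
+}
+```
+
+For a 900px-tall trimmed image, `croppedHeight(900, { cropBottom: 100 })` gives 800 and `croppedHeight(900, { maxHeight: 600 })` gives 600.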
+
+### 7. Dark mode variant
+
+If the screenshot has `"dark": true` in manifest:
+1. Toggle via JS: `playwright-cli -s=screenshot run-code "async page => { await page.evaluate(() => window.quartoToggleColorScheme()); await page.locator('body.quarto-dark').waitFor(); }"`
+2. Re-run any interactions (e.g., dropdown may have closed during toggle)
+3. Take the screenshot again with `-dark` suffix on the filename
+4. Toggle back: `playwright-cli -s=screenshot run-code "async page => { await page.evaluate(() => window.quartoToggleColorScheme()); await page.locator('body.quarto-light').waitFor(); }"`
+
+Note: `capture.js` handles this automatically. This step is only needed for manual/interactive captures.
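+
+For manual captures, the dark filename from step 3 can be derived from the light one. A minimal sketch, assuming the `-dark` suffix goes before the file extension; `darkVariantPath` is a hypothetical helper and the example path is illustrative.
+
+```javascript
+const path = require("path");
+
+// Insert "-dark" before the file extension: images/navbar.png -> images/navbar-dark.png
+function darkVariantPath(file) {
+  const { dir, name, ext } = path.posix.parse(file);
+  return path.posix.join(dir, `${name}-dark${ext}`);
+}
+```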
+
+### 8. Visual validation
+
+After capturing:
+- Verify that the screenshot file was created and is non-empty
+- Report what you captured (element, viewport size, interactions performed)
+
+### 9. Close when done
+
+```bash
+playwright-cli -s=screenshot close
+```
+
+## Rules
+
+- ALWAYS use `-s=screenshot` session flag (avoids collisions with other sessions)
+- ALWAYS capture light mode first (default), then dark if needed
+- ALWAYS wait for fonts and icons to load before capturing
+- ALWAYS remove prerelease/preview banners
+- Use consistent viewport sizes from the manifest
+- If something looks wrong (missing icons, broken layout), report it — don't save a bad screenshot
+- If an interaction fails (ref not found, dropdown didn't open), report the error with the snapshot content
+- Use relative paths from repo root for --filename output (L12: path resolution)