Proof of Concept: Screenshot Workflow for openadapt-evals #5

@abrichr

Proof of Concept: Screenshot Workflow

Build a working end-to-end example that demonstrates the bootstrap concept using a real-world use case from openadapt-evals.

Use Case

Problem: openadapt-evals needs screenshots for the benchmark viewer README/PR, but generating them manually is tedious (15-20 minutes of clicking, resizing, capturing).

Solution: Record the screenshot workflow once, then replay it automatically whenever new screenshots are needed.

Goals

  1. Demonstrate the full bootstrap workflow end to end
  2. Solve a real problem (screenshot generation for openadapt-evals)
  3. Prove the concept works on a real-world task
  4. Create a reusable template for future workflows

Implementation Steps

1. Use Existing Playwright Implementation ✅

The PlaywrightScreenshotWorkflow is already implemented in workflows/screenshot_workflow.py.

Current capabilities:

  • Open HTML file in browser
  • Resize to multiple viewports (desktop, tablet, mobile)
  • Navigate to different UI states
  • Capture screenshots
  • Save to output directory
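
The capabilities above can be sketched roughly as follows. This is a minimal illustration using Playwright's Python sync API, not the actual PlaywrightScreenshotWorkflow code: the viewport sizes, the CSS selectors used to reach each UI state, the file naming scheme, and the helper names plan_captures and capture_all are all assumptions made for the example.

```python
from itertools import product
from pathlib import Path

# Hypothetical viewport sizes; the real workflow may use different dimensions.
VIEWPORTS = {
    "desktop": {"width": 1920, "height": 1080},
    "tablet": {"width": 768, "height": 1024},
    "mobile": {"width": 375, "height": 667},
}

# Hypothetical mapping from UI state to a selector clicked to reach it
# (None means the page is captured as loaded).
STATE_SELECTORS = {
    "overview": None,
    "task_detail": ".task-item",
    "log_expanded": ".log-toggle",
}


def plan_captures(output_dir, viewports=VIEWPORTS, states=STATE_SELECTORS):
    """Return one (viewport, state, output path) tuple per viewport/state pair."""
    out = Path(output_dir)
    return [(v, s, out / f"{s}_{v}.png") for v, s in product(viewports, states)]


def capture_all(html_path, output_dir):
    """Open the HTML file and save one screenshot per planned capture."""
    # Imported here so plan_captures works without Playwright installed.
    from playwright.sync_api import sync_playwright

    url = Path(html_path).resolve().as_uri()
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    with sync_playwright() as p:
        browser = p.chromium.launch()
        for viewport, state, path in plan_captures(output_dir):
            # A fresh page per capture keeps UI state from leaking between shots.
            page = browser.new_page(viewport=VIEWPORTS[viewport])
            page.goto(url)
            selector = STATE_SELECTORS[state]
            if selector is not None:
                page.click(selector)
            page.screenshot(path=str(path))
            page.close()
        browser.close()
```

Calling capture_all("viewer.html", "screenshots/") would write one PNG per viewport/state pair under these assumptions. Separating the plan from the browser calls makes the expected file list easy to check without launching a browser.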

2. Test with openadapt-evals Benchmark Viewer

# Generate benchmark viewer HTML (from openadapt-evals)
cd /Users/abrichr/oa/src/openadapt-evals
uv run python -m openadapt_evals.benchmarks.cli mock --tasks 10
uv run python -m openadapt_evals.benchmarks.cli view --run-name my_mock_eval

# Use bootstrap to generate screenshots
cd /Users/abrichr/oa/src/openadapt-bootstrap
uv run python examples/generate_benchmark_screenshots.py \
    --html-path ../openadapt-evals/benchmark_results/my_mock_eval/viewer.html \
    --output-dir ../openadapt-evals/screenshots/ \
    --use-playwright \
    --viewports desktop tablet mobile \
    --states overview task_detail log_expanded

3. Validate Generated Screenshots

Use the screenshot validation from openadapt-evals (see updated CLAUDE.md).

Expected output: 9 screenshots (3 viewports × 3 states)
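
As a rough illustration of what such a check could look like, the sketch below verifies that every expected file exists and starts with a valid PNG header. It is not the openadapt-evals validation itself: the `{state}_{viewport}.png` naming scheme and the function names are assumptions for this example.

```python
import struct
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(path):
    """Return (width, height) from a PNG's IHDR chunk, or None if not a PNG."""
    header = Path(path).read_bytes()[:24]
    if len(header) < 24 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        return None
    # Width and height are big-endian 32-bit integers right after the IHDR tag.
    return struct.unpack(">II", header[16:24])


def validate_screenshots(output_dir, viewports, states):
    """Return a list of problems; an empty list means every expected file looks like a PNG."""
    problems = []
    for viewport in viewports:
        for state in states:
            # Hypothetical naming scheme: <state>_<viewport>.png
            path = Path(output_dir) / f"{state}_{viewport}.png"
            if not path.is_file():
                problems.append(f"missing: {path.name}")
            elif png_size(path) is None:
                problems.append(f"not a valid PNG header: {path.name}")
    return problems
```

With three viewports and three states, an empty result means all 9 expected files are present. png_size can additionally be compared against the requested viewport width, keeping in mind that device scale factors or full-page captures may change the pixel dimensions.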

4. Success Criteria

✅ Screenshot workflow generates valid images
✅ Images match expected viewports and states
✅ Time savings demonstrated (85-90%)

Value Demonstration

Manual process: 15-20 minutes
Automated process: 2-3 minutes
Time savings: 85-90%
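
The savings figure follows from the two time ranges above. A small worked calculation, assuming savings are measured as 1 - automated / manual:

```python
def savings(manual_min, automated_min):
    """Fraction of time saved by running the automated workflow."""
    return 1 - automated_min / manual_min


manual = (15, 20)    # minutes, from the manual estimate above
automated = (2, 3)   # minutes, from the automated estimate above

best = savings(max(manual), min(automated))   # 1 - 2/20
worst = savings(min(manual), max(automated))  # 1 - 3/15
print(f"{worst:.0%} to {best:.0%}")
```

Measured against a 20-minute manual run, 2-3 automated minutes give the quoted 85-90%. Pairing the fastest manual run (15 minutes) with the slowest automated run (3 minutes) gives a lower bound of 80%.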

Estimated Effort

1-2 hours (testing, validation, documentation)
