Proof of Concept: Screenshot Workflow
Build a working end-to-end example that demonstrates the bootstrap concept using a real-world use case from openadapt-evals.
Use Case
Problem: openadapt-evals needs screenshots for the benchmark viewer README/PR, but generating them manually is tedious (15-20 minutes of clicking, resizing, capturing).
Solution: Record the screenshot workflow once, then replay it automatically whenever screenshots are needed.
Goals
- Demonstrate full bootstrap workflow end-to-end
- Solve a real problem (screenshot generation for openadapt-evals)
- Prove the concept works with a real-world task
- Create reusable template for future workflows
Implementation Steps
1. Use Existing Playwright Implementation ✅
The PlaywrightScreenshotWorkflow is already implemented in workflows/screenshot_workflow.py.
Current capabilities:
- Open HTML file in browser
- Resize to multiple viewports (desktop, tablet, mobile)
- Navigate to different UI states
- Capture screenshots
- Save to output directory
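The capabilities above map onto a small number of Playwright calls. The following is a minimal sketch using the Playwright Python sync API, not the code in workflows/screenshot_workflow.py. The viewport sizes, the `{viewport}_{state}.png` naming, and the idea that each UI state is reachable through a URL fragment are all assumptions made for illustration.

```python
from pathlib import Path

# Hypothetical viewport sizes; the real workflow may use different dimensions.
VIEWPORTS = {
    "desktop": {"width": 1920, "height": 1080},
    "tablet": {"width": 768, "height": 1024},
    "mobile": {"width": 375, "height": 667},
}


def plan_shots(output_dir, viewports, states):
    """Return one (viewport, state, path) tuple per screenshot to capture."""
    out = Path(output_dir)
    return [(vp, st, out / f"{vp}_{st}.png") for vp in viewports for st in states]


def capture(html_path, output_dir, viewports, states):
    """Open a local HTML file and save one screenshot per viewport/state pair."""
    # Imported lazily so plan_shots works without Playwright installed.
    from playwright.sync_api import sync_playwright

    Path(output_dir).mkdir(parents=True, exist_ok=True)
    url = Path(html_path).resolve().as_uri()
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        for vp, state, path in plan_shots(output_dir, viewports, states):
            page.set_viewport_size(VIEWPORTS[vp])
            # Assumption: each UI state is reachable through a URL fragment.
            page.goto(f"{url}#{state}")
            page.screenshot(path=str(path), full_page=True)
        browser.close()
```

Keeping the planning step separate from the browser calls means the list of expected files can be computed, and checked, without launching a browser.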
2. Test with openadapt-evals Benchmark Viewer
# Generate benchmark viewer HTML (from openadapt-evals)
cd /Users/abrichr/oa/src/openadapt-evals
uv run python -m openadapt_evals.benchmarks.cli mock --tasks 10
uv run python -m openadapt_evals.benchmarks.cli view --run-name my_mock_eval
# Use bootstrap to generate screenshots
cd /Users/abrichr/oa/src/openadapt-bootstrap
uv run python examples/generate_benchmark_screenshots.py \
  --html-path ../openadapt-evals/benchmark_results/my_mock_eval/viewer.html \
  --output-dir ../openadapt-evals/screenshots/ \
  --use-playwright \
  --viewports desktop tablet mobile \
  --states overview task_detail log_expanded
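For reference, the flags in the command above could be parsed along these lines. This is a sketch of one possible argparse setup, not the contents of examples/generate_benchmark_screenshots.py; the help strings and default values are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser that accepts the flags used in the command above."""
    parser = argparse.ArgumentParser(description="Generate benchmark viewer screenshots.")
    parser.add_argument("--html-path", required=True, help="Path to the viewer HTML file.")
    parser.add_argument("--output-dir", required=True, help="Directory for the screenshots.")
    parser.add_argument("--use-playwright", action="store_true", help="Use the Playwright workflow.")
    # Hypothetical defaults: a single desktop overview shot.
    parser.add_argument("--viewports", nargs="+", default=["desktop"])
    parser.add_argument("--states", nargs="+", default=["overview"])
    return parser


# Usage example with the same arguments as the command above.
example_args = build_parser().parse_args([
    "--html-path", "../openadapt-evals/benchmark_results/my_mock_eval/viewer.html",
    "--output-dir", "../openadapt-evals/screenshots/",
    "--use-playwright",
    "--viewports", "desktop", "tablet", "mobile",
    "--states", "overview", "task_detail", "log_expanded",
])
```

With `nargs="+"`, each of `--viewports` and `--states` collects its space-separated values into a list, so the number of screenshots to expect is the product of the two list lengths.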
3. Validate Generated Screenshots
Use the screenshot validation from openadapt-evals (see updated CLAUDE.md).
Expected output: 9 screenshots (3 viewports × 3 states)
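A quick sanity check on the expected output can be done with the standard library alone. The sketch below is not the validation from openadapt-evals; it only confirms that each expected file exists and starts with a well-formed PNG header. The `{viewport}_{state}.png` naming is an assumption and should be adjusted to whatever the workflow actually writes.

```python
import struct
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(path):
    """Return (width, height) from a PNG's IHDR chunk, or None if not a PNG."""
    header = Path(path).read_bytes()[:24]
    if len(header) < 24 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        return None
    # Width and height are big-endian 32-bit integers right after the chunk type.
    return struct.unpack(">II", header[16:24])


def missing_or_invalid(output_dir, viewports, states):
    """List expected screenshot names that are absent or not valid PNG files."""
    problems = []
    for vp in viewports:
        for st in states:
            # Assumed naming scheme: <viewport>_<state>.png
            path = Path(output_dir) / f"{vp}_{st}.png"
            if not path.is_file() or png_size(path) is None:
                problems.append(path.name)
    return problems
```

An empty list from `missing_or_invalid` for three viewports and three states means all 9 files are present and look like PNGs; the returned widths can additionally be compared against the intended viewport widths.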
4. Success Criteria
✅ Screenshot workflow generates valid images
✅ Images match expected viewports and states
✅ Time savings demonstrated (85-90%)
Value Demonstration
Manual process: 15-20 minutes
Automated process: 2-3 minutes
Time savings: roughly 85-90% (80% in the worst case of 3 minutes automated versus 15 minutes manual)
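The savings figure follows directly from the two time ranges. As a worked check, the sketch below computes the percentage saved for each combination of the manual and automated endpoints; the function name is just for illustration.

```python
def savings_percent(manual_minutes, automated_minutes):
    """Percentage of time saved by automating a task."""
    return 100 * (1 - automated_minutes / manual_minutes)


# Every combination of the manual (15-20 min) and automated (2-3 min) endpoints.
for manual, automated in [(15, 2), (15, 3), (20, 2), (20, 3)]:
    saved = savings_percent(manual, automated)
    print(f"{manual} min -> {automated} min: {saved:.1f}% saved")
```

The four combinations come out to 86.7%, 80.0%, 90.0%, and 85.0%, so the stated range holds except when the slowest automated run is compared with the fastest manual one.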
Estimated Effort
1-2 hours (testing, validation, documentation)
Related Issues