 # CHANGELOG


+## v0.77.2 (2026-03-29)
+
+### Bug Fixes
+
+- Add numpy to dev dependencies for CI
+([#221](https://github.com/OpenAdaptAI/openadapt-evals/pull/221),
+[`6a0374a`](https://github.com/OpenAdaptAI/openadapt-evals/commit/6a0374a25c4ea4e0b3808ac9cc0780ee89a7f087))
+
+test_workflow_models.py and workflow pipeline import numpy directly. Was transitive via
+openadapt-ml, now needed as dev dep.
+
+Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
+
+### Testing
+
+- Demoexecutor e2e tests with mock WAA environment
+([#222](https://github.com/OpenAdaptAI/openadapt-evals/pull/222),
+[`8e6f685`](https://github.com/OpenAdaptAI/openadapt-evals/commit/8e6f685e94a61425df3e563623ece79690c7e44a))
+
+12 tests covering the full DemoExecutor pipeline: - Keyboard-only demo: 3 steps execute in order,
+all Tier 1 - Mixed demo: click uses grounder, keyboard bypasses it - Evaluation: dense score with
+milestones, binary without - Telemetry: start/completed events with tier counts - Edge cases:
+empty demo, missing values, unknown action types
+
+All tests use MockEnv (no WAA server, no HTTP, no API keys). 12/12 pass in 0.05s.
+
+Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
+
+
 ## v0.77.1 (2026-03-29)

 ### Bug Fixes
@@ -4,7 +4,7 @@ build-backend = "hatchling.build"

 [project]
 name = "openadapt-evals"
-version = "0.77.1"
+version = "0.77.2"
 description = "Evaluation infrastructure for GUI agent benchmarks"
 readme = "README.md"
 requires-python = ">=3.10"
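
The Testing entry in the changelog hunk describes keyboard steps running without a grounder while click steps go through one, all against an in-memory mock environment. The sketch below illustrates that idea only. Every name in it (`Step`, `MockEnv`, `fake_grounder`, `run_demo`) is hypothetical and is not taken from the openadapt-evals code; numbering the grounded path as tier 2 and raising `ValueError` on unknown action types are likewise assumptions, since the changelog only states that keyboard steps are Tier 1 and that unknown action types are covered as an edge case.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    """One demo step (hypothetical shape, not the project's real model)."""
    action: str       # "key", "type", or "click"
    value: str = ""   # key combo, text to type, or click target description


@dataclass
class MockEnv:
    """In-memory environment stand-in: records actions, no server or HTTP."""
    log: list = field(default_factory=list)

    def execute(self, action, payload):
        self.log.append((action, payload))


def fake_grounder(description):
    """Hypothetical grounder: maps any target description to fixed coordinates."""
    return (100, 200)


def run_demo(steps, env, grounder):
    """Replay steps in order and return the tier used for each step."""
    tiers = []
    for step in steps:
        if step.action in ("key", "type"):
            # Keyboard input needs no screen coordinates, so the grounder is skipped.
            env.execute(step.action, step.value)
            tiers.append(1)
        elif step.action == "click":
            # Clicks need coordinates; the tier number here is an assumption.
            env.execute("click", grounder(step.value))
            tiers.append(2)
        else:
            # Assumed behaviour; the real executor may handle this differently.
            raise ValueError(f"unknown action type: {step.action!r}")
    return tiers
```

With this shape, a keyboard-only demo of three steps yields three tier-1 entries and an ordered action log in `MockEnv.log`, and an empty demo yields an empty list, which is the kind of property a fast, dependency-free test can assert.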