 # CHANGELOG


+## v0.26.0 (2026-03-03)
+
+### Documentation
+
+- Add EC2 setup guide for WAA deployment
+([#90](https://github.com/OpenAdaptAI/openadapt-evals/pull/90),
+[`ca6a936`](https://github.com/OpenAdaptAI/openadapt-evals/commit/ca6a9362556852bd6ad040ba9ac7a5dfe3a7d880))
+
+Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
+
+### Features
+
+- Add TaskVerifierRegistry for custom task verification
+([#89](https://github.com/OpenAdaptAI/openadapt-evals/pull/89),
+[`639a6a2`](https://github.com/OpenAdaptAI/openadapt-evals/commit/639a6a2ba2a15e0c7a2a3bd65fa57a38f6966965))
+
+Add a registry pattern for custom task verifiers that can inspect VM state after task execution.
+This enables GoTo IT Autopilot (and other integrators) to register domain-specific verification
+functions without subclassing BenchmarkAdapter.
+
+- TaskVerifierRegistry with decorator and programmatic registration - VerificationResult dataclass
+with success/score/details - WAALiveAdapter.run_powershell() for executing PowerShell on the VM -
+Built-in clear_browsing_data reference verifier - 33 tests covering registry operations and
+built-in verifiers - Exports from evaluation package and main package __init__
+
+Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
+
+
 ## v0.25.1 (2026-03-03)

 ### Bug Fixes
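The v0.26.0 entry above describes a registry that accepts verifiers through both a decorator and a programmatic call, and returns a result with `success`, `score`, and `details`. The following is a minimal sketch of that pattern, not the code added in #89. Only the names `TaskVerifierRegistry` and `VerificationResult` and the three field names come from the changelog. The `register` and `verify` methods, the single-argument verifier signature, the dict-shaped `vm_state`, and the `history_cleared` verifier are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Optional


@dataclass
class VerificationResult:
    """Outcome of one verifier run; field names follow the changelog entry."""
    success: bool
    score: float = 0.0
    details: Dict[str, Any] = field(default_factory=dict)


# Assumption: a verifier takes one object describing VM state.
Verifier = Callable[[Any], VerificationResult]


class TaskVerifierRegistry:
    """Maps verifier names to callables (illustrative only)."""

    def __init__(self) -> None:
        self._verifiers: Dict[str, Verifier] = {}

    def register(self, name: str, fn: Optional[Verifier] = None):
        """Register directly when fn is given, or act as a decorator otherwise."""
        def _add(func: Verifier) -> Verifier:
            if name in self._verifiers:
                raise ValueError(f"verifier {name!r} already registered")
            self._verifiers[name] = func
            return func
        return _add(fn) if fn is not None else _add

    def verify(self, name: str, vm_state: Any) -> VerificationResult:
        """Run the named verifier; unknown names yield a failed result."""
        if name not in self._verifiers:
            return VerificationResult(False, 0.0, {"error": f"unknown verifier {name!r}"})
        return self._verifiers[name](vm_state)


registry = TaskVerifierRegistry()


# Decorator registration (hypothetical verifier over a dict-shaped state).
@registry.register("history_cleared")
def history_cleared(vm_state: Dict[str, Any]) -> VerificationResult:
    remaining = vm_state.get("history_entries", 0)
    ok = remaining == 0
    return VerificationResult(ok, 1.0 if ok else 0.0, {"remaining": remaining})


# Programmatic registration of a second, trivial verifier.
registry.register("always_pass", lambda vm_state: VerificationResult(True, 1.0))
```

Because verifiers are plain callables keyed by name, an integrator can add one without touching or subclassing the adapter, which is the motivation the entry gives.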
@@ -4,7 +4,7 @@ build-backend = "hatchling.build"

 [project]
 name = "openadapt-evals"
-version = "0.25.1"
+version = "0.26.0"
 description = "Evaluation infrastructure for GUI agent benchmarks"
 readme = "README.md"
 requires-python = ">=3.10"