Commit d4176b6

chore: release 0.26.0
Author: semantic-release
1 parent 639a6a2 commit d4176b6

2 files changed

Lines changed: 29 additions & 1 deletion

CHANGELOG.md

Lines changed: 28 additions & 0 deletions
@@ -1,6 +1,34 @@
 # CHANGELOG
 
 
+## v0.26.0 (2026-03-03)
+
+### Documentation
+
+- Add EC2 setup guide for WAA deployment
+([#90](https://github.com/OpenAdaptAI/openadapt-evals/pull/90),
+[`ca6a936`](https://github.com/OpenAdaptAI/openadapt-evals/commit/ca6a9362556852bd6ad040ba9ac7a5dfe3a7d880))
+
+Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
+
+### Features
+
+- Add TaskVerifierRegistry for custom task verification
+([#89](https://github.com/OpenAdaptAI/openadapt-evals/pull/89),
+[`639a6a2`](https://github.com/OpenAdaptAI/openadapt-evals/commit/639a6a2ba2a15e0c7a2a3bd65fa57a38f6966965))
+
+Add a registry pattern for custom task verifiers that can inspect VM state after task execution.
+This enables GoTo IT Autopilot (and other integrators) to register domain-specific verification
+functions without subclassing BenchmarkAdapter.
+
+- TaskVerifierRegistry with decorator and programmatic registration - VerificationResult dataclass
+with success/score/details - WAALiveAdapter.run_powershell() for executing PowerShell on the VM -
+Built-in clear_browsing_data reference verifier - 33 tests covering registry operations and
+built-in verifiers - Exports from evaluation package and main package __init__
+
+Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
+
+
 ## v0.25.1 (2026-03-03)
 
 ### Bug Fixes
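
The v0.26.0 entry above describes a registry with decorator and programmatic registration and a `VerificationResult` dataclass carrying success/score/details. The following is a minimal sketch of that pattern, not the actual openadapt-evals implementation: the method names `register` and `verify`, the verifier signature, and the dict standing in for VM state are all assumptions made for illustration. Only the class names, the three result fields, and the `clear_browsing_data` verifier name come from the changelog.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Optional


@dataclass
class VerificationResult:
    """Outcome of one verifier run (field names taken from the changelog)."""
    success: bool
    score: float = 0.0
    details: Dict[str, Any] = field(default_factory=dict)


# Hypothetical verifier signature: takes some handle to VM state.
Verifier = Callable[[Any], VerificationResult]


class TaskVerifierRegistry:
    """Illustrative registry; method names are assumptions, not the real API."""

    def __init__(self) -> None:
        self._verifiers: Dict[str, Verifier] = {}

    def register(self, name: str, fn: Optional[Verifier] = None):
        """Register programmatically (fn given) or as a decorator (fn omitted)."""
        if fn is not None:
            self._verifiers[name] = fn
            return fn

        def decorator(func: Verifier) -> Verifier:
            self._verifiers[name] = func
            return func

        return decorator

    def verify(self, name: str, vm_state: Any) -> VerificationResult:
        """Run the named verifier, or report failure if none is registered."""
        fn = self._verifiers.get(name)
        if fn is None:
            return VerificationResult(False, 0.0, {"error": f"no verifier for {name!r}"})
        return fn(vm_state)


registry = TaskVerifierRegistry()


@registry.register("clear_browsing_data")
def check_history_cleared(vm_state: dict) -> VerificationResult:
    # Hypothetical check: a plain dict stands in for inspected VM state.
    remaining = vm_state.get("history_entries", 0)
    ok = remaining == 0
    return VerificationResult(ok, 1.0 if ok else 0.0, {"history_entries": remaining})


# Programmatic registration of a second, trivial verifier.
registry.register("always_pass", lambda _state: VerificationResult(True, 1.0))
```

Under these assumptions, an integrator would call `registry.verify("clear_browsing_data", state)` after a task finishes and read `success`, `score`, and `details` from the result, with no adapter subclass involved.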

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ build-backend = "hatchling.build"
 
 [project]
 name = "openadapt-evals"
-version = "0.25.1"
+version = "0.26.0"
 description = "Evaluation infrastructure for GUI agent benchmarks"
 readme = "README.md"
 requires-python = ">=3.10"
