 # CHANGELOG


+## v0.72.1 (2026-03-28)
+
+### Bug Fixes
+
+- Constrained decoding cache bug, task rotation, add trainer tests
+([#199](https://github.com/OpenAdaptAI/openadapt-evals/pull/199),
+[`4ec7d51`](https://github.com/OpenAdaptAI/openadapt-evals/commit/4ec7d5174e1c8420d436af5cc6810b211a85de61))
+
+Constrained decoding: - Remove (.|\n)* prefix from action regex — Outlines can't compile it into a
+DFA efficiently. Model must output action directly. - Fix cache sentinel: use False for failure
+(not []) so subsequent calls correctly return None instead of empty logits_processor list. Prior
+bug: [] cached as "success" → model generated unconstrained. - Upgrade warning to error level for
+visibility.
+
+Task rotation: - Fix _load_task_configs: check `not task_ids` once BEFORE the loop (was checking
+inside loop — only first task ever appended).
+
+Tests (21 new): - TestActionRegex: 8 valid actions match, 6 invalid texts rejected -
+TestConstrainedDecodingCache: sentinel logic, regression for [] bug - TestTaskRotation: all tasks
+loaded, explicit ids preserved, rotation
+
+Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
+
+
 ## v0.72.0 (2026-03-28)

 ### Features
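
Note on the cache sentinel fix in the v0.72.1 entry: the sketch below is one way to picture the described behavior, not the project's actual code. `ProcessorCache`, `build`, and `build_calls` are hypothetical names, and `build` is a plain callable standing in for whatever compiles the action regex. The point it illustrates is that a failed build is cached as `False`, which is distinguishable from both "not attempted yet" (`None`) and a successful, non-empty processor list.

```python
from typing import Callable, Optional


class ProcessorCache:
    """Hypothetical cache around a constrained-decoding processor builder."""

    def __init__(self, build: Callable[[], object]) -> None:
        self._build = build
        # None  -> build not attempted yet
        # False -> build attempted and failed (the failure sentinel)
        # list  -> build succeeded
        self._cached: object = None
        self.build_calls = 0

    def get(self) -> Optional[list]:
        """Return the cached processor list, or None if the build failed."""
        if self._cached is None:
            self.build_calls += 1
            try:
                self._cached = [self._build()]
            except Exception:
                # Caching [] here would look like a successful build with
                # no processors, which is the bug the changelog describes.
                self._cached = False
        if self._cached is False:
            return None
        return self._cached
```

With this shape, a caller can treat `None` as "constrained decoding is unavailable" and fail loudly, instead of silently passing an empty processor list to generation. The failed build is also attempted only once.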
@@ -4,7 +4,7 @@ build-backend = "hatchling.build"

 [project]
 name = "openadapt-evals"
-version = "0.72.0"
+version = "0.72.1"
 description = "Evaluation infrastructure for GUI agent benchmarks"
 readme = "README.md"
 requires-python = ">=3.10"
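
Note on the task rotation fix in the v0.72.1 changelog entry: the entry says the `not task_ids` check was moved out of the loop because only the first task was ever appended. The sketch below is a guess at the general shape of that bug and its fix, assuming the loop appends to the same list it tests. `select_buggy`, `select_fixed`, and `available` are hypothetical names, not the signature of `_load_task_configs`.

```python
from typing import List


def select_buggy(available: List[str], task_ids: List[str]) -> List[str]:
    """Assumed shape of the bug: the emptiness check runs on every iteration."""
    task_ids = list(task_ids)
    for tid in available:
        # After the first append, task_ids is no longer empty,
        # so every later task is skipped.
        if not task_ids:
            task_ids.append(tid)
    return task_ids


def select_fixed(available: List[str], task_ids: List[str]) -> List[str]:
    """Assumed shape of the fix: decide once, before the loop."""
    task_ids = list(task_ids)
    load_all = not task_ids  # evaluated a single time
    for tid in available:
        if load_all:
            task_ids.append(tid)
    return task_ids
```

Under these assumptions, an empty `task_ids` loads every available task in the fixed version, while an explicit list of ids is returned unchanged, which matches the "all tasks loaded, explicit ids preserved" cases named in the test summary.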