Commit c192c0e

chore: release 0.72.2
Author: semantic-release
1 parent d011b2f commit c192c0e

2 files changed

Lines changed: 25 additions & 1 deletion

CHANGELOG.md

Lines changed: 24 additions & 0 deletions
@@ -1,6 +1,30 @@
 # CHANGELOG


+## v0.72.2 (2026-03-28)
+
+### Bug Fixes
+
+- Constrained decoding preserves chain-of-thought reasoning
+([#200](https://github.com/OpenAdaptAI/openadapt-evals/pull/200),
+[`d011b2f`](https://github.com/OpenAdaptAI/openadapt-evals/commit/d011b2fcab1a0489ef1e4d3b7d7bc362a6cd64d5))
+
+The Thought/Action format from SYSTEM_PROMPT is now enforced by the constrained decoding regex:
+
+Thought: <up to 500 chars of reasoning>
+
+Action: CLICK(x=0.50, y=0.30)
+
+This gives the model a reasoning budget while guaranteeing parseable output. Prior regex had no
+prefix (model couldn't reason) or used (.|\n)* (Outlines couldn't compile the DFA).
+
+Also exposes _ACTION_RE (action-only regex) for use by the parser.
+
+Tests updated: 30 pass (was 21).
+
+Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
+
+
 ## v0.72.1 (2026-03-28)

 ### Bug Fixes
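
The v0.72.2 entry above describes a bounded `Thought:` prefix followed by a strictly formatted `Action:` line. The actual patterns are not shown in this commit, so the following is only a minimal sketch of that idea using Python's standard `re` module. The names `ACTION_RE`, `THOUGHT_ACTION_RE`, and `parse_output` are hypothetical, as are the coordinate pattern and the single newline between the two lines. It does not call Outlines, and the repository's real `_ACTION_RE` may differ.

```python
import re

# Hypothetical action-only pattern (the project's real _ACTION_RE may differ).
# Assumes coordinates are written as one digit, a dot, and two decimals.
ACTION_RE = r"CLICK\(x=(?P<x>[01]\.\d{2}), y=(?P<y>[01]\.\d{2})\)"

# Bounded reasoning prefix: 1 to 500 non-newline characters.
# A bounded character class stands in for the unbounded (.|\n)* that the
# changelog says could not be compiled.
THOUGHT_ACTION_RE = re.compile(
    r"Thought: (?P<thought>[^\n]{1,500})\nAction: " + ACTION_RE
)


def parse_output(text):
    """Return (thought, x, y) if text matches the assumed format, else None."""
    match = THOUGHT_ACTION_RE.fullmatch(text)
    if match is None:
        return None
    return match.group("thought"), float(match.group("x")), float(match.group("y"))


if __name__ == "__main__":
    sample = "Thought: the button is near the top\nAction: CLICK(x=0.50, y=0.30)"
    print(parse_output(sample))
```

Keeping the action pattern as a separate string mirrors the changelog's note that an action-only regex is exposed for the parser: the same fragment can validate a bare action or be embedded in the full Thought/Action pattern.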

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ build-backend = "hatchling.build"

 [project]
 name = "openadapt-evals"
-version = "0.72.1"
+version = "0.72.2"
 description = "Evaluation infrastructure for GUI agent benchmarks"
 readme = "README.md"
 requires-python = ">=3.10"
