Commit bd3acaf

Author: semantic-release
chore: release 0.81.8
1 parent 5a2bf7f commit bd3acaf

2 files changed: +20 -1 lines changed

CHANGELOG.md

Lines changed: 19 additions & 0 deletions
@@ -1,6 +1,25 @@
 # CHANGELOG
 
 
+## v0.81.8 (2026-03-29)
+
+### Bug Fixes
+
+- Disable Qwen3.5 thinking mode in TRL generation
+([#249](https://github.com/OpenAdaptAI/openadapt-evals/pull/249),
+[`5a2bf7f`](https://github.com/OpenAdaptAI/openadapt-evals/commit/5a2bf7f7d6dc262608fba994b02cfbce50eaa811))
+
+Root cause of persistent garbage output: Qwen3.5-9B's chat template inserts <think> which activates
+internal reasoning mode. The model produces opaque thinking tokens (# # # # #) instead of DSL
+actions.
+
+Fix: pass enable_thinking=False to apply_chat_template. Falls back to
+
+stripping <think> from rendered text if the kwarg is not supported.
+
+Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
+
+
 ## v0.81.7 (2026-03-29)
 
 ### Bug Fixes
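
The fix described in the v0.81.8 entry (pass `enable_thinking=False` to `apply_chat_template`, otherwise strip `<think>` from the rendered text) might look roughly like the sketch below. This is not the code from PR #249: the helper name `render_prompt_without_thinking`, the `TypeError`-based detection of an unsupported kwarg, and the assumption that the stray `<think>` tag sits at the very end of the rendered prompt are all illustrative guesses. Only `apply_chat_template` and the `enable_thinking` kwarg come from the changelog.

```python
import re
from typing import Any, Dict, List

# Assumption: when thinking mode is active, the template leaves an opening
# <think> tag (plus optional whitespace) at the end of the rendered prompt.
_TRAILING_THINK = re.compile(r"<think>\s*$")


def render_prompt_without_thinking(
    tokenizer: Any, messages: List[Dict[str, str]]
) -> str:
    """Render a chat prompt as text with thinking mode disabled.

    Hypothetical helper: `tokenizer` is any object exposing an
    `apply_chat_template` method in the style of Hugging Face tokenizers.
    """
    try:
        # Preferred path: ask the chat template not to open a <think> block.
        return tokenizer.apply_chat_template(
            messages,
            tokenize=False,
            add_generation_prompt=True,
            enable_thinking=False,
        )
    except TypeError:
        # Assumption: a tokenizer that does not accept the kwarg raises
        # TypeError. Render without it, then strip the trailing <think>.
        text = tokenizer.apply_chat_template(
            messages,
            tokenize=False,
            add_generation_prompt=True,
        )
        return _TRAILING_THINK.sub("", text)
```

Note that some tokenizer implementations forward unknown keyword arguments to the template and silently ignore them instead of raising, so a real implementation may need a different way to detect that the kwarg had no effect.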

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ build-backend = "hatchling.build"
 
 [project]
 name = "openadapt-evals"
-version = "0.81.7"
+version = "0.81.8"
 description = "Evaluation infrastructure for GUI agent benchmarks"
 readme = "README.md"
 requires-python = ">=3.10"
