Commit dd0dd45

Author: semantic-release
chore: release 0.81.7
1 parent 8e3bc45 commit dd0dd45

2 files changed, 22 insertions(+), 1 deletion(-)

CHANGELOG.md

Lines changed: 21 additions & 0 deletions
@@ -1,6 +1,27 @@
 # CHANGELOG
 
 
+## v0.81.7 (2026-03-29)
+
+### Bug Fixes
+
+- Comprehensive prompt diagnostics for debugging garbage output
+([#248](https://github.com/OpenAdaptAI/openadapt-evals/pull/248),
+[`8e3bc45`](https://github.com/OpenAdaptAI/openadapt-evals/commit/8e3bc45097231e35060c30e0064753c4dea527d1))
+
+Adds detailed one-time logging to help debug the persistent garbage output issue:
+
+1. Raw messages (role, content types, text preview) before chat template 2. Full rendered text_input
+(2000 chars, not 300) 3. Image metadata (mode, size, format) 4. Generation config (max_new_tokens,
+temperature, constrained, model type) 5. First generation output (500 chars + token count) 6.
+Input tensor shapes (input_ids, attention_mask, pixel_values, image_grid_thw)
+
+The tensor shape logging is critical: if pixel_values is MISSING, the model isn't seeing the
+screenshot — which would explain degenerate output regardless of prompt correctness.
+
+Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
+
+
 ## v0.81.6 (2026-03-29)
 
 ### Bug Fixes
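
The changelog entry above describes one-time logging of input tensor shapes, with a missing pixel_values flagged as the key signal. The actual logging code is not part of this commit's diff, so the following is only a minimal sketch of that idea. The four key names come from the changelog; the function names, the logger name, and the module-level flag are hypothetical and not taken from openadapt-evals.

```python
import logging
from typing import Any, Mapping

# Hypothetical logger name, chosen for this sketch only.
logger = logging.getLogger("prompt_diagnostics")

# Tensor names listed in the changelog entry.
EXPECTED_KEYS = ("input_ids", "attention_mask", "pixel_values", "image_grid_thw")

# Module-level flag so the diagnostics are emitted only on the first call.
_already_logged = False


def describe_inputs(inputs: Mapping[str, Any]) -> dict[str, str]:
    """Map each expected key to its shape as text, or "MISSING" if absent."""
    report: dict[str, str] = {}
    for key in EXPECTED_KEYS:
        value = inputs.get(key)
        if value is None:
            report[key] = "MISSING"
            continue
        shape = getattr(value, "shape", None)
        # Fall back to the type name for values that carry no shape.
        report[key] = str(tuple(shape)) if shape is not None else type(value).__name__
    return report


def log_inputs_once(inputs: Mapping[str, Any]) -> bool:
    """Log the shape report on the first call only; return True if it logged."""
    global _already_logged
    if _already_logged:
        return False
    _already_logged = True

    report = describe_inputs(inputs)
    for key, shape in report.items():
        logger.info("input %s: %s", key, shape)
    if report["pixel_values"] == "MISSING":
        # Without pixel_values the model never receives the screenshot.
        logger.warning("pixel_values is MISSING: the model is not seeing the screenshot")
    return True
```

Under these assumptions, `log_inputs_once` would be called with the processor output just before generation. Any object exposing a `.shape` attribute works, so the sketch does not import a tensor library.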

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ build-backend = "hatchling.build"
 
 [project]
 name = "openadapt-evals"
-version = "0.81.6"
+version = "0.81.7"
 description = "Evaluation infrastructure for GUI agent benchmarks"
 readme = "README.md"
 requires-python = ">=3.10"
