Commit 1d29341

Author: semantic-release
chore: release 0.62.0
1 parent 8d4f595 commit 1d29341

2 files changed

Lines changed: 26 additions & 1 deletion

CHANGELOG.md

Lines changed: 25 additions & 0 deletions
@@ -1,6 +1,31 @@
 # CHANGELOG
 
 
+## v0.62.0 (2026-03-22)
+
+### Features
+
+- Improve TraceAnalyzer HTML report with embedded screenshots and failure analysis
+([#184](https://github.com/OpenAdaptAI/openadapt-evals/pull/184),
+[`8d4f595`](https://github.com/OpenAdaptAI/openadapt-evals/commit/8d4f59526f2b574f2db7893318d68cc33ca1873e))
+
+- Add score distribution bar chart with color-coded bars (green >0.75, yellow 0.25-0.75, red <0.25)
+for per-task score visualization - Add failure analysis section with grouped failure types and
+example episode IDs for each failure mode - Add interactive inline step viewer: click episode
+table rows to expand and see step-by-step screenshots without scrolling to a separate section -
+Embed screenshots as base64 data URIs so the HTML report is fully self-contained with no external
+file references - Add percentile statistics: median score, P25/P75 breakdowns, median time, and
+time distribution - Add side-by-side comparison stat cards (Run A vs Run B) with delta
+highlighting (green = improved, red = regressed) and unified task-level diff table showing score
+and step deltas - Add "Copy as Markdown" button that formats the summary as a Markdown table for
+pasting into Slack/GitHub - Improve table sorting with sort direction indicators and data-sort
+attributes for correct numeric sorting - Add hover effects on cards (lift + shadow), table rows,
+and screenshot thumbnails; add print-friendly styles - Add generation timestamp in report header -
+Add 14 new tests covering all new report features
+
+Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
+
+
 ## v0.61.0 (2026-03-22)
 
 ### Features
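
Note on the v0.62.0 entry: two of the listed behaviours, the color-coded score bars and the base64-embedded screenshots, can be illustrated with a minimal Python sketch. This is not the TraceAnalyzer code, which this commit does not show. The function names `score_color` and `png_data_uri` are hypothetical, the handling of the exact boundary values 0.25 and 0.75 follows the ranges quoted in the changelog, and PNG is an assumed image format.

```python
import base64


def score_color(score: float) -> str:
    """Map a task score to the bar color described in the changelog entry.

    Hypothetical helper: green above 0.75, red below 0.25, yellow otherwise
    (so the boundary values 0.25 and 0.75 themselves fall in the yellow band).
    """
    if score > 0.75:
        return "green"
    if score < 0.25:
        return "red"
    return "yellow"


def png_data_uri(png_bytes: bytes) -> str:
    """Encode image bytes as a base64 data URI for use in an <img src="...">.

    Assumes the screenshot is a PNG; the real report may use another format.
    """
    encoded = base64.b64encode(png_bytes).decode("ascii")
    return f"data:image/png;base64,{encoded}"
```

Inlining images this way keeps the HTML report in a single file, with the trade-off that base64 encoding makes each image roughly a third larger than the raw bytes.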

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ build-backend = "hatchling.build"
 
 [project]
 name = "openadapt-evals"
-version = "0.61.0"
+version = "0.62.0"
 description = "Evaluation infrastructure for GUI agent benchmarks"
 readme = "README.md"
 requires-python = ">=3.10"
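
Note on the version bump: this commit changes the `version` field in pyproject.toml and the top heading of CHANGELOG.md together. A rough way to check that the two stay in step is sketched below. The helper names are hypothetical and not part of the repository or of semantic-release. The sketch uses regular expressions rather than a TOML parser so that it runs on Python 3.10, the minimum version the project declares, and it assumes changelog headings of the form `## vX.Y.Z (date)` as seen in this diff.

```python
import re


def pyproject_version(pyproject_text):
    """Return the first `version = "..."` value in pyproject.toml text, or None."""
    match = re.search(r'^version\s*=\s*"([^"]+)"', pyproject_text, flags=re.MULTILINE)
    return match.group(1) if match else None


def latest_changelog_version(changelog_text):
    """Return the version from the first `## vX.Y.Z` heading, or None."""
    match = re.search(r"^## v(\S+)", changelog_text, flags=re.MULTILINE)
    return match.group(1) if match else None


def versions_agree(pyproject_text, changelog_text):
    """True when both files name the same, non-missing version."""
    version = pyproject_version(pyproject_text)
    return version is not None and version == latest_changelog_version(changelog_text)
```

For example, passing the post-commit contents of both files should return `True`, while pairing the new changelog with the old `version = "0.61.0"` line should return `False`.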
