 # CHANGELOG


+## v0.81.3 (2026-03-29)
+
+### Bug Fixes
+
+- Try local eval before slow /evaluate endpoint in evaluate_dense
+  ([#245](https://github.com/OpenAdaptAI/openadapt-evals/pull/245),
+  [`3b8c1c2`](https://github.com/OpenAdaptAI/openadapt-evals/commit/3b8c1c2b6317a693fec2e97cf8aa459205f1be4d))
+
+51% of TRL training time wasted on 5050 evaluate timeouts (180s × 3 retries = 9 min per evaluation).
+The local evaluation via evaluate_checks_local takes ~5s.
+
+Fix: when task config has checks defined, try local eval FIRST. Only
+
+fall through to the slow /evaluate endpoint when no local checks exist. This eliminates the 9-minute
+timeout for custom YAML tasks that define their own checks.
+
+Before: evaluate() [9 min] → if 0.0 → local [5s]
+
+After: local [5s] → if no checks → evaluate() [9 min]
+
+Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
+
+
 ## v0.81.2 (2026-03-29)

 ### Bug Fixes
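
The v0.81.3 entry describes a reordering: run the fast local checks first and fall through to the slow /evaluate endpoint only when the task config defines no checks. A minimal sketch of that ordering follows. It is not the code from the PR: `evaluate_local_first`, the `"checks"` key lookup, and the two callables are hypothetical stand-ins for whatever `evaluate_dense` and `evaluate_checks_local` actually do, and the timing constants are only the figures quoted in the changelog.

```python
from typing import Any, Callable, Mapping

# Figures quoted in the changelog entry: 180 s timeout, 3 retries.
REMOTE_TIMEOUT_SECONDS = 180
REMOTE_RETRIES = 3
WORST_CASE_REMOTE_SECONDS = REMOTE_TIMEOUT_SECONDS * REMOTE_RETRIES  # 540 s = 9 min

Evaluator = Callable[[Mapping[str, Any]], float]


def evaluate_local_first(
    task_config: Mapping[str, Any],
    local_eval: Evaluator,
    remote_eval: Evaluator,
) -> tuple[float, str]:
    """Return (score, source), preferring local checks when the task defines any.

    Hypothetical helper: the "checks" key and both callables are assumptions
    made for illustration, not the project's real signatures.
    """
    if task_config.get("checks"):
        # Fast path: the task ships its own checks, so score it locally.
        return local_eval(task_config), "local"
    # Slow path: no local checks exist, so use the remote endpoint.
    return remote_eval(task_config), "remote"


if __name__ == "__main__":
    # Stub evaluators so the sketch runs without any server.
    score, source = evaluate_local_first(
        {"checks": [{"type": "file_exists"}]},
        local_eval=lambda cfg: 1.0,
        remote_eval=lambda cfg: 0.0,
    )
    print(score, source)
```

Passing the evaluators in as callables keeps the ordering logic testable without a running server. A task with an empty or missing check list still reaches the remote path, matching the "if no checks" branch in the entry.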
@@ -4,7 +4,7 @@ build-backend = "hatchling.build"

 [project]
 name = "openadapt-evals"
-version = "0.81.2"
+version = "0.81.3"
 description = "Evaluation infrastructure for GUI agent benchmarks"
 readme = "README.md"
 requires-python = ">=3.10"