# CHANGELOG

+ ## v0.35.1 (2026-03-07)
+
+ ### Bug Fixes
+
+ - Use WAA server for /evaluate instead of fragile socat proxy
+   ([#115](https://github.com/OpenAdaptAI/openadapt-evals/pull/115),
+   [`8bd1b43`](https://github.com/OpenAdaptAI/openadapt-evals/commit/8bd1b439de23d49072e68f457eda6c95f37e4153))
+
+ The evaluate endpoint (/evaluate) is already available on the WAA Flask server (port 5000), which is
+ accessed via a single reliable SSH tunnel (local:5001 → VM:5000). The separate evaluate chain
+ (local:5050 → VM:5051 → socat → docker exec → container:5050) was fragile and caused
+ infrastructure failures when socat died mid-trial.
+
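The single-tunnel layout described above can be sketched as follows. This is only an illustration under stated assumptions: the helper names `build_tunnel_command` and `endpoint_urls` and the SSH target `azureuser@waa-vm` are hypothetical and do not come from the repository. Only the port numbers (local 5001, VM 5000) and the `/evaluate` path are taken from the changelog entry.

```python
import shlex


def build_tunnel_command(ssh_target: str, local_port: int = 5001,
                         remote_port: int = 5000) -> list[str]:
    """Build an ssh command that forwards local_port to remote_port on the VM.

    -N tells ssh not to run a remote command; -L sets up the local forward.
    """
    forward = f"{local_port}:localhost:{remote_port}"
    return ["ssh", "-N", "-L", forward, ssh_target]


def endpoint_urls(local_port: int = 5001) -> dict[str, str]:
    """Both the server and /evaluate are reached through the same tunnel."""
    base = f"http://localhost:{local_port}"
    return {"server": base, "evaluate": f"{base}/evaluate"}


# Hypothetical SSH target, used purely for illustration.
cmd = build_tunnel_command("azureuser@waa-vm")
print(shlex.join(cmd))
print(endpoint_urls()["evaluate"])
```

With one forward there is only one process whose failure can break evaluation, which is the reliability argument the entry makes against the socat chain.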
+ Changes:
+ - Default --evaluate-url to None (falls back to --server URL)
+ - Remove socat proxy setup (_setup_eval_proxy) from run_dc_eval.py
+ - Remove port 5050 from SSH tunnel forwarding
+ - Make done-gate non-fatal when evaluate returns infrastructure error
+ - All scripts pass --evaluate-url only when explicitly set
+
+ Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
+
+
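The `--evaluate-url` fallback listed in the changes above might look roughly like the sketch below. It is not the project's actual code: the function names `build_parser`, `resolve_evaluate_url`, and `child_argv` are hypothetical, and the `--server` default of `http://localhost:5001` is an assumption inferred from the tunnel described in the entry. Only the two flag names and the `None` default come from the changelog.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Illustrative eval runner")
    # Assumed default, based on the local:5001 tunnel in the changelog entry.
    parser.add_argument("--server", default="http://localhost:5001")
    # None means "not set by the user"; the server URL is used instead.
    parser.add_argument("--evaluate-url", default=None)
    return parser


def resolve_evaluate_url(args: argparse.Namespace) -> str:
    """Fall back to the server URL when --evaluate-url was not given."""
    if args.evaluate_url is not None:
        return args.evaluate_url
    return args.server


def child_argv(args: argparse.Namespace) -> list[str]:
    """Forward --evaluate-url to a child script only when explicitly set."""
    argv = ["--server", args.server]
    if args.evaluate_url is not None:
        argv += ["--evaluate-url", args.evaluate_url]
    return argv


default_args = build_parser().parse_args([])
explicit_args = build_parser().parse_args(
    ["--evaluate-url", "http://localhost:5050"]
)
print(resolve_evaluate_url(default_args))
print(child_argv(explicit_args))
```

Using `None` rather than a hard-coded URL as the default lets wrapper scripts tell "unset" apart from "set to the default value", so they never pin a child process to a stale evaluate endpoint.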
## v0.35.0 (2026-03-06)

### Features
@@ -4,7 +4,7 @@ build-backend = "hatchling.build"

[project]
name = "openadapt-evals"
- version = "0.35.0"
+ version = "0.35.1"
description = "Evaluation infrastructure for GUI agent benchmarks"
readme = "README.md"
requires-python = ">=3.10"