Commit db22f6b

chore: release 0.35.1
Author: semantic-release
Parent: 8bd1b43

2 files changed

Lines changed: 22 additions & 1 deletion

CHANGELOG.md

Lines changed: 21 additions & 0 deletions
@@ -1,6 +1,27 @@
 # CHANGELOG
 
 
+## v0.35.1 (2026-03-07)
+
+### Bug Fixes
+
+- Use WAA server for /evaluate instead of fragile socat proxy
+([#115](https://github.com/OpenAdaptAI/openadapt-evals/pull/115),
+[`8bd1b43`](https://github.com/OpenAdaptAI/openadapt-evals/commit/8bd1b439de23d49072e68f457eda6c95f37e4153))
+
+The evaluate endpoint (/evaluate) is already available on the WAA Flask server (port 5000), which is
+accessed via a single reliable SSH tunnel (local:5001 → VM:5000). The separate evaluate chain
+(local:5050 → VM:5051 → socat → docker exec → container:5050) was fragile and caused
+infrastructure failures when socat died mid-trial.
+
+Changes: - Default --evaluate-url to None (falls back to --server URL) - Remove socat proxy setup
+(_setup_eval_proxy) from run_dc_eval.py - Remove port 5050 from SSH tunnel forwarding - Make
+done-gate non-fatal when evaluate returns infrastructure error - All scripts pass --evaluate-url
+only when explicitly set
+
+Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
+
+
 ## v0.35.0 (2026-03-06)
 
 ### Features

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ build-backend = "hatchling.build"
 
 [project]
 name = "openadapt-evals"
-version = "0.35.0"
+version = "0.35.1"
 description = "Evaluation infrastructure for GUI agent benchmarks"
 readme = "README.md"
 requires-python = ">=3.10"
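
Note on the "done-gate" change listed in the CHANGELOG.md diff: the release notes say the done-gate becomes non-fatal when evaluate returns an infrastructure error, but they do not show the implementation. The sketch below is one possible reading, assuming the gate checks the evaluate score when the agent reports it is done and that an infrastructure failure should not abort the trial. `EvaluateInfraError`, `GateDecision`, `done_gate`, and the threshold are hypothetical names and values, not taken from the repository.

```python
from dataclasses import dataclass
from typing import Callable


class EvaluateInfraError(Exception):
    """Hypothetical error: the evaluate call itself failed (not the task)."""


@dataclass
class GateDecision:
    accept_done: bool
    reason: str


def done_gate(evaluate: Callable[[], float], threshold: float = 1.0) -> GateDecision:
    """Decide whether to accept the agent's 'done' signal.

    An infrastructure error is treated as non-fatal: the gate cannot verify
    the task, so it lets the trial finish instead of raising.
    """
    try:
        score = evaluate()
    except EvaluateInfraError as exc:
        # Assumed policy: accept 'done' and record why verification was skipped.
        return GateDecision(True, f"evaluate unavailable: {exc}")
    if score >= threshold:
        return GateDecision(True, "task verified")
    return GateDecision(False, "score below threshold")
```

The point of the design is that a broken evaluation channel is reported as a reason on the decision rather than as an exception, so a transient outage does not count as a failed trial.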
