 # CHANGELOG


+## v0.36.0 (2026-03-16)
+
+### Features
+
+- Add HttpAgent, per-step evaluation, and lightweight trace export
+([#118](https://github.com/OpenAdaptAI/openadapt-evals/pull/118),
+[`e820c0a`](https://github.com/OpenAdaptAI/openadapt-evals/commit/e820c0a52a5adc81bf474f878e8009e7592733ee))
+
+Three platform infrastructure features:
+
+1. HttpAgent (agents/http_agent.py): Generic agent-as-HTTP-service that forwards observations to any
+remote endpoint and parses BenchmarkAction responses. Enables teams to deploy custom agent stacks
+(model + prompt + parsing) as black-box HTTP servers, cleanly solving GPU/CPU separation.
+
+2. Per-step evaluation in RLEnvironment: New evaluate_every_step parameter calls the WAA evaluator
+after each step and populates info["evaluation_score"]. Does NOT change the reward signal;
+training code decides how to use it. Useful for online RL training loops.
+
+3. LightweightTraceExporter: Plain JSON + screenshots trace export with no openadapt-ml dependency.
+Produces episode JSON, manifest, and JSONL training samples in a universal format.
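The agent-as-HTTP-service idea in item 1 can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code in agents/http_agent.py: the `Action` dataclass is a hypothetical stand-in for BenchmarkAction, and the field names, the JSON payload shape, and the function names `build_request`, `parse_action`, and `act` are all assumptions.

```python
import json
import urllib.request
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Action:
    """Hypothetical stand-in for BenchmarkAction; the real fields may differ."""
    type: str
    x: Optional[float] = None
    y: Optional[float] = None
    text: Optional[str] = None


def build_request(endpoint: str, observation: dict[str, Any]) -> urllib.request.Request:
    """Serialize the observation as JSON and wrap it in a POST request."""
    payload = json.dumps(observation).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_action(body: bytes) -> Action:
    """Parse a JSON response body into an Action; reject bodies without a type."""
    data = json.loads(body.decode("utf-8"))
    if "type" not in data:
        raise ValueError("response is missing the 'type' field")
    return Action(
        type=data["type"],
        x=data.get("x"),
        y=data.get("y"),
        text=data.get("text"),
    )


def act(endpoint: str, observation: dict[str, Any], timeout: float = 30.0) -> Action:
    """Send one observation to the remote agent and return its chosen action."""
    request = build_request(endpoint, observation)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_action(response.read())
```

Keeping request construction and response parsing separate from the network call means both halves can be tested without a running server, which is one plausible way to treat the remote agent as a black box.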
+
+All 34 new tests pass. 984 existing tests unaffected.
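The per-step evaluation behavior described in item 2 might look like the toy sketch below. Only the `evaluate_every_step` parameter name and the `info["evaluation_score"]` key come from the changelog; `ToyEnv`, its sparse end-of-episode reward, the evaluator callable, and `run_episode` are hypothetical and do not reflect the real RLEnvironment.

```python
from typing import Any, Callable


class ToyEnv:
    """Toy environment, NOT the real RLEnvironment; it only mimics the described flag."""

    def __init__(
        self,
        evaluator: Callable[[int], float],
        evaluate_every_step: bool = False,
        max_steps: int = 3,
    ) -> None:
        self.evaluator = evaluator
        self.evaluate_every_step = evaluate_every_step
        self.max_steps = max_steps
        self.steps = 0

    def step(self, action: str) -> tuple[int, float, bool, dict[str, Any]]:
        self.steps += 1
        done = self.steps >= self.max_steps
        # Assumption: reward is sparse and computed identically whether or not
        # the flag is set, so the flag never alters the reward signal.
        reward = self.evaluator(self.steps) if done else 0.0
        info: dict[str, Any] = {}
        if self.evaluate_every_step:
            # The extra score is reported only through the info dict.
            info["evaluation_score"] = self.evaluator(self.steps)
        return self.steps, reward, done, info


def run_episode(env: ToyEnv) -> tuple[list[float], list[float]]:
    """Run until done, collecting rewards and any per-step evaluation scores."""
    rewards: list[float] = []
    scores: list[float] = []
    done = False
    while not done:
        _, reward, done, info = env.step("noop")
        rewards.append(reward)
        if "evaluation_score" in info:
            scores.append(info["evaluation_score"])
    return rewards, scores
```

In this sketch a training loop is free to ignore the scores, log them, or fold them into its own shaped reward, which matches the changelog's statement that training code decides how to use them.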
+
+Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
+
+
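The plain-JSON trace export mentioned for LightweightTraceExporter could be approximated as below. The changelog only states that episode JSON, a manifest, and JSONL training samples are produced; the file names, the record fields, and the `export_trace` function are assumptions for illustration, and this sketch records screenshot file names without writing any image data.

```python
import json
from pathlib import Path
from typing import Any


def export_trace(out_dir: Path, episode_id: str, steps: list[dict[str, Any]]) -> dict[str, Any]:
    """Write one episode as JSON, its steps as JSONL samples, and a manifest."""
    out_dir.mkdir(parents=True, exist_ok=True)

    # Episode file: the full step list in a single JSON document.
    episode_path = out_dir / f"{episode_id}.json"
    episode_path.write_text(
        json.dumps({"episode_id": episode_id, "steps": steps}, indent=2),
        encoding="utf-8",
    )

    # Training samples: one self-contained JSON object per line.
    samples_path = out_dir / "samples.jsonl"
    with samples_path.open("w", encoding="utf-8") as handle:
        for index, step in enumerate(steps):
            sample = {
                "episode_id": episode_id,
                "step": index,
                "screenshot": step["screenshot"],
                "action": step["action"],
            }
            handle.write(json.dumps(sample) + "\n")

    # Manifest: an index of what was written, using relative file names.
    manifest = {
        "episodes": [episode_path.name],
        "samples": samples_path.name,
        "num_steps": len(steps),
    }
    (out_dir / "manifest.json").write_text(json.dumps(manifest, indent=2), encoding="utf-8")
    return manifest
```

Because every artifact is plain JSON or JSONL, a consumer needs nothing beyond a JSON parser to read the trace, which is the property the changelog highlights when it notes there is no openadapt-ml dependency.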
 ## v0.35.2 (2026-03-08)

 ### Bug Fixes
@@ -4,7 +4,7 @@ build-backend = "hatchling.build"

 [project]
 name = "openadapt-evals"
-version = "0.35.2"
+version = "0.36.0"
 description = "Evaluation infrastructure for GUI agent benchmarks"
 readme = "README.md"
 requires-python = ">=3.10"