 # CHANGELOG
 
 
+## v0.68.0 (2026-03-23)
+
+### Features
+
+- Add standalone GRPO trainer with WAADirect (no openadapt-ml dependency)
+  ([#191](https://github.com/OpenAdaptAI/openadapt-evals/pull/191),
+  [`ba049f7`](https://github.com/OpenAdaptAI/openadapt-evals/commit/ba049f74f82bb3034465cf4368f4808f22b5f7e9))
+
+Self-contained GRPO training package that eliminates the openadapt-ml dependency for RL training.
+Uses direct HTTP calls to WAA Flask server (WAADirect) instead of the WAALiveAdapter +
+RLEnvironment stack, removing version coupling and adapter indirection.
+
+Package structure (695 LOC total):
+- config.py: TrainingConfig dataclass
+- waa_direct.py: WAADirect HTTP client (screenshot/click/type/key)
+- prompt.py: SYSTEM_PROMPT + build_agent_messages + parse_vlm_output_to_action
+- reward.py: compute_group_advantages + evaluate_milestones_screenshot
+- model_loader.py: load_model_and_processor (HF + PEFT)
+- trainer.py: GRPOTrainer with rollout collection + training loop
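
The changelog names `compute_group_advantages` in reward.py but does not show its body. As a rough illustration only, the group-relative normalization commonly used in GRPO can be sketched as below. The function name `group_advantages`, the `eps` parameter, and the use of the population standard deviation are assumptions, not the package's actual code.

```python
from statistics import mean, pstdev


def group_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Normalize each rollout's reward against the other rollouts in its group.

    Hypothetical helper: the real compute_group_advantages may differ.
    """
    if not rewards:
        return []
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std over the group
    # eps keeps the division defined when every rollout earns the same reward
    return [(r - mu) / (sigma + eps) for r in rewards]
```

With this normalization, a group whose rollouts all receive the same reward yields zero advantages, so such a group contributes no gradient signal.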
+
+Key design decisions:
+- ZERO openadapt-ml imports (self-contained, will migrate later)
+- max_new_tokens=2048 default (100 was catastrophically low)
+- Multi-format parser (Thought/Action, bare DSL, JSON)
+- Fresh screenshot for evaluation (not cached)
+- Per-step backward to avoid OOM on long trajectories
+- VLM judge via OpenAI API for milestone evaluation
+
+Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
+
+
 ## v0.67.0 (2026-03-23)
 
 ### Features
@@ -4,7 +4,7 @@ build-backend = "hatchling.build"
 
 [project]
 name = "openadapt-evals"
-version = "0.67.0"
+version = "0.68.0"
 description = "Evaluation infrastructure for GUI agent benchmarks"
 readme = "README.md"
 requires-python = ">=3.10"