1 | 1 | # CHANGELOG |
2 | 2 |
3 | 3 |
| 4 | +## v0.49.0 (2026-03-20) |
| 5 | + |
| 6 | +### Documentation |
| 7 | + |
| 8 | +- Comprehensive README update for planner-grounder, workflow, and training features |
| 9 | + ([#158](https://github.com/OpenAdaptAI/openadapt-evals/pull/158), |
| 10 | + [`1cb83b3`](https://github.com/OpenAdaptAI/openadapt-evals/commit/1cb83b308717565984cd903a556f035c1135a170)) |
| 11 | + |
| 12 | +Covers ~20 PRs merged since March 17 (#134-#157): PlannerGrounderAgent dual-model architecture, |
| 13 | + TaskConfig YAML custom tasks, 4-pass workflow extraction pipeline, RL training infra (TRL GRPO |
| 14 | + rollout, AReaL workflow, OpenEnv), LocalAdapter + ScrubMiddleware for governed desktop agent, |
| 15 | + correction flywheel, strict mode, and task setup dispatch. Updated architecture tree and key files |
| 16 | + table. |
| 17 | + |
| 18 | +Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
| 19 | + |
| 20 | +### Features |
| 21 | + |
| 22 | +- Add full evaluation runner with resume support and pool integration |
| 23 | + ([#160](https://github.com/OpenAdaptAI/openadapt-evals/pull/160), |
| 24 | + [`ada912d`](https://github.com/OpenAdaptAI/openadapt-evals/commit/ada912d8a7c14532cc26f3b8a62ba3b2769e3996)) |
| 25 | + |
| 26 | +Implement _run_external_agent in pool.py to support PlannerGrounderAgent and other external agents |
| 27 | + across pool VMs via SSH tunnels. Create run_full_eval.py script for robust unattended WAA |
| 28 | + evaluation runs with incremental JSONL checkpointing, per-task error isolation, exponential |
| 29 | + backoff retry on server drops, --resume to continue interrupted runs, --dry-run mode, |
| 30 | + --save-screenshots, progress display with ETA, and --parallel N for distributed pool execution. |
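The checkpoint-and-resume behaviour described in this entry can be sketched roughly as follows. This is a minimal illustration, not the code in run_full_eval.py: the function names, the `task_id`/`status` record fields, the retry count, and the choice to retry only on `ConnectionError` are all assumptions made for the example.

```python
import json
import time
from pathlib import Path


def load_completed(checkpoint_path):
    """Return the task IDs already recorded in a JSONL checkpoint file."""
    done = set()
    path = Path(checkpoint_path)
    if not path.exists():
        return done
    with path.open() as f:
        for line in f:
            line = line.strip()
            if line:
                done.add(json.loads(line)["task_id"])
    return done


def run_with_backoff(fn, retries=3, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying on ConnectionError with delays of base_delay * 2**attempt."""
    for attempt in range(retries + 1):
        try:
            return fn()
        except ConnectionError:
            if attempt == retries:
                raise
            sleep(base_delay * 2 ** attempt)


def run_tasks(task_ids, run_task, checkpoint_path, resume=False, sleep=time.sleep):
    """Run tasks one by one, appending one JSON line per task as it finishes."""
    done = load_completed(checkpoint_path) if resume else set()
    mode = "a" if resume else "w"
    with open(checkpoint_path, mode) as f:
        for task_id in task_ids:
            if task_id in done:
                continue  # already recorded by a previous (interrupted) run
            try:
                result = run_with_backoff(lambda: run_task(task_id), sleep=sleep)
                record = {"task_id": task_id, "status": "ok", "result": result}
            except Exception as exc:
                # Per-task error isolation: record the failure and move on.
                record = {"task_id": task_id, "status": "error", "error": str(exc)}
            f.write(json.dumps(record) + "\n")
            f.flush()  # keep the checkpoint current if the process dies
```

In this sketch a task that ended in an error is still written to the checkpoint, so a resumed run skips it; whether the real script re-runs failed tasks on --resume is not stated in the entry.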
| 31 | + |
| 32 | +Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
| 33 | + |
| 34 | +- Implement install_apps handler via winget for WAA task setup |
| 35 | + ([#159](https://github.com/OpenAdaptAI/openadapt-evals/pull/159), |
| 36 | + [`233295e`](https://github.com/OpenAdaptAI/openadapt-evals/commit/233295e454a310e323eefad6dfcf1656bd4f835c)) |
| 37 | + |
| 38 | +Replace the warning-only stub in _config_entry_to_command with a working implementation that |
| 39 | + installs apps using Windows Package Manager (winget). |
| 40 | + |
| 41 | +- Map 16 common app names (including chrome, firefox, libreoffice, vlc, vscode, 7zip, notepad++, |
| 42 | +  gimp, obs, audacity, paint.net) to winget package IDs |
| 43 | +- Normalize app names (hyphens/spaces to underscores) to handle WAA config inconsistencies (e.g. "libreoffice-calc" vs "libreoffice_calc") |
| 44 | +- Deduplicate installs (e.g. libreoffice_calc + libreoffice_writer both map to TheDocumentFoundation.LibreOffice) |
| 45 | +- For unknown apps, fall back to winget search and install the first match |
| 46 | +- Collect failures without crashing; each app install is independent |
| 47 | +- Use a 600s HTTP timeout for install_apps (vs the 120s default), since winget installs can take several minutes |
| 48 | +- Accept both success (rc=0) and already-installed (rc=-1978335189) |
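The name normalization, deduplication, and return-code handling listed above might look something like the sketch below. It is illustrative only: the mapping holds just the LibreOffice ID that the entry itself mentions, the helper names are hypothetical, and lowercasing the app name is an assumption beyond the hyphen/space normalization described.

```python
# Illustrative subset of the mapping; only the LibreOffice ID is taken
# from the changelog entry. The real table covers 16 app names.
WINGET_IDS = {
    "libreoffice_calc": "TheDocumentFoundation.LibreOffice",
    "libreoffice_writer": "TheDocumentFoundation.LibreOffice",
}

# Return codes treated as success: 0, and the "already installed" code
# quoted in the changelog entry.
ALREADY_INSTALLED_RC = -1978335189
OK_RETURN_CODES = {0, ALREADY_INSTALLED_RC}


def normalize_app_name(name):
    """Map hyphens and spaces to underscores (lowercasing is an assumption)."""
    return name.strip().lower().replace("-", "_").replace(" ", "_")


def resolve_packages(app_names):
    """Return (deduplicated package IDs in first-seen order, unknown app names)."""
    seen = set()
    package_ids = []
    unknown = []
    for raw in app_names:
        key = normalize_app_name(raw)
        package_id = WINGET_IDS.get(key)
        if package_id is None:
            unknown.append(key)  # candidate for the winget search fallback
        elif package_id not in seen:
            seen.add(package_id)
            package_ids.append(package_id)
    return package_ids, unknown


def install_succeeded(return_code):
    """Treat both a fresh install and an already-installed app as success."""
    return return_code in OK_RETURN_CODES
```

For example, `resolve_packages(["libreoffice-calc", "LibreOffice Writer"])` yields a single package ID, so the suite would be installed once rather than twice.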
| 49 | + |
| 50 | +Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
| 51 | + |
| 52 | + |
4 | 53 | ## v0.48.5 (2026-03-20) |
5 | 54 |
6 | 55 | ### Bug Fixes |