Commit b234a42
feat(agents): implement Qwen3VL agent with demo-conditioned inference
Full BenchmarkAgent implementation for Qwen3-VL models with:
- Action parsing for all 9 action types (click, double_click, right_click,
type, press, scroll, drag, wait, finished)
- Coordinate denormalization from Qwen [0,1000] to BenchmarkAction [0,1]
- Think block extraction and support
- Demo injection at every step for demo-conditioned inference
- Action history tracking across steps
- Lazy model loading via transformers
- System prompt aligned with openadapt-ml SFT training data
71 tests covering action parsing, coordinate math, demo injection,
think blocks, reset behavior, imports, and edge cases.
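
For the think-block behavior covered by the tests, a rough sketch of one possible extraction step follows. It assumes the reasoning is wrapped in `<think>...</think>` tags; the function name `split_think` and the return shape are hypothetical and are not taken from this commit.

```python
import re

# Assumes reasoning is delimited by <think>...</think>; non-greedy so that
# multiple blocks in one response are captured separately.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_think(text: str) -> tuple[str, str]:
    """Return (thoughts, remainder) for a raw model response.

    thoughts: all think-block contents joined by newlines ("" if none).
    remainder: the response with think blocks removed, ready for action parsing.
    """
    thoughts = [block.strip() for block in THINK_RE.findall(text)]
    remainder = THINK_RE.sub("", text).strip()
    return "\n".join(thoughts), remainder
```

Separating the reasoning first keeps the action parser from matching text that only appears inside the model's deliberation.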
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent d7e1292
2 files changed
Lines changed: 716 additions & 323 deletions