v0.24.0
🎉 Overview
v0.24.0 is a major architectural release that fundamentally refactors how custom models can be integrated into the AskUI SDK. The centerpiece is the new Bring-Your-Own-Model-Provider system: instead of configuring models via string identifiers and using a complicated ModelRouter abstraction, you can now simply plug typed provider instances directly into AgentSettings. Three clean interfaces, VlmProvider, ImageQAProvider, and DetectionProvider, make it straightforward to swap in your own model backends for acting, querying, and locating UI elements. Built-in providers for Anthropic and Google are included alongside the AskUI defaults.
🚨 Breaking Changes
- Model Provider Overhaul: We removed the `ModelRouter` and `ModelRegistry` and replaced them with a new `model_providers` architecture. You can now bring your own model providers through three typed interfaces: `VlmProvider` (for `act`), `ImageQAProvider` (for `get`), and `DetectionProvider` (for `locate`). Built-in providers include `AnthropicVlmProvider`, `GoogleImageQAProvider`, `AskUIVlmProvider`, and more. Please see the new Bring Your Own Model Provider docs for detailed instructions.

  Migration: Replace `model`/`models` constructor parameters with the new `settings: AgentSettings` parameter:

      # Before
      agent = VisionAgent(model="claude-sonnet-4-20250514")

      # After
      from askui import ComputerAgent, AgentSettings
      from askui.model_providers import AnthropicVlmProvider

      agent = ComputerAgent(settings=AgentSettings(
          vlm_provider=AnthropicVlmProvider(model_id="claude-sonnet-4-20250514"),
      ))

- `VisionAgent` renamed to `ComputerAgent`: The main agent class is now `ComputerAgent`. `VisionAgent` still works but emits a `DeprecationWarning`. Similarly, `AndroidVisionAgent` is now `AndroidAgent`.
- `click()`/`mouse_move()` `model` parameter replaced: The `model` parameter on `click()`, `mouse_move()`, and `locate()` has been replaced by `locate_settings: LocateSettings` for controlling resolution and other locate options.
- `betas` parameter removed from `MessageSettings`: The Anthropic-specific `betas` parameter was replaced with a generic `provider_options: dict[str, Any]` field. To pass betas, use `provider_options={"betas": [...]}`.
- Chat API removed: The Chat API (`src/askui/chat/`) has been removed from the package along with its dependencies (`sqlalchemy`, `alembic`, `fastapi`, `uvicorn`, `apscheduler`, etc.).
- `pynput` AgentOs backend removed: The `PynputAgentOs` implementation and the `askui[pynput]` optional dependency group have been removed. Use the default `AskUiControllerClient` (gRPC) backend instead.
- UITars model removed: The `UITars` model integration (`src/askui/models/ui_tars_ep/`) has been removed.
- OpenAI integration removed: The OpenAI-compatible model provider (`src/askui/models/openai/`) has been removed. Use the new provider interfaces for custom model integrations.
- `ModelComposition` and `ModelDefinition` removed: These classes have been replaced by the new provider system.
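
Because the old `VisionAgent` name still resolves, leftover usages are easy to miss during migration. One way to surface them is to promote `DeprecationWarning` to an error while running your tests. The sketch below is self-contained and only mimics the behavior described above: the `ComputerAgent` and `VisionAgent` classes in it are hypothetical stand-ins, not the SDK classes, and `uses_deprecated_api` is an illustrative helper name.

```python
import warnings


class ComputerAgent:
    """Hypothetical stand-in for the renamed agent class (not the SDK class)."""

    def __init__(self, settings=None):
        self.settings = settings


class VisionAgent(ComputerAgent):
    """Hypothetical deprecated alias that warns when instantiated."""

    def __init__(self, *args, **kwargs):
        warnings.warn(
            "VisionAgent is deprecated, use ComputerAgent instead",
            DeprecationWarning,
            stacklevel=2,
        )
        super().__init__(*args, **kwargs)


def uses_deprecated_api(factory) -> bool:
    """Return True if calling `factory` triggers a DeprecationWarning."""
    with warnings.catch_warnings():
        # Promote deprecation warnings to exceptions inside this block only.
        warnings.simplefilter("error", DeprecationWarning)
        try:
            factory()
        except DeprecationWarning:
            return True
    return False


if __name__ == "__main__":
    print("VisionAgent deprecated:", uses_deprecated_api(VisionAgent))
    print("ComputerAgent deprecated:", uses_deprecated_api(ComputerAgent))
```

The same effect is available for a whole run without code changes by starting the interpreter or pytest with `-W error::DeprecationWarning`.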
✨ New Features
- `AgentSettings` for centralized configuration: A new `AgentSettings` class provides a clean, typed configuration surface for agents with three provider slots: `vlm_provider`, `image_qa_provider`, and `detection_provider`, each with sensible AskUI defaults.
- Bring-Your-Own-Model-Provider: Three abstract provider interfaces (`VlmProvider`, `ImageQAProvider`, `DetectionProvider`) allow users to plug in their own models. Built-in implementations:
  - `AskUIVlmProvider`, `AskUIImageQAProvider`, `AskUIDetectionProvider` (defaults)
  - `AnthropicVlmProvider`, `AnthropicImageQAProvider` (direct Anthropic API)
  - `GoogleImageQAProvider` (direct Google Gemini API)
- `mouse_movement` accepts a `duration` parameter to control mouse movement speed (in milliseconds, default: 500 ms) by @philipph-askui in #233
- Time and wait tools added to universal tool store by @mlikasam-askui in #234:
  - `GetCurrentTimeTool`: returns current date/time for time-aware agent decisions
  - `WaitTool`: pauses execution for a specified duration
  - `WaitWithProgressTool`: waits with a visual progress bar
  - `WaitUntilConditionTool`: polls a condition with configurable interval and timeout
- `LocateSettings` and `GetSettings` exposed in public API: Users can now control per-call locate/get behavior including `resolution`, `max_tokens`, `temperature`, and `system_prompt`.
- `FallbackLocateModel` and `FallbackGetModel`: New utility classes that try multiple models in sequence until one succeeds, replacing the old `ModelComposition` pattern.
- `get` and `locate` tools in act loop: The LLM can now use `get` and `locate` as tools during `act()` calls (only when an `AgentOs` is available).
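
The fallback classes are described as trying several models in sequence until one succeeds. The sketch below illustrates that pattern in plain Python. `Locator`, `FallbackLocator`, `flaky_locator`, and `fixed_locator` are hypothetical names invented for this example, and the actual `FallbackLocateModel` constructor and call signature may well differ.

```python
from typing import Callable, Sequence

# Hypothetical type for this sketch: a locator maps a description to (x, y).
Locator = Callable[[str], tuple[int, int]]


class FallbackLocator:
    """Try each locator in order and return the first successful result."""

    def __init__(self, locators: Sequence[Locator]) -> None:
        if not locators:
            raise ValueError("at least one locator is required")
        self._locators = list(locators)

    def __call__(self, description: str) -> tuple[int, int]:
        errors: list[Exception] = []
        for locator in self._locators:
            try:
                return locator(description)
            except Exception as exc:  # any failure moves on to the next one
                errors.append(exc)
        # Every locator failed; report the collected errors together.
        raise RuntimeError(f"all {len(errors)} locators failed: {errors}")


def flaky_locator(description: str) -> tuple[int, int]:
    """Stand-in for a backend that cannot find the element."""
    raise LookupError(f"could not find {description!r}")


def fixed_locator(description: str) -> tuple[int, int]:
    """Stand-in for a backend that always answers with the same point."""
    return (10, 20)


if __name__ == "__main__":
    locate = FallbackLocator([flaky_locator, fixed_locator])
    print(locate("Submit button"))
```

Ordering matters in this design: placing the cheapest or most reliable backend first keeps the common case fast, and later entries are only consulted after a failure.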
🐛 Bug Fixes
- Fixed agent crash without AgentOs: `get` and `locate` tools are now only added to the act loop when `agent_os` is set. Agents used without an AgentOs (e.g., pure LLM pipelines) no longer crash on `act()`. by @philipph-askui in #237
- Fixed OpenTelemetry import errors: `opentelemetry-sdk` is now a default dependency. Instrumentor imports (`FastAPIInstrumentor`, `HTTPXClientInstrumentor`, etc.) are safely guarded with `try/except` so installing without `[otel]` extras no longer causes import failures. by @philipph-askui in #238
- Fixed typechecking issue in `not_given.py`: added `@final` decorator to resolve mypy ambiguity.
- Fixed `Display` default value for `name` parameter in AgentOS (was raising an error when executing from cache).
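
The OpenTelemetry fix relies on guarding optional imports. A minimal sketch of that pattern follows. It does not reproduce the SDK's internal code: `optional_import` and `instrumentation_available` are hypothetical helpers, and `importlib` is used so that the example runs whether or not the instrumentor package is installed. The module path `opentelemetry.instrumentation.fastapi` is the one published by the upstream instrumentor package.

```python
import importlib
from typing import Any, Optional


def optional_import(module_name: str, attribute: str) -> Optional[Any]:
    """Return `attribute` from `module_name`, or None if it is unavailable."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The optional package is not installed; callers skip the feature.
        return None
    return getattr(module, attribute, None)


# Resolves to the class only when the instrumentor package is installed.
FastAPIInstrumentor = optional_import(
    "opentelemetry.instrumentation.fastapi", "FastAPIInstrumentor"
)


def instrumentation_available() -> bool:
    """Report whether the optional instrumentor could be imported."""
    return FastAPIInstrumentor is not None


if __name__ == "__main__":
    print("FastAPI instrumentation available:", instrumentation_available())
```

Callers then branch on `instrumentation_available()` instead of importing the instrumentor at module top level, so a missing extra degrades to a skipped feature rather than an import failure.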
📚 Documentation
- Complete restructuring of docs (`00_overview.md` through `10_extracting_data.md`)
- Removed outdated docs for chat API, MCP, and direct tool use
- New Bring Your Own Model Provider guide
- Updated reporting docs to distinguish between execution reports and test reports
- Updated README to reflect new `ComputerAgent` class name, corrected Python version requirement (`>=3.10, <3.14`), and fixed broken links
Dependencies
Removed: openai, fastapi, uvicorn, sqlalchemy, alembic, apscheduler, pynput, mss, structlog, asgi-correlation-id, starlette-context, anyio, bson, aiofiles
Added to core: opentelemetry-sdk>=1.38.0 (promoted from optional chat extras)
Optional extras changed:
- `askui[chat]`: removed
- `askui[pynput]`: removed
- `askui[otel]`: now contains only the instrumentor packages (the base SDK is always available)
- `askui[all]`: now includes `android`, `bedrock`, `tracing`, `vertex`, `web`
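
Since the removed packages are no longer installed transitively, projects that imported any of them without declaring the dependency themselves may break after upgrading. The sketch below is a hypothetical helper, not an SDK utility. It uses the standard library's `importlib.metadata` to report which of the removed distributions are present in the current environment.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

# Distribution names copied from the "Removed" list above.
REMOVED_DEPENDENCIES = [
    "openai", "fastapi", "uvicorn", "sqlalchemy", "alembic", "apscheduler",
    "pynput", "mss", "structlog", "asgi-correlation-id", "starlette-context",
    "anyio", "bson", "aiofiles",
]


def installed_versions(names: list[str]) -> dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None."""
    result: dict[str, Optional[str]] = {}
    for name in names:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            # Not installed in this environment.
            result[name] = None
    return result


if __name__ == "__main__":
    for name, found in installed_versions(REMOVED_DEPENDENCIES).items():
        print(f"{name}: {found or 'not installed'}")
```

When upgrading in place, a package reported as installed may simply be a leftover, because pip does not uninstall dependencies that are no longer required. A fresh virtual environment gives a more reliable picture.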
📝 Full Changelog: v0.23.1...v0.24.0