You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: src/askui/computer_agent.py
+66-4Lines changed: 66 additions & 4 deletions
Original file line number
Diff line number
Diff line change
@@ -53,18 +53,36 @@ class ComputerAgent(Agent):
53
53
This agent can perform various UI interactions like clicking, typing, scrolling, and more.
54
54
It uses computer vision models to locate UI elements and execute actions on them.
55
55
56
+
A single `ComputerAgent` can drive **one or more machines** through the
57
+
`agent_os_servers` argument. Each entry is an Agent OS server (local
58
+
subprocess or remote gRPC endpoint) identified by a stable `computer_id`.
59
+
At any moment one server is *active* and receives all explicit calls
60
+
(`click`, `type`, `keyboard`, ...). The active server can be changed at
61
+
runtime via `agent.tools.os.switch_agent_os_server(computer_id)` or
62
+
scoped to a block using `agent.tools.os.temporary_select(computer_id)`.
63
+
The `act()` model is also given list/switch/get-active tools so it can
64
+
orchestrate work across machines on its own (e.g. read something on one
65
+
computer and re-enter it on another).
66
+
56
67
Args:
57
-
display (int, optional): The display number to use for screen interactions. Defaults to `1`.
68
+
display (int, optional): The display number to use for screen interactions on the default local server. Ignored when `agent_os_servers` is provided. Defaults to `1`.
58
69
reporters (list[Reporter] | None, optional): List of reporter instances for logging and reporting. If `None`, an empty list is used.
Agent OS servers used by the default `AskUiControllerClient`. Must contain
61
-
at least one server, at most one local, and remote addresses must be unique.
62
-
Defaults to a single local Agent OS server.
71
+
Agent OS servers the agent can route actions to. May mix one
72
+
`LocalAgentOsServer` (managing a controller subprocess on this
73
+
machine) with any number of `RemoteAgentOsServer`s pointing at
74
+
controllers already running on other machines. Constraints: at
75
+
least one server, at most one local, and remote `address`es plus
76
+
all `computer_id`s must be unique. The first entry becomes the
77
+
initial active server. Defaults to a single local server bound to
78
+
`display`.
63
79
settings (AgentSettings | None, optional): Provider-based model settings. If `None`, uses the default AskUI model stack.
64
80
retry (Retry, optional): The retry instance to use for retrying failed actions. Defaults to `ConfigurableRetry` with exponential backoff. Currently only supported for `locate()` method.
65
81
act_tools (list[Tool] | None, optional): Additional tools to make available for the `act()` method.
66
82
67
83
Example:
84
+
Single local machine (the default):
85
+
68
86
```python
69
87
from askui import ComputerAgent
70
88
@@ -73,6 +91,50 @@ class ComputerAgent(Agent):
73
91
agent.type("Hello World")
74
92
agent.act("Open settings menu")
75
93
```
94
+
95
+
Example:
96
+
Research on one machine and write up the findings on another. The
97
+
first server in the list is the active one; `temporary_select`
98
+
re-routes a block of explicit calls and restores the previous
99
+
active server on exit.
100
+
101
+
```python
102
+
from askui import ComputerAgent
103
+
from askui.tools.askui import LocalAgentOsServer, RemoteAgentOsServer
104
+
105
+
with ComputerAgent(
106
+
agent_os_servers=[
107
+
LocalAgentOsServer(computer_id="research-box"),
108
+
RemoteAgentOsServer(
109
+
address="192.168.1.42:26000",
110
+
description="Writer box with a text editor open",
111
+
computer_id="writer-box",
112
+
),
113
+
],
114
+
) as agent:
115
+
agent.act(
116
+
"On research-box, open a browser, google 'askui', and read "
117
+
"the top results to gather key facts about what AskUI is, "
118
+
"what it does, and notable features. Then switch to "
119
+
"writer-box and write a Markdown document titled "
120
+
"'AskUI Findings' summarizing those facts as a bulleted "
121
+
"list in the open text editor."
122
+
)
123
+
```
124
+
125
+
Example:
126
+
Register a remote machine at runtime:
127
+
128
+
```python
129
+
from askui import ComputerAgent
130
+
131
+
with ComputerAgent() as agent:
132
+
agent.tools.os.add_remote_agent_os_server(
133
+
address="10.0.0.5:26000",
134
+
description="Build server",
135
+
)
136
+
agent.act("Kick off a release build on the build server")
0 commit comments