README.md: 102 additions & 9 deletions
@@ -16,23 +16,33 @@ Device automation CLI for AI agents. Mobile, TV, and desktop apps.
16
16
17
17
`agent-device` lets coding agents run real apps, inspect UI state, interact with visible elements, and collect debugging evidence through one CLI.
18
18
19
-
It is built around token-efficient accessibility snapshots, not pixel-first screenshots. Agents read compact UI trees, locate elements through refs like `@e3`, perform touch and text actions, and capture screenshots, video, logs, network, perf, and React profiles only when evidence is needed.
19
+
It is built around token-efficient accessibility snapshots, not pixel-first screenshots. Agents read compact UI trees, locate elements through refs like `@e3`, perform touch and text actions, and capture screenshots, video, logs, network, CPU/memory/perf, crash-related logs, and React profiles only when evidence is needed.
20
+
21
+
Agents can ingest the current docs from [llms-full.txt](https://incubator.callstack.com/agent-device/llms-full.txt). The installed CLI help remains authoritative for exact command syntax.
20
22
21
23
## Agentic QA And Development
22
24
23
-
- **Quality Assurance**: dogfood flows, validate PR builds, check accessibility coverage, capture evidence, and turn stable explorations into `.ad` e2e tests.
24
-
- **Development**: build from specs, reproduce crashes and support issues, inspect logs/network/perf data, and iterate until the UI matches the work.
25
+
- **Quality Assurance**: dogfood flows, validate PR builds, check accessibility coverage, and turn stable explorations into `.ad` e2e tests.
26
+
- **Development**: build from specs, inspect real runtime behavior, and iterate until the UI matches the work.
27
+
28
+
`agent-device` closes the agentic development loop: agents can write code, run the real app, verify the UI end-to-end, collect screenshots/videos/logs/perf evidence, and feed bugs, crashes, or performance findings back into the next fix iteration before a human reviews the PR.
29
+
30
+

25
31
26
32
If you know Vercel's [agent-browser](https://github.com/vercel-labs/agent-browser), this is the same idea for apps and devices.
27
33
28
-

34
+
Use it for AI mobile testing, AI QA for React Native and Expo apps, iOS Simulator automation, Android Emulator automation, tvOS/Android TV checks, and desktop app verification from coding agents. Humans install and configure `agent-device`; agents run the workflows.
35
+
36
+

37
+
38
+
Demo: Codex uses `agent-device` to inspect iOS Contacts through accessibility snapshots, interact with visible UI, and create a contact from a simple prompt.
29
39
30
40
## Quick Start
31
41
32
42
Install the CLI first:
33
43
34
44
```bash
35
-
npm install -g agent-device
45
+
npm install -g agent-device@latest
36
46
agent-device --version
37
47
agent-device help workflow
38
48
```
@@ -41,9 +51,22 @@ The CLI help is the source of truth for agents and is shipped with the installed
41
51
42
52
If you install skills separately, keep the CLI on `agent-device >= 0.14.0`. Older CLIs do not include the workflow help topics that the router skills expect.
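The version floor above can be checked before an agent session starts. The following is a minimal sketch, assuming `agent-device --version` prints a string that contains a `major.minor.patch` number (the exact output format is not stated here). The `version_ge` helper is hypothetical and not part of the CLI.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of agent-device: succeeds when version $1 >= version $2.
# Compares numeric major.minor.patch only; pre-release/build suffixes are ignored.
version_ge() {
  local IFS=.
  local -a have=(${1%%[-+]*}) want=(${2%%[-+]*})
  local i h w
  for i in 0 1 2; do
    h=$((10#${have[i]:-0}))
    w=$((10#${want[i]:-0}))
    if (( h > w )); then return 0; fi
    if (( h < w )); then return 1; fi
  done
  return 0
}

required="0.14.0"
if command -v agent-device >/dev/null 2>&1; then
  # Assumption: the first x.y.z token in the output is the CLI version.
  installed="$(agent-device --version 2>/dev/null | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1)"
  if [ -n "$installed" ] && version_ge "$installed" "$required"; then
    echo "agent-device $installed satisfies >= $required"
  else
    echo "agent-device ${installed:-unknown} is older than $required; run: npm install -g agent-device@latest" >&2
  fi
else
  echo "agent-device not found; run: npm install -g agent-device@latest" >&2
fi
```

The comparison is done in plain bash so it does not depend on `sort -V` being available on the host.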
43
53
54
+
### AI Agent Entry Points
55
+
56
+
- **Agent + terminal**: in Cursor, Codex, Claude Code, Windsurf, and similar clients, run `agent-device` in the integrated terminal. Start planning with `agent-device help workflow`; CLI help is authoritative.
- **Skills or rules**: install the skill with `npx skills add callstackincubator/agent-device`, use the bundled [agent-device skill](skills/agent-device/SKILL.md), or mirror it as a thin project rule, so the agent checks the installed version and reads `agent-device help workflow` before acting.
- **MCP router**: use `agent-device mcp` when an MCP-aware client needs install, status, and version-matched help discovery. MCP is intentionally a thin router; device automation still runs through CLI commands.
59
+
60
+
For client-specific setup, see [AI Agent Setup](https://incubator.callstack.com/agent-device/docs/agent-setup). For agent-readable docs, use [llms-full.txt](https://incubator.callstack.com/agent-device/llms-full.txt).
61
+
44
62
### MCP Router
45
63
46
-
`agent-device` also ships an official stdio MCP router for discovery-oriented clients. It exposes only `status`, `install`, and `help` tools plus workflow prompts/resources; device automation still runs through the CLI commands returned by version-matched help.
64
+
`agent-device` ships an official stdio MCP router for discovery-oriented clients. It exposes only `status`, `install`, and `help` tools plus workflow prompts/resources; it does not expose device automation or generic shell execution over MCP.
65
+
66
+
Paste one of these into clients that accept `mcpServers`, such as Cursor project `.cursor/mcp.json` or user-level MCP settings.
67
+
68
+
<details>
69
+
<summary>Global install MCP config</summary>
47
70
48
71
```json
49
72
{
@@ -56,6 +79,24 @@ If you install skills separately, keep the CLI on `agent-device >= 0.14.0`. Olde
56
79
}
57
80
```
58
81
82
+
</details>
83
+
84
+
<details>
85
+
<summary>No global install MCP config</summary>
86
+
87
+
```json
{
  "mcpServers": {
    "agent-device": {
      "command": "npx",
      "args": ["-y", "agent-device@latest", "mcp"]
    }
  }
}
```
97
+
98
+
</details>
99
+
59
100
Registry metadata uses MCP name `io.github.callstackincubator/agent-device`, npm package `agent-device`, stdio transport, `mcpName` package verification, `server.json`, and `smithery.yaml`.
60
101
61
102
```bash
@@ -91,20 +132,69 @@ agent-device close
91
132
92
133
Snapshots assign refs like `@e1`, `@e2`, and `@e3` to current-screen elements. Refs from the default snapshot are immediately actionable; for hidden content, scroll and re-snapshot.
93
134
135
+
### First 5 Minutes: Expo Test App
136
+
137
+
Use the bundled Expo fixture when you want a concrete first agent run with setup checks, screenshots, replay, and performance evidence. This path requires a repo checkout because `examples/test-app` and the `pnpm test-app:*` scripts are not included in the published npm package.
Use agent-device to dogfood the bundled Expo app and produce an evidence-backed report.
159
+
160
+
Setup:
161
+
- Read `agent-device help workflow`, `agent-device help dogfood`, `agent-device help debugging`, and `agent-device help react-devtools` before planning commands.
162
+
- Confirm the test app setup commands were run: `pnpm test-app:install`, `cd examples/test-app && npx expo-doctor@latest`, then `pnpm test-app:ios` or `pnpm test-app:android`.
163
+
- If Metro prints an Expo URL, prefer opening the shell with that URL. On iOS use `agent-device open "Expo Go" <url> --platform ios`; on Android use the visible Expo/dev-client target or URL. Confirm the app UI with `snapshot -i`.
164
+
165
+
Run:
166
+
- Create `./dogfood-output/screenshots`, `./dogfood-output/videos`, `./dogfood-output/traces`, `./dogfood-output/perf`, and `./dogfood-output/replays`.
167
+
- Open a named session `expo-qa` and save a replay script to `./dogfood-output/replays/expo-test.ad`.
168
+
- Use command shapes like `agent-device --session expo-qa open "Expo Go" <url> --platform ios --save-script ./dogfood-output/replays/expo-test.ad`, `agent-device --session expo-qa screenshot ./dogfood-output/screenshots/home.png`, `agent-device --session expo-qa perf --json > ./dogfood-output/perf/baseline.json`, and `agent-device --session expo-qa record start ./dogfood-output/videos/checkout.mp4`.
169
+
- Capture a baseline `snapshot -i`, screenshot, and `perf --json` sample.
170
+
- Exercise Home, Catalog, product detail, Checkout, and Settings. Re-snapshot after each mutation and use refs/selectors from fresh snapshots.
171
+
- Capture at least one overlay-ref screenshot, one normal screenshot, one short video recording for a meaningful flow, logs marks around any issue, and trace output if a runtime symptom needs diagnostics.
172
+
- Run focused performance checks: compare `perf --json` before and after a navigation or form flow; if React DevTools connects, capture profile slow/rerender output. If it cannot connect, include the status and continue.
173
+
- Close the session so the `.ad` replay is written.
174
+
175
+
Report:
176
+
- Write `./dogfood-output/report.md`.
177
+
- Link every screenshot, video, trace, log path, replay file, and performance artifact you used.
178
+
- Include setup results, platform/device, Expo doctor outcome, coverage, severity counts, findings with repro commands, and a short performance section summarizing startup/CPU/memory/frame-health or React profile findings.
179
+
- If no issues are found, report covered flows and residual risk instead of claiming the app is bug-free.
180
+
```
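The artifact layout named in the prompt can be prepared up front so that every capture command has a destination. This is a sketch only: `prepare_dogfood_dirs` is a hypothetical helper, and the commented command shapes are copied from the prompt rather than verified against a specific CLI version.

```shell
#!/usr/bin/env bash
# Hypothetical helper: create the artifact directories listed in the prompt.
prepare_dogfood_dirs() {
  local root="${1:-./dogfood-output}"
  local dir
  for dir in screenshots videos traces perf replays; do
    mkdir -p "$root/$dir"
  done
}

prepare_dogfood_dirs "${DOGFOOD_ROOT:-./dogfood-output}"

# Command shapes from the prompt, to run once the app is open in session expo-qa:
#   agent-device --session expo-qa screenshot ./dogfood-output/screenshots/home.png
#   agent-device --session expo-qa perf --json > ./dogfood-output/perf/baseline.json
#   agent-device --session expo-qa record start ./dogfood-output/videos/checkout.mp4
```

`DOGFOOD_ROOT` is an assumed environment variable used here only to make the output location overridable.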
181
+
94
182
## Where To Run agent-device
95
183
96
184
| Path | Best for | Start with |
97
185
| --- | --- | --- |
98
186
| Local | Exploration, debugging, and development loops on simulators, emulators, physical devices, macOS apps, and Linux desktop targets. | Follow the Quick Start. |
99
187
| CI/CD | Automated PR and merge validation with replay scripts and captured artifacts. | Start with the [EAS workflow template](https://github.com/callstackincubator/eas-agent-device/blob/main/.eas/workflows/agent-qa-mobile.yml). GitHub Actions template coming soon. |
100
-
| Cloud | Linux runners, managed devices, and remote execution. | Use [Agent Device Cloud](https://agent-device.dev/cloud) or [contact Callstack](mailto:hello@callstack.com) for team-scale QA. |
188
+
| Cloud / remote execution | Linux runners, managed devices, and remote execution. | Use [Agent Device Cloud](https://agent-device.dev/cloud), see [Commands](https://incubator.callstack.com/agent-device/docs/commands) for remote profiles, or [contact Callstack](mailto:hello@callstack.com) for team-scale QA. |
101
189
102
190
## Capabilities
103
191
104
192
- **Platforms**: iOS, Android, tvOS, Android TV, macOS, and Linux. Real devices and simulators are supported.
- **Agent-native UI model**: token-efficient accessibility snapshots, current-screen refs for exploration, selectors for durable replay, and skill-tested workflow guidance.
- **Capture and debug**: screenshots, video, logs, network traffic, CPU/memory/performance data, crash-related logs, accessibility snapshots, and React render profiles.
- **Produce**: replayable `.ad` scripts (recorded replay files that run locally or in CI), e2e test runs, snapshot and screenshot diffs, and debugging artifacts.
- **React Native and Expo**: component tree inspection, props/state/hooks, and render profiling.
- **MCP boundary**: discovery and help over MCP; app/device control through the CLI for explicit, auditable commands.
- **License**: MIT. Free to use.
109
199
110
200
## How It Works
@@ -120,10 +210,13 @@ Used by teams and developers at Callstack, Expensify, Shopify, Kindred, Total Wi
@@ -139,4 +232,4 @@ See [CONTRIBUTING.md](CONTRIBUTING.md).
139
232
140
233
## Made at Callstack
141
234
142
-
agent-device is open source and MIT licensed. Try the [EAS workflow template](https://github.com/callstackincubator/eas-agent-device/blob/main/.eas/workflows/agent-qa-mobile.yml), use [Agent Device Cloud](https://agent-device.dev/cloud), or contact us at hello@callstack.com.
235
+
agent-device is open source and MIT licensed. Visit [agent-device.dev](https://agent-device.dev/), try the [EAS workflow template](https://github.com/callstackincubator/eas-agent-device/blob/main/.eas/workflows/agent-qa-mobile.yml), read the [incubator docs](https://incubator.callstack.com/agent-device/), or contact us at hello@callstack.com.
description: Configure Cursor, Codex, Claude Code, Windsurf, Cline, Goose, skills, and MCP for agent-device mobile, TV, and desktop app verification.
4
+
---
5
+
6
+
# AI Agent Setup
7
+
8
+
`agent-device` is built for AI agents, but humans usually install it, grant device permissions, and decide which agent client should use it.
9
+
10
+
Use this page to wire Cursor, Codex, Claude Code, Windsurf, Cline, Goose, or another coding agent into mobile, TV, and desktop app verification. It covers skills, project rules, and MCP setup for React Native QA, Expo app verification, iOS Simulator automation, Android Emulator automation, tvOS checks, Android TV checks, debugging, profiling, and exploratory QA.
11
+
12
+
The short version: install the CLI, make the agent read version-matched help, and let the agent run CLI commands in a terminal. MCP is available for discovery and help, not broad device control.
13
+
14
+
## Prerequisite: install the CLI
15
+
16
+
```bash
npm install -g agent-device@latest
agent-device --version
agent-device help workflow
```
21
+
22
+
For one-off use without a global install:
23
+
24
+
```bash
npx -y agent-device@latest --version
npx -y agent-device@latest help workflow
```
28
+
29
+
Global install is better for normal agent workflows because repeated commands, skills, and terminal sessions resolve to one stable version.
30
+
31
+
For Node, Xcode, Android SDK, macOS, and iOS device prerequisites, see [Installation](/docs/installation).
32
+
33
+
## Install the skill
34
+
35
+
Install the skill when your agent runtime supports skills:
36
+
37
+
```bash
npx skills add callstackincubator/agent-device
```
40
+
41
+
The bundled [agent-device skill](https://github.com/callstackincubator/agent-device/blob/main/skills/agent-device/SKILL.md) is the canonical router for skill-aware clients. It intentionally points agents back to installed CLI help instead of duplicating the command manual.
42
+
43
+
## Recommended agent rule
44
+
45
+
Add this as a project rule, custom instruction, or skill equivalent when your agent client supports it:
46
+
47
+
```text
Use agent-device only for app/device automation tasks. Before planning commands, run `agent-device --version` and read `agent-device help workflow`. For exploratory QA, read `agent-device help dogfood`. For logs, network, traces, or runtime failures, read `agent-device help debugging`. For React Native component trees, props/state/hooks, slow renders, or rerenders, read `agent-device help react-devtools`.

Use the CLI in the integrated terminal. MCP is only a discovery/help router and does not expose device automation tools. Prefer `open -> snapshot -i -> act -> re-snapshot -> verify -> close`. Use current refs such as `@e3` for exploration and selectors for durable replay. Keep mutating commands against one session serial. Capture screenshots, logs, network, perf, traces, recordings, and `.ad` replay scripts only when they add evidence.
```
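The `open -> snapshot -i -> act -> re-snapshot -> verify -> close` loop from the rule can be sketched as a small shell function. Everything named here is illustrative: `ad` and `run_loop` are hypothetical wrappers, `Settings` is a placeholder target, and the interaction command for the "act" step is deliberately left out because its exact syntax should come from `agent-device help workflow`. It is also an assumption that `--session` can prefix `snapshot` and `close` the same way it prefixes `open` and `screenshot`.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: runs agent-device when installed, otherwise (or with
# DRY_RUN=1) prints the command that would have been run.
ad() {
  if [ "${DRY_RUN:-0}" != "1" ] && command -v agent-device >/dev/null 2>&1; then
    agent-device "$@"
  else
    printf 'agent-device %s\n' "$*"
  fi
}

# Hypothetical helper: one pass through the loop against a single named session.
run_loop() {
  local session="$1" target="$2" platform="$3" shot="$4"
  ad --session "$session" open "$target" --platform "$platform"
  ad --session "$session" snapshot -i        # read refs such as @e3
  # act: issue the interaction command here, using a ref from the snapshot above
  ad --session "$session" snapshot -i        # re-snapshot after the mutation
  ad --session "$session" screenshot "$shot" # verify with evidence
  ad --session "$session" close
}

# Print the planned commands without touching a device.
DRY_RUN=1 run_loop demo "Settings" ios ./home.png
```

Commands run one after another against one session, which matches the rule's guidance to keep mutating commands serial.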
52
+
53
+
## MCP router
54
+
55
+
`agent-device mcp` starts the official stdio MCP router for discovery-oriented clients. It exposes only `status`, `install`, and `help` tools plus workflow prompts/resources. Device automation still runs through the CLI commands returned by version-matched help.
56
+
57
+
Global install configuration:
58
+
59
+
```json
{
  "mcpServers": {
    "agent-device": {
      "command": "agent-device",
      "args": ["mcp"]
    }
  }
}
```
69
+
70
+
No global install variant:
71
+
72
+
```json
{
  "mcpServers": {
    "agent-device": {
      "command": "npx",
      "args": ["-y", "agent-device@latest", "mcp"]
    }
  }
}
```
82
+
83
+
Registry metadata uses MCP name `io.github.callstackincubator/agent-device`, npm package `agent-device`, stdio transport, `mcpName` package verification, `server.json`, and `smithery.yaml`.
84
+
85
+
## Cursor
86
+
87
+
Use Agent mode with the integrated terminal. Add the recommended rule above as a project rule, then run:
88
+
89
+
```bash
agent-device help workflow
agent-device apps --platform ios
agent-device open <app-or-url> --platform ios
agent-device snapshot -i
```
95
+
96
+
Optional: paste the [MCP router](#mcp-router) configuration into `.cursor/mcp.json`.
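If you prefer to script that step, a minimal sketch follows. `write_cursor_mcp_config` is a hypothetical helper, not something shipped with `agent-device`; it writes the global-install configuration shown in the MCP router section and refuses to overwrite an existing file so other MCP servers are not lost.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write the global-install MCP router config for a Cursor project.
write_cursor_mcp_config() {
  local project_dir="${1:-.}"
  local config="$project_dir/.cursor/mcp.json"
  if [ -e "$config" ]; then
    echo "$config already exists; merge the agent-device entry by hand" >&2
    return 1
  fi
  mkdir -p "$project_dir/.cursor"
  cat > "$config" <<'JSON'
{
  "mcpServers": {
    "agent-device": {
      "command": "agent-device",
      "args": ["mcp"]
    }
  }
}
JSON
}
```

Call it from the project root with `write_cursor_mcp_config .`. For the no-global-install variant, swap the command for `npx` and the args for `["-y", "agent-device@latest", "mcp"]`.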
97
+
98
+
## Codex
99
+
100
+
Put the recommended rule in `AGENTS.md` or the project instructions. Let Codex run `agent-device` in the terminal:
101
+
102
+
```bash
agent-device help workflow
agent-device boot --platform ios
agent-device open <app-or-url> --platform ios
agent-device snapshot -i
```
108
+
109
+
For reviews or planning-only tasks, tell the agent not to run devices unless explicitly requested.
110
+
111
+
## Claude Code
112
+
113
+
Use the bundled skill when your Claude setup supports skills. Otherwise put the recommended rule in `CLAUDE.md`.
114
+
115
+
```bash
agent-device --version
agent-device help workflow
agent-device help dogfood
```
120
+
121
+
If you configure MCP, keep using CLI commands for automation. The MCP router gives Claude install/status/help context only.
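For clients that read a plain instructions file such as `CLAUDE.md` or `AGENTS.md`, the rule can be appended once and left alone on later runs. The sketch below is an assumption about how a team might automate that: `add_agent_rule` is a hypothetical helper, and the appended text is a shortened excerpt of the recommended rule, not a replacement for it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append a short agent-device rule to an instructions file,
# skipping the write when the rule's first sentence is already present.
add_agent_rule() {
  local file="$1"
  local marker='Use agent-device only for app/device automation tasks.'
  if [ -f "$file" ] && grep -qF "$marker" "$file"; then
    return 0  # rule already present
  fi
  cat >> "$file" <<'RULE'

Use agent-device only for app/device automation tasks. Before planning commands, run `agent-device --version` and read `agent-device help workflow`.
Use the CLI in the integrated terminal. MCP is only a discovery/help router and does not expose device automation tools.
RULE
}
```

Run it as `add_agent_rule CLAUDE.md` from the project root; repeating the call leaves the file unchanged.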
122
+
123
+
## Windsurf, Cline, Goose, and other MCP clients
124
+
125
+
Use the [MCP router](#mcp-router) configuration when the client supports `mcpServers`, then tell the agent to run device commands through the terminal.
126
+
127
+
If the client has project rules or custom instructions, add the recommended agent rule above. If it does not, start the conversation by asking the agent to run `agent-device help workflow` before planning.
128
+
129
+
## Why this setup works
130
+
131
+
The CLI stays the auditable automation surface, installed help stays version-matched with the commands, skills and rules route agents toward the right help topics, and MCP gives discovery-oriented clients a small install/status/help entry point.
132
+
133
+
For the broader positioning, supported targets, observability features, and how `agent-device` differs from scripted test frameworks, see [Introduction](/docs/introduction). For exact command groups and platform behavior, see [Commands](/docs/commands).
134
+
135
+
For the local execution model, permissions, artifacts, and sensitive data guidance, see [Security & Trust](/docs/security-trust).
136
+
137
+
## Agent-readable docs
138
+
139
+
Use [llms-full.txt](https://incubator.callstack.com/agent-device/llms-full.txt) when an agent needs a single text bundle of the current docs. The installed CLI remains authoritative for exact command syntax: