Commit 2ebb9aa

docs: improve agent discovery onboarding (#496)
1 parent 600e956 commit 2ebb9aa

11 files changed

Lines changed: 490 additions & 128 deletions

README.md

Lines changed: 102 additions & 9 deletions
Original file line number | Diff line number | Diff line change
@@ -16,23 +16,33 @@ Device automation CLI for AI agents. Mobile, TV, and desktop apps.
1616

1717
`agent-device` lets coding agents run real apps, inspect UI state, interact with visible elements, and collect debugging evidence through one CLI.
1818

19-
It is built around token-efficient accessibility snapshots, not pixel-first screenshots. Agents read compact UI trees, locate elements through refs like `@e3`, perform touch and text actions, and capture screenshots, video, logs, network, perf, and React profiles only when evidence is needed.
19+
It is built around token-efficient accessibility snapshots, not pixel-first screenshots. Agents read compact UI trees, locate elements through refs like `@e3`, perform touch and text actions, and capture screenshots, video, logs, network, CPU/memory/perf, crash-related logs, and React profiles only when evidence is needed.
20+
21+
Agents can ingest the current docs from [llms-full.txt](https://incubator.callstack.com/agent-device/llms-full.txt). The installed CLI help remains authoritative for exact command syntax.
2022

2123
## Agentic QA And Development
2224

23-
- **Quality Assurance**: dogfood flows, validate PR builds, check accessibility coverage, capture evidence, and turn stable explorations into `.ad` e2e tests.
24-
- **Development**: build from specs, reproduce crashes and support issues, inspect logs/network/perf data, and iterate until the UI matches the work.
25+
- **Quality Assurance**: dogfood flows, validate PR builds, check accessibility coverage, and turn stable explorations into `.ad` e2e tests.
26+
- **Development**: build from specs, inspect real runtime behavior, and iterate until the UI matches the work.
27+
28+
`agent-device` closes the agentic development loop: agents can write code, run the real app, verify the UI end-to-end, collect screenshots/videos/logs/perf evidence, and feed bugs, crashes, or performance findings back into the next fix iteration before a human reviews the PR.
29+
30+
![Sketch showing agent-device as the live app verification layer in the agentic development loop](./website/docs/public/agentic-development-loop.svg)
2531

2632
If you know Vercel's [agent-browser](https://github.com/vercel-labs/agent-browser), this is the same idea for apps and devices.
2733

28-
![agent-device demo showing an agent inspecting and interacting with a contacts app](./website/docs/public/agent-device-contacts.gif)
34+
Use it for AI mobile testing, AI QA for React Native and Expo apps, iOS Simulator automation, Android Emulator automation, tvOS/Android TV checks, and desktop app verification from coding agents. Humans install and configure `agent-device`; agents run the workflows.
35+
36+
![agent-device demo showing Codex using agent-device to create a new contact in the iOS Contacts app from a simple prompt](./website/docs/public/agent-device-contacts.gif)
37+
38+
Demo: Codex uses `agent-device` to inspect iOS Contacts through accessibility snapshots, interact with visible UI, and create a contact from a simple prompt.
2939

3040
## Quick Start
3141

3242
Install the CLI first:
3343

3444
```bash
35-
npm install -g agent-device
45+
npm install -g agent-device@latest
3646
agent-device --version
3747
agent-device help workflow
3848
```
@@ -41,9 +51,22 @@ The CLI help is the source of truth for agents and is shipped with the installed
4151

4252
If you install skills separately, keep the CLI on `agent-device >= 0.14.0`. Older CLIs do not include the workflow help topics that the router skills expect.
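The version floor above can be checked before an agent starts planning. The sketch below is one possible approach, not part of the project: `version_at_least` is a hypothetical helper, and it assumes `agent-device --version` prints a plain version string such as `0.14.7` and that `sort -V` is available.

```bash
# Minimum CLI version the router skills expect, per the note above.
MIN_VERSION="0.14.0"

# version_at_least CURRENT MINIMUM (hypothetical helper)
# Succeeds when CURRENT >= MINIMUM, using version-aware `sort -V`.
version_at_least() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Assumption: `agent-device --version` prints a plain version such as 0.14.7.
# Adjust the parsing if your installed CLI prints extra text.
if command -v agent-device >/dev/null 2>&1; then
  installed="$(agent-device --version | grep -Eo '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1)"
  if version_at_least "$installed" "$MIN_VERSION"; then
    echo "agent-device $installed is new enough"
  else
    echo "agent-device $installed is older than $MIN_VERSION; upgrade with: npm install -g agent-device@latest"
  fi
else
  echo "agent-device not found on PATH"
fi
```

Comparing with `sort -V` avoids the string-comparison trap where `0.9.0` would look newer than `0.14.0`.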
4353

54+
### AI Agent Entry Points
55+
56+
- **Agent + terminal**: in Cursor, Codex, Claude Code, Windsurf, and similar clients, run `agent-device` in the integrated terminal. Start planning with `agent-device help workflow`; CLI help is authoritative.
57+
- **Skills or rules**: install the skill with `npx skills add callstackincubator/agent-device`, use the bundled [agent-device skill](skills/agent-device/SKILL.md), or mirror it as a thin project rule, so the agent checks the installed version and reads `agent-device help workflow` before acting.
58+
- **MCP router**: use `agent-device mcp` when an MCP-aware client needs install, status, and version-matched help discovery. MCP is intentionally a thin router; device automation still runs through CLI commands.
59+
60+
For client-specific setup, see [AI Agent Setup](https://incubator.callstack.com/agent-device/docs/agent-setup). For agent-readable docs, use [llms-full.txt](https://incubator.callstack.com/agent-device/llms-full.txt).
61+
4462
### MCP Router
4563

46-
`agent-device` also ships an official stdio MCP router for discovery-oriented clients. It exposes only `status`, `install`, and `help` tools plus workflow prompts/resources; device automation still runs through the CLI commands returned by version-matched help.
64+
`agent-device` ships an official stdio MCP router for discovery-oriented clients. It exposes only `status`, `install`, and `help` tools plus workflow prompts/resources; it does not expose device automation or generic shell execution over MCP.
65+
66+
Paste one of these into clients that accept `mcpServers`, such as Cursor project `.cursor/mcp.json` or user-level MCP settings.
67+
68+
<details>
69+
<summary>Global install MCP config</summary>
4770

4871
```json
4972
{
@@ -56,6 +79,24 @@ If you install skills separately, keep the CLI on `agent-device >= 0.14.0`. Olde
5679
}
5780
```
5881

82+
</details>
83+
84+
<details>
85+
<summary>No global install MCP config</summary>
86+
87+
```json
88+
{
89+
"mcpServers": {
90+
"agent-device": {
91+
"command": "npx",
92+
"args": ["-y", "agent-device@latest", "mcp"]
93+
}
94+
}
95+
}
96+
```
97+
98+
</details>
99+
59100
Registry metadata uses MCP name `io.github.callstackincubator/agent-device`, npm package `agent-device`, stdio transport, `mcpName` package verification, `server.json`, and `smithery.yaml`.
60101

61102
```bash
@@ -91,20 +132,69 @@ agent-device close
91132

92133
Snapshots assign refs like `@e1`, `@e2`, and `@e3` to current-screen elements. Refs from the default snapshot are immediately actionable; for hidden content, scroll and re-snapshot.
93134

135+
### First 5 Minutes: Expo Test App
136+
137+
Use the bundled Expo fixture when you want a concrete first agent run with setup checks, screenshots, replay, and performance evidence. This path requires a repo checkout because `examples/test-app` and the `pnpm test-app:*` scripts are not included in the published npm package.
138+
139+
```bash
140+
git clone https://github.com/callstackincubator/agent-device.git
141+
cd agent-device
142+
```
143+
144+
First terminal:
145+
146+
```bash
147+
pnpm test-app:install
148+
cd examples/test-app
149+
npx expo-doctor@latest
150+
cd ../..
151+
pnpm test-app:ios
152+
# or: pnpm test-app:android
153+
```
154+
155+
Then give your agent this prompt:
156+
157+
```text
158+
Use agent-device to dogfood the bundled Expo app and produce an evidence-backed report.
159+
160+
Setup:
161+
- Read `agent-device help workflow`, `agent-device help dogfood`, `agent-device help debugging`, and `agent-device help react-devtools` before planning commands.
162+
- Confirm the test app setup commands were run: `pnpm test-app:install`, `cd examples/test-app && npx expo-doctor@latest`, then `pnpm test-app:ios` or `pnpm test-app:android`.
163+
- If Metro prints an Expo URL, prefer opening the shell with that URL. On iOS use `agent-device open "Expo Go" <url> --platform ios`; on Android use the visible Expo/dev-client target or URL. Confirm the app UI with `snapshot -i`.
164+
165+
Run:
166+
- Create `./dogfood-output/screenshots`, `./dogfood-output/videos`, `./dogfood-output/traces`, `./dogfood-output/perf`, and `./dogfood-output/replays`.
167+
- Open a named session `expo-qa` and save a replay script to `./dogfood-output/replays/expo-test.ad`.
168+
- Use command shapes like `agent-device --session expo-qa open "Expo Go" <url> --platform ios --save-script ./dogfood-output/replays/expo-test.ad`, `agent-device --session expo-qa screenshot ./dogfood-output/screenshots/home.png`, `agent-device --session expo-qa perf --json > ./dogfood-output/perf/baseline.json`, and `agent-device --session expo-qa record start ./dogfood-output/videos/checkout.mp4`.
169+
- Capture a baseline `snapshot -i`, screenshot, and `perf --json` sample.
170+
- Exercise Home, Catalog, product detail, Checkout, and Settings. Re-snapshot after each mutation and use refs/selectors from fresh snapshots.
171+
- Capture at least one overlay-ref screenshot, one normal screenshot, one short video recording for a meaningful flow, logs marks around any issue, and trace output if a runtime symptom needs diagnostics.
172+
- Run focused performance checks: compare `perf --json` before and after a navigation or form flow; if React DevTools connects, capture profile slow/rerender output. If it cannot connect, include the status and continue.
173+
- Close the session so the `.ad` replay is written.
174+
175+
Report:
176+
- Write `./dogfood-output/report.md`.
177+
- Link every screenshot, video, trace, log path, replay file, and performance artifact you used.
178+
- Include setup results, platform/device, Expo doctor outcome, coverage, severity counts, findings with repro commands, and a short performance section summarizing startup/CPU/memory/frame-health or React profile findings.
179+
- If no issues are found, report covered flows and residual risk instead of claiming the app is bug-free.
180+
```
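If you prefer to prepare the artifact folders yourself before handing the prompt to the agent, a minimal sketch follows. It only uses the directory names listed in the prompt; `DOGFOOD_ROOT` is a hypothetical variable introduced here for illustration.

```bash
# Root folder named in the prompt; override by exporting DOGFOOD_ROOT (hypothetical variable).
DOGFOOD_ROOT="${DOGFOOD_ROOT:-./dogfood-output}"

# Create every artifact folder the prompt lists. `mkdir -p` is safe to re-run.
for dir in screenshots videos traces perf replays; do
  mkdir -p "$DOGFOOD_ROOT/$dir"
done

# After the agent finishes, list what it produced so the report links can be checked.
find "$DOGFOOD_ROOT" -type f | sort
```

On a fresh checkout the final `find` prints nothing; after a run it should list the screenshots, recordings, perf samples, replay file, and `report.md` that the report links to.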
181+
94182
## Where To Run agent-device
95183

96184
| Path | Best for | Start with |
97185
| --- | --- | --- |
98186
| Local | Exploration, debugging, and development loops on simulators, emulators, physical devices, macOS apps, and Linux desktop targets. | Follow the Quick Start. |
99187
| CI/CD | Automated PR and merge validation with replay scripts and captured artifacts. | Start with the [EAS workflow template](https://github.com/callstackincubator/eas-agent-device/blob/main/.eas/workflows/agent-qa-mobile.yml). GitHub Actions template coming soon. |
100-
| Cloud | Linux runners, managed devices, and remote execution. | Use [Agent Device Cloud](https://agent-device.dev/cloud) or [contact Callstack](mailto:hello@callstack.com) for team-scale QA. |
188+
| Cloud / remote execution | Linux runners, managed devices, and remote execution. | Use [Agent Device Cloud](https://agent-device.dev/cloud), see [Commands](https://incubator.callstack.com/agent-device/docs/commands) for remote profiles, or [contact Callstack](mailto:hello@callstack.com) for team-scale QA. |
101189

102190
## Capabilities
103191

104192
- **Platforms**: iOS, Android, tvOS, Android TV, macOS, and Linux. Real devices and simulators are supported.
105-
- **Capture**: screenshots, video, logs, network traffic, performance data, accessibility snapshots, and React render profiles.
193+
- **Agent-native UI model**: token-efficient accessibility snapshots, current-screen refs for exploration, selectors for durable replay, and skill-tested workflow guidance.
194+
- **Capture and debug**: screenshots, video, logs, network traffic, CPU/memory/performance data, crash-related logs, accessibility snapshots, and React render profiles.
106195
- **Produce**: replayable `.ad` scripts (recorded replay files that run locally or in CI), e2e test runs, snapshot and screenshot diffs, and debugging artifacts.
107196
- **React Native and Expo**: component tree inspection, props/state/hooks, and render profiling.
197+
- **MCP boundary**: discovery and help over MCP; app/device control through the CLI for explicit, auditable commands.
108198
- **License**: MIT. Free to use.
109199

110200
## How It Works
@@ -120,10 +210,13 @@ Used by teams and developers at Callstack, Expensify, Shopify, Kindred, Total Wi
120210
## Documentation
121211

122212
- [Installation](https://incubator.callstack.com/agent-device/docs/installation)
213+
- [AI Agent Setup](https://incubator.callstack.com/agent-device/docs/agent-setup)
123214
- [Typed Client](https://incubator.callstack.com/agent-device/docs/client-api)
124215
- [Commands](https://incubator.callstack.com/agent-device/docs/commands)
125216
- [Replay & E2E](https://incubator.callstack.com/agent-device/docs/replay-e2e)
217+
- [Security & Trust](https://incubator.callstack.com/agent-device/docs/security-trust)
126218
- [Known limitations](https://incubator.callstack.com/agent-device/docs/known-limitations)
219+
- [llms-full.txt](https://incubator.callstack.com/agent-device/llms-full.txt)
127220

128221
Agent integration:
129222

@@ -139,4 +232,4 @@ See [CONTRIBUTING.md](CONTRIBUTING.md).
139232

140233
## Made at Callstack
141234

142-
agent-device is open source and MIT licensed. Try the [EAS workflow template](https://github.com/callstackincubator/eas-agent-device/blob/main/.eas/workflows/agent-qa-mobile.yml), use [Agent Device Cloud](https://agent-device.dev/cloud), or contact us at hello@callstack.com.
235+
agent-device is open source and MIT licensed. Visit [agent-device.dev](https://agent-device.dev/), try the [EAS workflow template](https://github.com/callstackincubator/eas-agent-device/blob/main/.eas/workflows/agent-qa-mobile.yml), read the [incubator docs](https://incubator.callstack.com/agent-device/), or contact us at hello@callstack.com.

package.json

Lines changed: 21 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,7 @@
11
{
22
"name": "agent-device",
33
"version": "0.14.7",
4-
"description": "Agent-driven CLI for mobile UI automation, network inspection, and performance diagnostics across iOS, Android, tvOS, and macOS.",
4+
"description": "Agent-native CLI for AI mobile testing and app automation across iOS, Android, tvOS, Android TV, macOS, and Linux.",
55
"mcpName": "io.github.callstackincubator/agent-device",
66
"license": "MIT",
77
"author": "Callstack",
@@ -160,7 +160,26 @@
160160
"performance",
161161
"mcp",
162162
"model-context-protocol",
163-
"mcp-server"
163+
"mcp-server",
164+
"ai-agent",
165+
"mobile-automation",
166+
"ios-simulator",
167+
"android-emulator",
168+
"xcuitest",
169+
"e2e-testing",
170+
"cursor",
171+
"claude-code",
172+
"expo",
173+
"mobile-testing",
174+
"qa-automation",
175+
"ai-testing",
176+
"ios-automation",
177+
"android-automation",
178+
"simulator",
179+
"emulator",
180+
"appium",
181+
"maestro",
182+
"detox"
164183
],
165184
"dependencies": {
166185
"fast-xml-parser": "^5.7.2",

website/docs/docs/_meta.json

Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -9,6 +9,11 @@
99
"type": "file",
1010
"label": "Installation"
1111
},
12+
{
13+
"name": "agent-setup",
14+
"type": "file",
15+
"label": "AI Agent Setup"
16+
},
1217
{
1318
"name": "quick-start",
1419
"type": "file",
@@ -34,6 +39,11 @@
3439
"type": "file",
3540
"label": "Configuration"
3641
},
42+
{
43+
"name": "security-trust",
44+
"type": "file",
45+
"label": "Security & Trust"
46+
},
3747
{
3848
"name": "batching",
3949
"type": "file",

website/docs/docs/agent-setup.md

Lines changed: 145 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,145 @@
1+
---
2+
title: AI Agent Setup
3+
description: Configure Cursor, Codex, Claude Code, Windsurf, Cline, Goose, skills, and MCP for agent-device mobile, TV, and desktop app verification.
4+
---
5+
6+
# AI Agent Setup
7+
8+
`agent-device` is built for AI agents, but humans usually install it, grant device permissions, and decide which agent client should use it.
9+
10+
Use this page to wire Cursor, Codex, Claude Code, Windsurf, Cline, Goose, or another coding agent into mobile, TV, and desktop app verification. It covers skills, project rules, and MCP setup for React Native QA, Expo app verification, iOS Simulator automation, Android Emulator automation, tvOS checks, Android TV checks, debugging, profiling, and exploratory QA.
11+
12+
The short version: install the CLI, make the agent read version-matched help, and let the agent run CLI commands in a terminal. MCP is available for discovery and help, not broad device control.
13+
14+
## Prerequisite: install the CLI
15+
16+
```bash
17+
npm install -g agent-device@latest
18+
agent-device --version
19+
agent-device help workflow
20+
```
21+
22+
For one-off use without a global install:
23+
24+
```bash
25+
npx -y agent-device@latest --version
26+
npx -y agent-device@latest help workflow
27+
```
28+
29+
Global install is better for normal agent workflows because repeated commands, skills, and terminal sessions resolve to one stable version.
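A script that must work on machines with or without the global install could pick the launcher at runtime. This is a sketch under that assumption; `agent_device_cmd` is a hypothetical helper name, and both launch forms are the ones shown above.

```bash
# Print the command prefix to use: the global binary when it is on PATH,
# otherwise the one-off npx form shown above.
agent_device_cmd() {
  if command -v agent-device >/dev/null 2>&1; then
    printf '%s\n' "agent-device"
  else
    printf '%s\n' "npx -y agent-device@latest"
  fi
}

echo "Resolved launcher: $(agent_device_cmd)"
```

Call it unquoted so the npx form splits into words, for example `$(agent_device_cmd) help workflow`.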
30+
31+
For Node, Xcode, Android SDK, macOS, and iOS device prerequisites, see [Installation](/docs/installation).
32+
33+
## Install the skill
34+
35+
Install the skill when your agent runtime supports skills:
36+
37+
```bash
38+
npx skills add callstackincubator/agent-device
39+
```
40+
41+
The bundled [agent-device skill](https://github.com/callstackincubator/agent-device/blob/main/skills/agent-device/SKILL.md) is the canonical router for skill-aware clients. It intentionally points agents back to installed CLI help instead of duplicating the command manual.
42+
43+
## Recommended agent rule
44+
45+
Add this as a project rule, custom instruction, or skill equivalent when your agent client supports it:
46+
47+
```text
48+
Use agent-device only for app/device automation tasks. Before planning commands, run `agent-device --version` and read `agent-device help workflow`. For exploratory QA, read `agent-device help dogfood`. For logs, network, traces, or runtime failures, read `agent-device help debugging`. For React Native component trees, props/state/hooks, slow renders, or rerenders, read `agent-device help react-devtools`.
49+
50+
Use the CLI in the integrated terminal. MCP is only a discovery/help router and does not expose device automation tools. Prefer `open -> snapshot -i -> act -> re-snapshot -> verify -> close`. Use current refs such as `@e3` for exploration and selectors for durable replay. Run mutating commands against a session one at a time, never in parallel. Capture screenshots, logs, network, perf, traces, recordings, and `.ad` replay scripts only when they add evidence.
51+
```
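The `open -> snapshot -i -> act -> re-snapshot -> verify -> close` order in the rule can be sketched as a dry-run script. Everything here is illustrative: `ad`, `run_loop`, `DRY_RUN`, and the session name `demo` are hypothetical, and the act and verify steps are left as comments because the exact interaction commands should come from `agent-device help workflow`.

```bash
# DRY_RUN=1 (the default here) prints each command instead of running it,
# so the sketch is safe to try without a device attached.
DRY_RUN="${DRY_RUN:-1}"
SESSION="demo"            # hypothetical session name
TARGET="<app-or-url>"     # placeholder, as in the examples on this page

# Wrapper that routes every command through one named session.
ad() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ agent-device --session %s %s\n' "$SESSION" "$*"
  else
    agent-device --session "$SESSION" "$@"
  fi
}

run_loop() {
  ad open "$TARGET" --platform ios   # open
  ad snapshot -i                     # snapshot: read refs such as @e3
  # act: run the interaction command documented by `agent-device help workflow`
  ad snapshot -i                     # re-snapshot: refs describe the current screen only
  # verify: compare the fresh snapshot with the expected UI state
  ad close                           # close
}

run_loop
```

Keeping every call behind one wrapper makes it easy to hold all mutating commands on a single session and in a fixed order.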
52+
53+
## MCP router
54+
55+
`agent-device mcp` starts the official stdio MCP router for discovery-oriented clients. It exposes only `status`, `install`, and `help` tools plus workflow prompts/resources. Device automation still runs through the CLI commands returned by version-matched help.
56+
57+
Global install configuration:
58+
59+
```json
60+
{
61+
"mcpServers": {
62+
"agent-device": {
63+
"command": "agent-device",
64+
"args": ["mcp"]
65+
}
66+
}
67+
}
68+
```
69+
70+
No global install variant:
71+
72+
```json
73+
{
74+
"mcpServers": {
75+
"agent-device": {
76+
"command": "npx",
77+
"args": ["-y", "agent-device@latest", "mcp"]
78+
}
79+
}
80+
}
81+
```
82+
83+
Registry metadata uses MCP name `io.github.callstackincubator/agent-device`, npm package `agent-device`, stdio transport, `mcpName` package verification, `server.json`, and `smithery.yaml`.
84+
85+
## Cursor
86+
87+
Use Agent mode with the integrated terminal. Add the recommended rule above as a project rule, then run:
88+
89+
```bash
90+
agent-device help workflow
91+
agent-device apps --platform ios
92+
agent-device open <app-or-url> --platform ios
93+
agent-device snapshot -i
94+
```
95+
96+
Optional: paste the [MCP router](#mcp-router) configuration into `.cursor/mcp.json`.
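Instead of pasting by hand, the file can be generated. The following is a minimal sketch, assuming the global-install configuration from this page; `write_mcp_config` is a hypothetical helper, and it overwrites any existing `.cursor/mcp.json`, so merge manually if the project already registers other servers.

```bash
# write_mcp_config PROJECT_DIR (hypothetical helper)
# Writes PROJECT_DIR/.cursor/mcp.json with the global-install router entry.
# Caution: this overwrites an existing file.
write_mcp_config() {
  project_dir="${1:-.}"
  mkdir -p "$project_dir/.cursor"
  cat > "$project_dir/.cursor/mcp.json" <<'JSON'
{
  "mcpServers": {
    "agent-device": {
      "command": "agent-device",
      "args": ["mcp"]
    }
  }
}
JSON
}

# Example: write into a scratch directory first and inspect the result.
scratch="$(mktemp -d)"
write_mcp_config "$scratch"
cat "$scratch/.cursor/mcp.json"
```

Once the output looks right, run `write_mcp_config .` from the project root.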
97+
98+
## Codex
99+
100+
Put the recommended rule in `AGENTS.md` or the project instructions. Let Codex run `agent-device` in the terminal:
101+
102+
```bash
103+
agent-device help workflow
104+
agent-device boot --platform ios
105+
agent-device open <app-or-url> --platform ios
106+
agent-device snapshot -i
107+
```
108+
109+
For reviews or planning-only tasks, tell the agent not to run devices unless explicitly requested.
110+
111+
## Claude Code
112+
113+
Use the bundled skill when your Claude setup supports skills. Otherwise put the recommended rule in `CLAUDE.md`.
114+
115+
```bash
116+
agent-device --version
117+
agent-device help workflow
118+
agent-device help dogfood
119+
```
120+
121+
If you configure MCP, keep using CLI commands for automation. The MCP router gives Claude install/status/help context only.
122+
123+
## Windsurf, Cline, Goose, and other MCP clients
124+
125+
Use the [MCP router](#mcp-router) configuration when the client supports `mcpServers`, then tell the agent to run device commands through the terminal.
126+
127+
If the client has project rules or custom instructions, add the recommended agent rule above. If it does not, start the conversation by asking the agent to run `agent-device help workflow` before planning.
128+
129+
## Why this setup works
130+
131+
The CLI stays the auditable automation surface, installed help stays version-matched with the commands, skills and rules route agents toward the right help topics, and MCP gives discovery-oriented clients a small install/status/help entry point.
132+
133+
For the broader positioning, supported targets, observability features, and how `agent-device` differs from scripted test frameworks, see [Introduction](/docs/introduction). For exact command groups and platform behavior, see [Commands](/docs/commands).
134+
135+
For the local execution model, permissions, artifacts, and sensitive data guidance, see [Security & Trust](/docs/security-trust).
136+
137+
## Agent-readable docs
138+
139+
Use [llms-full.txt](https://incubator.callstack.com/agent-device/llms-full.txt) when an agent needs a single text bundle of the current docs. The installed CLI remains authoritative for exact command syntax:
140+
141+
```bash
142+
agent-device help
143+
agent-device help workflow
144+
agent-device help dogfood
145+
```
