
Commit 06a8778

Enable iOS device core parity and align docs/skills (#55)
* Add iOS device runner support and document platform matrix
* ios: make signing identity override-only
* ios: address PR55 review follow-ups
* docs: remove v1 phrasing from support messages
* ios appstate: prefer xctest before ax fallback
* docs: clarify ios runner derived path override behavior
* ios: harden runner polling and appstate fallback
* ios: remove automatic AX fallback paths
* docs: clarify no automatic AX fallback
* ios: replace ad-hoc runner poll sleep with retry policy
* Harden iOS runner timeouts and cleanup safety
* args: align scrollintoview schema with capabilities
* test: skip iOS integration flows on CI by default
1 parent: 4e91840. Commit: 06a8778.

31 files changed

Lines changed: 979 additions & 190 deletions

README.md

Lines changed: 21 additions & 13 deletions
Original file line number | Diff line number | Diff line change
@@ -13,8 +13,8 @@ CLI to control iOS and Android devices for AI agents influenced by Vercel’s [a
1313
The project is in early development and considered experimental. Pull requests are welcome!
1414

1515
## Features
16-
- Platforms: iOS (simulator + limited device support) and Android (emulator + device).
17-
- Core commands: `open`, `back`, `home`, `app-switcher`, `press`, `long-press`, `swipe`, `focus`, `type`, `fill`, `scroll`, `scrollintoview`, `pinch`, `wait`, `alert`, `screenshot`, `close`, `reinstall`.
16+
- Platforms: iOS (simulator + physical device core automation) and Android (emulator + device).
17+
- Core commands: `open`, `back`, `home`, `app-switcher`, `press`, `long-press`, `focus`, `type`, `fill`, `scroll`, `scrollintoview`, `wait`, `alert`, `screenshot`, `close`, `reinstall`.
1818
- Inspection commands: `snapshot` (accessibility tree).
1919
- Device tooling: `adb` (Android), `simctl`/`devicectl` (iOS via Xcode).
2020
- Minimal dependencies; TypeScript executed directly on Node 22+ (no build step).
@@ -99,9 +99,10 @@ agent-device swipe 540 1500 540 500 120 --count 8 --pause-ms 30 --pattern ping-p
9999
| `ax` | Fast | Medium | Accessibility permission for the terminal app, not recommended |
100100

101101
Notes:
102-
- Default backend is `xctest` on iOS.
102+
- Default backend is `xctest` on iOS simulators and iOS devices.
103103
- Scope snapshots with `-s "<label>"` or `-s @ref`.
104-
- If XCTest returns 0 nodes (e.g., foreground app changed), agent-device falls back to AX when available.
104+
- If XCTest returns 0 nodes (e.g., foreground app changed), agent-device fails explicitly.
105+
- `ax` backend is simulator-only.
105106

106107
Flags:
107108
- `--version, -V` print version and exit
@@ -150,13 +151,13 @@ Navigation helpers:
150151
- `boot --platform ios|android` ensures the target is ready without launching an app.
151152
- Use `boot` mainly when starting a new session and `open` fails because no booted simulator/emulator is available.
152153
- `open [app|url]` already boots/activates the selected target when needed.
153-
- `reinstall <app> <path>` uninstalls and installs the app binary in one command (Android + iOS simulator in v1).
154+
- `reinstall <app> <path>` uninstalls and installs the app binary in one command (Android + iOS simulator).
154155
- `reinstall` accepts package/bundle id style app names and supports `~` in paths.
155156

156157
Deep links:
157158
- `open <url>` supports deep links with `scheme://...`.
158159
- Android opens deep links via `VIEW` intent.
159-
- iOS deep link open is simulator-only in v1.
160+
- iOS deep link open is simulator-only.
160161
- `--activity` cannot be combined with URL opens.
161162

162163
```bash
@@ -207,22 +208,22 @@ Android fill reliability:
207208
- If value does not match, agent-device clears the field and retries once with slower typing.
208209
- This reduces IME-related character swaps on long strings (e.g. emails and IDs).
209210

210-
Settings helpers (simulators):
211+
Settings helpers:
211212
- `settings wifi on|off`
212213
- `settings airplane on|off`
213214
- `settings location on|off` (iOS uses per-app permission for the current session app)
214-
Note: iOS wifi/airplane toggles status bar indicators, not actual network state. Airplane off clears status bar overrides.
215+
Note: iOS supports these only on simulators. iOS wifi/airplane toggles status bar indicators, not actual network state. Airplane off clears status bar overrides.
215216

216217
App state:
217-
- `appstate` shows the foreground app/activity (Android). On iOS it uses the current session app when available, otherwise it falls back to a snapshot-based guess (AX first, XCTest if AX can’t identify).
218+
- `appstate` shows the foreground app/activity (Android). On iOS it uses the current session app when available, otherwise it resolves via XCTest snapshot.
218219
- `apps --metadata` returns app list with minimal metadata.
219220

220221
## Debug
221222

222223
- `agent-device trace start`
223224
- `agent-device trace stop ./trace.log`
224225
- The trace log includes snapshot logs and XCTest runner logs for the session.
225-
- Built-in retries cover transient runner connection failures, AX snapshot hiccups, and Android UI dumps.
226+
- Built-in retries cover transient runner connection failures and Android UI dumps.
226227
- For snapshot issues (missing elements), compare with `--raw` flag for unaltered output and scope with `-s "<label>"`.
227228

228229
Boot diagnostics:
@@ -238,9 +239,10 @@ Boot diagnostics:
238239
- Built-in aliases include `Settings` for both platforms.
239240

240241
## iOS notes
241-
- Input commands (`press`, `type`, `scroll`, etc.) are supported only on simulators in v1 and use the XCTest runner.
242-
- `alert` and `scrollintoview` use the XCTest runner and are simulator-only in v1.
243-
- Real device support (including snapshots) is on the roadmap for iOS.
242+
- Core runner commands (`snapshot`, `wait`, `click`, `fill`, `get`, `is`, `find`, `press`, `long-press`, `focus`, `type`, `scroll`, `scrollintoview`, `back`, `home`, `app-switcher`) support iOS simulators and iOS devices.
243+
- Simulator-only commands: `alert`, `pinch`, `record`, `reinstall`, `apps`, `settings`.
244+
- iOS deep link open (`open <url>`) is simulator-only.
245+
- iOS device runs require valid signing/provisioning (Automatic Signing recommended). Optional overrides: `AGENT_DEVICE_IOS_TEAM_ID`, `AGENT_DEVICE_IOS_SIGNING_IDENTITY`, `AGENT_DEVICE_IOS_PROVISIONING_PROFILE`.
244246

245247
## Testing
246248

@@ -266,6 +268,12 @@ Environment selectors:
266268
- `ANDROID_DEVICE=Pixel_9_Pro_XL` or `ANDROID_SERIAL=emulator-5554`
267269
- `IOS_DEVICE="iPhone 17 Pro"` or `IOS_UDID=<udid>`
268270
- `AGENT_DEVICE_IOS_BOOT_TIMEOUT_MS=<ms>` to adjust iOS simulator boot timeout (default: `120000`, minimum: `5000`).
271+
- `AGENT_DEVICE_DAEMON_TIMEOUT_MS=<ms>` to increase daemon request timeout for slow first-run iOS device setup (for example `180000`).
272+
- `AGENT_DEVICE_IOS_TEAM_ID=<team-id>` optional Team ID override for iOS device runner signing.
273+
- `AGENT_DEVICE_IOS_SIGNING_IDENTITY=<identity>` optional signing identity override.
274+
- `AGENT_DEVICE_IOS_PROVISIONING_PROFILE=<profile>` optional provisioning profile specifier for iOS device runner signing.
275+
- `AGENT_DEVICE_IOS_RUNNER_DERIVED_PATH=<path>` optional override for iOS runner derived data root. By default, agent-device separates caches by target kind (`.../derived/simulator` and `.../derived/device`). If you set this override, use separate paths per kind to avoid simulator/device artifact collisions.
276+
- `AGENT_DEVICE_IOS_CLEAN_DERIVED=1` rebuild iOS runner artifacts from scratch. When `AGENT_DEVICE_IOS_RUNNER_DERIVED_PATH` is set, cleanup is blocked by default; set `AGENT_DEVICE_IOS_ALLOW_OVERRIDE_DERIVED_CLEAN=1` only for trusted custom paths.
269277

270278
Test screenshots are written to:
271279
- `test/screenshots/android-settings.png`
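
Reviewer note: the derived-data rules added to the README's environment selectors (`AGENT_DEVICE_IOS_RUNNER_DERIVED_PATH`, `AGENT_DEVICE_IOS_CLEAN_DERIVED`, `AGENT_DEVICE_IOS_ALLOW_OVERRIDE_DERIVED_CLEAN`) can be read as a small decision function. The following is a minimal sketch, not the code from this commit. The function name `planDerivedData`, the `defaultRoot` parameter, and the choice to throw when cleanup is blocked are assumptions made for illustration; the README only states that cleanup is "blocked by default" for override paths.

```typescript
// Hypothetical sketch of the documented derived-data rules; not the project's implementation.
type TargetKind = "simulator" | "device";
type Env = Record<string, string | undefined>;

interface DerivedPlan {
  path: string; // where runner build artifacts would live
  clean: boolean; // whether artifacts should be rebuilt from scratch
}

function planDerivedData(env: Env, kind: TargetKind, defaultRoot: string): DerivedPlan {
  const override = env["AGENT_DEVICE_IOS_RUNNER_DERIVED_PATH"];
  const wantsClean = env["AGENT_DEVICE_IOS_CLEAN_DERIVED"] === "1";

  if (!override) {
    // Default layout: caches are separated by target kind.
    return { path: `${defaultRoot}/derived/${kind}`, clean: wantsClean };
  }

  const allowOverrideClean = env["AGENT_DEVICE_IOS_ALLOW_OVERRIDE_DERIVED_CLEAN"] === "1";
  if (wantsClean && !allowOverrideClean) {
    // Assumption: a blocked cleanup surfaces as an error rather than being skipped silently.
    throw new Error(
      "Refusing to clean an overridden derived path; set " +
        "AGENT_DEVICE_IOS_ALLOW_OVERRIDE_DERIVED_CLEAN=1 for trusted custom paths.",
    );
  }

  // With an override, the caller is responsible for using separate paths per target kind.
  return { path: override, clean: wantsClean };
}
```

The guard exists because a custom path may point at a directory that holds more than runner artifacts, so deleting it requires an explicit opt-in.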

ios-runner/README.md

Lines changed: 1 addition & 1 deletion
@@ -8,4 +8,4 @@ This folder is reserved for the lightweight XCUITest runner used to provide elem
88
- Support simulator prebuilds where compatible.
99

1010
## Status
11-
Planned for v1 automation layer. See `docs/ios-automation.md` and `docs/ios-runner-protocol.md`.
11+
Planned for the automation layer. See `docs/ios-automation.md` and `docs/ios-runner-protocol.md`.

skills/agent-device/SKILL.md

Lines changed: 14 additions & 11 deletions
@@ -1,6 +1,6 @@
11
---
22
name: agent-device
3-
description: Automates mobile and simulator interactions for iOS and Android devices. Use when navigating apps, taking snapshots/screenshots, tapping, typing, scrolling, pinching, or extracting UI info on mobile devices or simulators.
3+
description: Automates interactions for iOS simulators/devices and Android emulators/devices. Use when navigating apps, taking snapshots/screenshots, tapping, typing, scrolling, or extracting UI info on mobile targets.
44
---
55

66
# Mobile Automation with agent-device
@@ -39,13 +39,13 @@ npx -y agent-device
3939

4040
```bash
4141
agent-device boot # Ensure target is booted/ready without opening app
42-
agent-device boot --platform ios # Boot iOS simulator
42+
agent-device boot --platform ios # Boot iOS simulator/device target
4343
agent-device boot --platform android # Boot Android emulator/device target
4444
agent-device open [app|url] # Boot device/simulator; optionally launch app or deep link URL
4545
agent-device open [app] --relaunch # Terminate app process first, then launch (fresh runtime)
4646
agent-device open [app] --activity com.example/.MainActivity # Android: open specific activity (app targets only)
4747
agent-device open "myapp://home" --platform android # Android deep link
48-
agent-device open "https://example.com" --platform ios # iOS simulator deep link
48+
agent-device open "https://example.com" --platform ios # iOS simulator deep link (device unsupported)
4949
agent-device close [app] # Close app or just end session
5050
agent-device reinstall <app> <path> # Uninstall + install app in one command
5151
agent-device session list # List active sessions
@@ -64,10 +64,10 @@ agent-device snapshot -d 3 # Limit depth
6464
agent-device snapshot -s "Camera" # Scope to label/identifier
6565
agent-device snapshot --raw # Raw node output
6666
agent-device snapshot --backend xctest # default: XCTest snapshot (fast, complete, no permissions)
67-
agent-device snapshot --backend ax # macOS Accessibility tree (fast, needs permissions, less fidelity, optional)
67+
agent-device snapshot --backend ax # macOS Accessibility tree (manual diagnostics only; no automatic fallback)
6868
```
6969

70-
XCTest is the default: fast and complete and does not require permissions. Use it in most cases and only fall back to AX when something breaks.
70+
XCTest is the default: fast and complete and does not require permissions. Use AX only for manual diagnostics, and prefer XCTest for normal automation flows. agent-device does not automatically fall back to AX.
7171

7272
### Find (semantic)
7373

@@ -82,7 +82,7 @@ agent-device find "Settings" wait 10000
8282
agent-device find "Settings" exists
8383
```
8484

85-
### Settings helpers (simulators)
85+
### Settings helpers
8686

8787
```bash
8888
agent-device settings wifi on
@@ -95,6 +95,7 @@ agent-device settings location off
9595

9696
Note: iOS wifi/airplane toggles status bar indicators, not actual network state.
9797
Airplane off clears status bar overrides.
98+
iOS settings helpers are simulator-only.
9899

99100
### App state
100101

@@ -118,8 +119,8 @@ agent-device swipe 540 1500 540 500 120
118119
agent-device swipe 540 1500 540 500 120 --count 8 --pause-ms 30 --pattern ping-pong
119120
agent-device long-press 300 500 800 # Long press (where supported)
120121
agent-device scroll down 0.5
121-
agent-device pinch 2.0 # Zoom in 2x (iOS simulator)
122-
agent-device pinch 0.5 200 400 # Zoom out at coordinates (iOS simulator)
122+
agent-device pinch 2.0 # Zoom in 2x (iOS simulator only)
123+
agent-device pinch 0.5 200 400 # Zoom out at coordinates (iOS simulator only)
123124
agent-device back
124125
agent-device home
125126
agent-device app-switcher
@@ -174,19 +175,21 @@ agent-device apps --platform android --user-installed
174175
- `press` supports gesture series controls: `--count`, `--interval-ms`, `--hold-ms`, `--jitter-px`.
175176
- `swipe` supports coordinate + timing controls and repeat patterns: `swipe x1 y1 x2 y2 [durationMs] --count --pause-ms --pattern`.
176177
- `swipe` timing is platform-safe: Android uses requested duration; iOS uses normalized safe timing to avoid long-press side effects.
177-
- Pinch (`pinch <scale> [x y]`) is currently supported on iOS simulators only.
178+
- Pinch (`pinch <scale> [x y]`) is iOS simulator-only; scale > 1 zooms in, < 1 zooms out.
178179
- Snapshot refs are the core mechanism for interactive agent flows.
179180
- Use selectors for deterministic replay artifacts and assertions (e.g. in e2e test workflows).
180181
- Prefer `snapshot -i` to reduce output size.
181182
- On iOS, `xctest` is the default and does not require Accessibility permission.
182-
- If XCTest returns 0 nodes (foreground app changed), agent-device falls back to AX when available.
183+
- If XCTest returns 0 nodes (foreground app changed), treat it as an explicit failure and retry the flow/app state.
183184
- `open <app|url>` can be used within an existing session to switch apps or open deep links.
184185
- `open <app>` updates session app bundle context; URL opens do not set an app bundle id.
185186
- Use `open <app> --relaunch` during React Native/Fast Refresh debugging when you need a fresh app process without ending the session.
186187
- If AX returns the Simulator window or empty tree, restart Simulator or use `--backend xctest`.
187188
- Use `--session <name>` for parallel sessions; avoid device contention.
188189
- Use `--activity <component>` on Android to launch a specific activity (e.g. TV apps with LEANBACK); do not combine with URL opens.
189-
- iOS deep-link opens are simulator-only in v1.
190+
- iOS deep-link opens are simulator-only.
191+
- iOS physical-device runner requires Xcode signing/provisioning; optional overrides: `AGENT_DEVICE_IOS_TEAM_ID`, `AGENT_DEVICE_IOS_SIGNING_IDENTITY`, `AGENT_DEVICE_IOS_PROVISIONING_PROFILE`.
192+
- For long first-run physical-device setup/build, increase daemon timeout: `AGENT_DEVICE_DAEMON_TIMEOUT_MS=180000` (or higher).
190193
- Use `fill` when you want clear-then-type semantics.
191194
- Use `type` when you want to append/enter text without clearing.
192195
- On Android, prefer `fill` for important fields; it verifies entered text and retries once when IME reorders characters.

skills/agent-device/references/permissions.md

Lines changed: 15 additions & 1 deletion
@@ -2,7 +2,7 @@
22

33
## iOS AX snapshot
44

5-
AX snapshot is an alternative to XCTest for when it fails (which shouldn't happen usually); it uses macOS Accessibility APIs and requires permission:
5+
AX snapshot is available for manual diagnostics when needed; it is not used as an automatic fallback. It uses macOS Accessibility APIs and requires permission:
66

77
System Settings > Privacy & Security > Accessibility
88

@@ -13,6 +13,20 @@ agent-device snapshot --backend xctest --platform ios
1313
```
1414

1515
Hybrid/AX is fast; XCTest is equally fast but does not require permissions.
16+
AX backend is simulator-only.
17+
18+
## iOS physical device runner
19+
20+
For iOS physical devices, XCTest runner setup requires valid signing/provisioning.
21+
Use Automatic Signing in Xcode, or provide optional overrides:
22+
23+
- `AGENT_DEVICE_IOS_TEAM_ID`
24+
- `AGENT_DEVICE_IOS_SIGNING_IDENTITY`
25+
- `AGENT_DEVICE_IOS_PROVISIONING_PROFILE`
26+
27+
If first-run setup/build takes long, increase:
28+
29+
- `AGENT_DEVICE_DAEMON_TIMEOUT_MS` (for example `180000`)
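
Reviewer note: one plausible way the optional signing overrides listed above could reach the runner build is as `xcodebuild` build-setting arguments. The sketch below is an assumption, not taken from this commit: the function name `signingBuildSettings` is hypothetical, and the mapping to the Xcode build settings `DEVELOPMENT_TEAM`, `CODE_SIGN_IDENTITY`, and `PROVISIONING_PROFILE_SPECIFIER` is a guess at how the variables might be applied. Unset variables contribute nothing, which matches the "override-only" wording in the commit message.

```typescript
// Hypothetical sketch: translate optional env overrides into xcodebuild KEY=VALUE settings.
type SigningEnv = Record<string, string | undefined>;

function signingBuildSettings(env: SigningEnv): string[] {
  // Assumed mapping from agent-device variables to standard Xcode build settings.
  const mapping: Array<[string, string]> = [
    ["AGENT_DEVICE_IOS_TEAM_ID", "DEVELOPMENT_TEAM"],
    ["AGENT_DEVICE_IOS_SIGNING_IDENTITY", "CODE_SIGN_IDENTITY"],
    ["AGENT_DEVICE_IOS_PROVISIONING_PROFILE", "PROVISIONING_PROFILE_SPECIFIER"],
  ];

  const settings: string[] = [];
  for (const [envName, buildSetting] of mapping) {
    const raw = env[envName];
    const value = raw === undefined ? "" : raw.trim();
    // Only explicit, non-empty overrides are passed; otherwise Automatic Signing decides.
    if (value) settings.push(`${buildSetting}=${value}`);
  }
  return settings;
}
```

With no variables set the function returns an empty array, so a project configured for Automatic Signing builds without any extra arguments.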
1630

1731
## Simulator troubleshooting
1832

skills/agent-device/references/session-management.md

Lines changed: 1 addition & 0 deletions
@@ -14,6 +14,7 @@ Sessions isolate device context. A device can only be held by one session at a t
1414
- Name sessions semantically.
1515
- Close sessions when done.
1616
- Use separate sessions for parallel work.
17+
- In iOS sessions, use `open <app>` for simulator/device. `open <url>` is simulator-only.
1718
- For dev loops where runtime state can persist (for example React Native Fast Refresh), use `open <app> --relaunch` to restart the app process in the same session.
1819
- For deterministic replay scripts, prefer selector-based actions and assertions.
1920
- Use `replay -u` to update selector drift during maintenance.

skills/agent-device/references/snapshot-refs.md

Lines changed: 2 additions & 0 deletions
@@ -55,6 +55,8 @@ agent-device snapshot -i -s @e3
5555
- Ref not found: re-snapshot.
5656
- AX returns Simulator window: restart Simulator and re-run.
5757
- AX empty: verify Accessibility permission or use `--backend xctest` (XCTest is more complete).
58+
- AX backend is simulator-only; use `--backend xctest` on iOS devices.
59+
- agent-device does not automatically fall back to AX when XCTest fails.
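
Reviewer note: the backend rules stated in these docs (XCTest is the default, AX is simulator-only, and an empty XCTest snapshot fails explicitly instead of falling back) can be summarized in a few lines. This is a minimal sketch under stated assumptions, not the project's code: the names `resolveSnapshotBackend` and `requireNodes`, the `IosTargetKind` type, and the error messages are all hypothetical.

```typescript
// Hypothetical sketch of the documented iOS snapshot backend rules.
type SnapshotBackend = "xctest" | "ax";
type IosTargetKind = "simulator" | "device";

function resolveSnapshotBackend(
  requested: SnapshotBackend | undefined,
  kind: IosTargetKind,
): SnapshotBackend {
  // XCTest is the default on both simulators and physical devices.
  const backend: SnapshotBackend = requested === undefined ? "xctest" : requested;
  if (backend === "ax" && kind === "device") {
    throw new Error("ax backend is simulator-only; use --backend xctest on iOS devices");
  }
  return backend;
}

function requireNodes<T>(backend: SnapshotBackend, nodes: T[]): T[] {
  // No automatic AX fallback: an empty XCTest result is reported to the caller.
  if (backend === "xctest" && nodes.length === 0) {
    throw new Error("XCTest returned 0 nodes (foreground app may have changed)");
  }
  return nodes;
}
```

Failing explicitly leaves the retry decision to the calling flow, which is what the SKILL.md note recommends ("retry the flow/app state").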
5860

5961
## Replay note
6062

skills/agent-device/references/video-recording.md

Lines changed: 2 additions & 0 deletions
@@ -20,6 +20,8 @@ agent-device close
2020
agent-device record stop
2121
```
2222

23+
`record` is iOS simulator-only.
24+
2325
## Android Emulator/Device
2426

2527
Use `agent-device record` commands (wrapper around adb):

src/core/__tests__/capabilities.test.ts

Lines changed: 11 additions & 7 deletions
@@ -32,10 +32,17 @@ test('iOS simulator-only commands reject iOS devices and Android', () => {
3232
}
3333
});
3434

35-
test('iOS simulator + Android commands reject iOS devices', () => {
35+
test('simulator-only iOS commands with Android support reject iOS devices', () => {
36+
for (const cmd of ['apps', 'reinstall', 'record', 'settings', 'swipe']) {
37+
assert.equal(isCommandSupportedOnDevice(cmd, iosSimulator), true, `${cmd} on iOS sim`);
38+
assert.equal(isCommandSupportedOnDevice(cmd, iosDevice), false, `${cmd} on iOS device`);
39+
assert.equal(isCommandSupportedOnDevice(cmd, androidDevice), true, `${cmd} on Android`);
40+
}
41+
});
42+
43+
test('core commands support iOS simulator, iOS device, and Android', () => {
3644
for (const cmd of [
3745
'app-switcher',
38-
'apps',
3946
'back',
4047
'boot',
4148
'click',
@@ -47,19 +54,16 @@ test('iOS simulator + Android commands reject iOS devices', () => {
4754
'home',
4855
'long-press',
4956
'open',
50-
'reinstall',
5157
'press',
52-
'record',
5358
'screenshot',
5459
'scroll',
55-
'swipe',
56-
'settings',
60+
'scrollintoview',
5761
'snapshot',
5862
'type',
5963
'wait',
6064
]) {
6165
assert.equal(isCommandSupportedOnDevice(cmd, iosSimulator), true, `${cmd} on iOS sim`);
62-
assert.equal(isCommandSupportedOnDevice(cmd, iosDevice), false, `${cmd} on iOS device`);
66+
assert.equal(isCommandSupportedOnDevice(cmd, iosDevice), true, `${cmd} on iOS device`);
6367
assert.equal(isCommandSupportedOnDevice(cmd, androidDevice), true, `${cmd} on Android`);
6468
}
6569
});
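
Reviewer note: the two tests above imply a per-command support matrix. The sketch below shows one way such a lookup could be structured. It is not the implementation under test: the `DeviceInfo` shape, the `Support` record, and the table layout are assumptions, and only commands visible in this diff are listed (the hunk elides part of the core list).

```typescript
// Hypothetical capability table consistent with the assertions in the diff above.
type Platform = "ios" | "android";
type TargetKind = "simulator" | "emulator" | "device";

interface DeviceInfo {
  platform: Platform;
  kind: TargetKind;
}

interface Support {
  iosSimulator: boolean;
  iosDevice: boolean;
  android: boolean;
}

const CORE: Support = { iosSimulator: true, iosDevice: true, android: true };
const IOS_SIM_AND_ANDROID: Support = { iosSimulator: true, iosDevice: false, android: true };

const CAPABILITIES: Record<string, Support> = {};
// Core commands visible in the test (the diff hides a few entries between hunks).
for (const cmd of [
  "app-switcher", "back", "boot", "click", "home", "long-press", "open",
  "press", "screenshot", "scroll", "scrollintoview", "snapshot", "type", "wait",
]) {
  CAPABILITIES[cmd] = CORE;
}
// Commands that work on iOS simulators and Android but reject iOS devices.
for (const cmd of ["apps", "reinstall", "record", "settings", "swipe"]) {
  CAPABILITIES[cmd] = IOS_SIM_AND_ANDROID;
}

function isCommandSupportedOnDevice(cmd: string, device: DeviceInfo): boolean {
  const known = Object.prototype.hasOwnProperty.call(CAPABILITIES, cmd);
  const support = known ? CAPABILITIES[cmd] : undefined;
  if (!support) return false; // unknown commands are treated as unsupported in this sketch
  if (device.platform === "android") return support.android;
  return device.kind === "simulator" ? support.iosSimulator : support.iosDevice;
}
```

A command-level table like this cannot express argument-level limits such as `open <url>` being simulator-only on iOS while `open <app>` works on devices; that check would have to live elsewhere.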
