Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
28 changes: 22 additions & 6 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -13,7 +13,7 @@ CLI to control iOS and Android devices for AI agents influenced by Vercel’s [a
The project is in early development and considered experimental. Pull requests are welcome!

## Features
- Platforms: iOS (simulator + physical device core automation) and Android (emulator + device).
- Platforms: iOS/tvOS (simulator + physical device core automation) and Android/AndroidTV (emulator + device).
- Core commands: `open`, `back`, `home`, `app-switcher`, `press`, `long-press`, `focus`, `type`, `fill`, `scroll`, `scrollintoview`, `wait`, `alert`, `screenshot`, `close`, `reinstall`, `push`.
- Inspection commands: `snapshot` (accessibility tree), `diff snapshot` (structural baseline diff), `appstate`, `apps`, `devices`.
- App logs: `logs path` returns session log metadata; `logs start` / `logs stop` stream app output; `logs clear` truncates session app logs; `logs clear --restart` resets and restarts stream in one step; `logs doctor` checks readiness; `logs mark` writes timeline markers.
Expand Down Expand Up @@ -196,7 +196,8 @@ Efficient snapshot usage:

Flags:
- `--version, -V` print version and exit
- `--platform ios|android`
- `--platform ios|android|apple` (`apple` aliases the iOS/tvOS backend)
- `--target mobile|tv` select device class within platform (requires `--platform`; for example AndroidTV/tvOS)
- `--device <name>`
- `--udid <udid>` (iOS)
- `--serial <serial>` (Android)
Expand All @@ -216,8 +217,22 @@ Flags:
- `--on-error stop` batch: stop when a step fails
- `--max-steps <n>` batch: max allowed steps per request

TV targets:
- Use `--target tv` together with `--platform ios|android|apple`.
- TV target selection supports both simulator/emulator and connected physical devices (AppleTV + AndroidTV).
- AndroidTV app launch/app listing use TV launcher discovery (`LEANBACK_LAUNCHER`) and fallback component resolution when needed.
- tvOS uses the same runner-driven interaction/snapshot flow as iOS (`snapshot`, `wait`, `press`, `fill`, `get`, `scroll`, `back`, `home`, `app-switcher`, `record`, and related selector flows).
- tvOS back/home/app-switcher use Siri Remote semantics in the runner (`menu`, `home`, double-home).
- tvOS follows iOS simulator-only command semantics for helpers like `pinch`, `settings`, and `push`.

Examples:
- `agent-device open YouTube --platform android --target tv`
- `agent-device apps --platform android --target tv`
- `agent-device open Settings --platform ios --target tv`
- `agent-device screenshot ./apple-tv.png --platform ios --target tv`

Pinch:
- `pinch` is supported on iOS simulators.
- `pinch` is supported on iOS simulators (including tvOS simulator targets).
- On Android, `pinch` currently returns `UNSUPPORTED_OPERATION` in the adb backend.

Swipe timing:
Expand Down Expand Up @@ -246,7 +261,7 @@ Sessions:
- On iOS, `appstate` is session-scoped and requires an active session on the target device.

Navigation helpers:
- `boot --platform ios|android` ensures the target is ready without launching an app.
- `boot --platform ios|android|apple` ensures the target is ready without launching an app.
- Use `boot` mainly when starting a new session and `open` fails because no booted simulator/emulator is available.
- `open [app|url] [url]` already boots/activates the selected target when needed.
- `reinstall <app> <path>` uninstalls and installs the app binary in one command (Android + iOS simulator/device).
Expand Down Expand Up @@ -354,7 +369,7 @@ Boot diagnostics:
- Boot failures include normalized reason codes in `error.details.reason` (JSON mode) and verbose logs.
- Reason codes: `IOS_BOOT_TIMEOUT`, `IOS_RUNNER_CONNECT_TIMEOUT`, `ANDROID_BOOT_TIMEOUT`, `ADB_TRANSPORT_UNAVAILABLE`, `CI_RESOURCE_STARVATION_SUSPECTED`, `BOOT_COMMAND_FAILED`, `UNKNOWN`.
- Android boot waits fail fast for permission/tooling issues and do not always collapse into timeout errors.
- Use `agent-device boot --platform ios|android` when starting a new session only if `open` cannot find/connect to an available target.
- Use `agent-device boot --platform ios|android|apple` when starting a new session only if `open` cannot find/connect to an available target.
- `--debug` captures retry telemetry in diagnostics logs.
- Set `AGENT_DEVICE_RETRY_LOGS=1` to also print retry telemetry directly to stderr (ad-hoc troubleshooting).

Expand All @@ -371,6 +386,7 @@ Diagnostics files:
## iOS notes
- Core runner commands: `snapshot`, `wait`, `click`, `fill`, `get`, `is`, `find`, `press`, `longpress`, `focus`, `type`, `scroll`, `scrollintoview`, `back`, `home`, `app-switcher`.
- Simulator-only commands: `alert`, `pinch`, `settings`.
- tvOS targets are selectable (`--platform ios --target tv` or `--platform apple --target tv`) and support runner-driven interaction/snapshot commands.
- `record` supports iOS simulators and physical iOS devices.
- iOS simulator recording uses native `simctl io ... recordVideo`.
- Physical iOS device recording is runner-based and built from repeated `XCUIScreen.main.screenshot()` frames (no native video stream/audio capture).
Expand Down Expand Up @@ -409,7 +425,7 @@ Environment selectors:
- `AGENT_DEVICE_IOS_SIGNING_IDENTITY=<identity>` optional signing identity override.
- `AGENT_DEVICE_IOS_PROVISIONING_PROFILE=<profile>` optional provisioning profile specifier for iOS device runner signing.
- `AGENT_DEVICE_IOS_RUNNER_DERIVED_PATH=<path>` optional override for iOS runner derived data root. By default, simulator uses `~/.agent-device/ios-runner/derived` and physical device uses `~/.agent-device/ios-runner/derived/device`. If you set this override, use separate paths per kind to avoid simulator/device artifact collisions.
- `AGENT_DEVICE_IOS_CLEAN_DERIVED=1` rebuild iOS runner artifacts from scratch for runtime daemon-triggered builds (`pnpm ad ...`) on the selected path. `pnpm build:xcuitest`/`pnpm build:all` already clear `~/.agent-device/ios-runner/derived/device` and do not require this variable. When `AGENT_DEVICE_IOS_RUNNER_DERIVED_PATH` is set, cleanup is blocked by default; set `AGENT_DEVICE_IOS_ALLOW_OVERRIDE_DERIVED_CLEAN=1` only for trusted custom paths.
- `AGENT_DEVICE_IOS_CLEAN_DERIVED=1` rebuild iOS runner artifacts from scratch for runtime daemon-triggered builds (`pnpm ad ...`) on the selected path. `pnpm build:xcuitest` (alias of `pnpm build:xcuitest:ios`), `pnpm build:xcuitest:tvos`, and `pnpm build:all` already clear their default derived paths and do not require this variable. When `AGENT_DEVICE_IOS_RUNNER_DERIVED_PATH` is set, cleanup is blocked by default; set `AGENT_DEVICE_IOS_ALLOW_OVERRIDE_DERIVED_CLEAN=1` only for trusted custom paths.

Test screenshots are written to:
- `test/screenshots/android-settings.png`
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -258,7 +258,7 @@
MTL_ENABLE_DEBUG_INFO = INCLUDE_SOURCE;
MTL_FAST_MATH = YES;
ONLY_ACTIVE_ARCH = YES;
SDKROOT = iphoneos;
SDKROOT = auto;
SWIFT_ACTIVE_COMPILATION_CONDITIONS = "DEBUG $(inherited)";
SWIFT_OPTIMIZATION_LEVEL = "-Onone";
};
Expand Down Expand Up @@ -315,7 +315,7 @@
LOCALIZATION_PREFERS_STRING_CATALOGS = YES;
MTL_ENABLE_DEBUG_INFO = NO;
MTL_FAST_MATH = YES;
SDKROOT = iphoneos;
SDKROOT = auto;
SWIFT_COMPILATION_MODE = wholemodule;
VALIDATE_PRODUCT = YES;
};
Expand Down Expand Up @@ -350,7 +350,9 @@
SWIFT_EMIT_LOC_STRINGS = YES;
SWIFT_UPCOMING_FEATURE_MEMBER_IMPORT_VISIBILITY = YES;
SWIFT_VERSION = 5.0;
TARGETED_DEVICE_FAMILY = "1,2";
SUPPORTED_PLATFORMS = "iphoneos iphonesimulator appletvos appletvsimulator";
TARGETED_DEVICE_FAMILY = "1,2,3";
TVOS_DEPLOYMENT_TARGET = 15.6;
};
name = Debug;
};
Expand Down Expand Up @@ -383,7 +385,9 @@
SWIFT_EMIT_LOC_STRINGS = YES;
SWIFT_UPCOMING_FEATURE_MEMBER_IMPORT_VISIBILITY = YES;
SWIFT_VERSION = 5.0;
TARGETED_DEVICE_FAMILY = "1,2";
SUPPORTED_PLATFORMS = "iphoneos iphonesimulator appletvos appletvsimulator";
TARGETED_DEVICE_FAMILY = "1,2,3";
TVOS_DEPLOYMENT_TARGET = 15.6;
};
name = Release;
};
Expand All @@ -404,7 +408,9 @@
SWIFT_OBJC_BRIDGING_HEADER = "AgentDeviceRunnerUITests/AgentDeviceRunnerUITests-Bridging-Header.h";
SWIFT_UPCOMING_FEATURE_MEMBER_IMPORT_VISIBILITY = YES;
SWIFT_VERSION = 5.0;
TARGETED_DEVICE_FAMILY = "1,2";
SUPPORTED_PLATFORMS = "iphoneos iphonesimulator appletvos appletvsimulator";
TARGETED_DEVICE_FAMILY = "1,2,3";
TVOS_DEPLOYMENT_TARGET = 15.6;
TEST_TARGET_NAME = AgentDeviceRunner;
};
name = Debug;
Expand All @@ -426,7 +432,9 @@
SWIFT_OBJC_BRIDGING_HEADER = "AgentDeviceRunnerUITests/AgentDeviceRunnerUITests-Bridging-Header.h";
SWIFT_UPCOMING_FEATURE_MEMBER_IMPORT_VISIBILITY = YES;
SWIFT_VERSION = 5.0;
TARGETED_DEVICE_FAMILY = "1,2";
SUPPORTED_PLATFORMS = "iphoneos iphonesimulator appletvos appletvsimulator";
TARGETED_DEVICE_FAMILY = "1,2,3";
TVOS_DEPLOYMENT_TARGET = 15.6;
TEST_TARGET_NAME = AgentDeviceRunner;
};
name = Release;
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -40,6 +40,7 @@ final class RunnerTests: XCTestCase {
private let postSnapshotInteractionDelay: TimeInterval = 0.2
private let firstInteractionAfterActivateDelay: TimeInterval = 0.25
private let scrollInteractionIdleTimeoutDefault: TimeInterval = 1.0
private let tvRemoteDoublePressDelayDefault: TimeInterval = 0.0
private let minRecordingFps = 1
private let maxRecordingFps = 120
private var needsPostSnapshotInteractionDelay = false
Expand Down Expand Up @@ -790,7 +791,7 @@ final class RunnerTests: XCTestCase {
performBackGesture(app: activeApp)
return Response(ok: true, data: DataPayload(message: "back"))
case .home:
XCUIDevice.shared.press(.home)
pressHomeButton()
return Response(ok: true, data: DataPayload(message: "home"))
case .appSwitcher:
performAppSwitcherGesture(app: activeApp)
Expand Down Expand Up @@ -990,23 +991,78 @@ final class RunnerTests: XCTestCase {
back.tap()
return true
}
return false
return pressTvRemoteMenuIfAvailable()
}

private func performBackGesture(app: XCUIApplication) {
if pressTvRemoteMenuIfAvailable() {
return
}
let target = app.windows.firstMatch.exists ? app.windows.firstMatch : app
let start = target.coordinate(withNormalizedOffset: CGVector(dx: 0.05, dy: 0.5))
let end = target.coordinate(withNormalizedOffset: CGVector(dx: 0.8, dy: 0.5))
start.press(forDuration: 0.05, thenDragTo: end)
}

private func performAppSwitcherGesture(app: XCUIApplication) {
if performTvRemoteAppSwitcherIfAvailable() {
return
}
let target = app.windows.firstMatch.exists ? app.windows.firstMatch : app
let start = target.coordinate(withNormalizedOffset: CGVector(dx: 0.5, dy: 0.99))
let end = target.coordinate(withNormalizedOffset: CGVector(dx: 0.5, dy: 0.7))
start.press(forDuration: 0.6, thenDragTo: end)
}

private func pressHomeButton() {
if pressTvRemoteHomeIfAvailable() {
return
}
XCUIDevice.shared.press(.home)
}

private func pressTvRemoteMenuIfAvailable() -> Bool {
#if os(tvOS)
XCUIRemote.shared.press(.menu)
return true
#else
return false
#endif
}

private func pressTvRemoteHomeIfAvailable() -> Bool {
#if os(tvOS)
XCUIRemote.shared.press(.home)
return true
#else
return false
#endif
}

private func performTvRemoteAppSwitcherIfAvailable() -> Bool {
#if os(tvOS)
XCUIRemote.shared.press(.home)
sleepFor(resolveTvRemoteDoublePressDelay())
XCUIRemote.shared.press(.home)
return true
#else
return false
#endif
}

private func resolveTvRemoteDoublePressDelay() -> TimeInterval {
guard
let raw = ProcessInfo.processInfo.environment["AGENT_DEVICE_TV_REMOTE_DOUBLE_PRESS_DELAY_MS"],
!raw.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty
else {
return tvRemoteDoublePressDelayDefault
}
guard let parsedMs = Double(raw), parsedMs >= 0 else {
return tvRemoteDoublePressDelayDefault
}
return min(parsedMs, 1000) / 1000.0
}

private func findElement(app: XCUIApplication, text: String) -> XCUIElement? {
let predicate = NSPredicate(format: "label CONTAINS[c] %@ OR identifier CONTAINS[c] %@ OR value CONTAINS[c] %@", text, text, text)
let element = app.descendants(matching: .any).matching(predicate).firstMatch
Expand Down Expand Up @@ -1109,6 +1165,9 @@ final class RunnerTests: XCTestCase {
}

private func swipe(app: XCUIApplication, direction: SwipeDirection) {
if performTvRemoteSwipeIfAvailable(direction: direction) {
return
}
let target = app.windows.firstMatch.exists ? app.windows.firstMatch : app
let start = target.coordinate(withNormalizedOffset: CGVector(dx: 0.5, dy: 0.2))
let end = target.coordinate(withNormalizedOffset: CGVector(dx: 0.5, dy: 0.8))
Expand All @@ -1127,6 +1186,24 @@ final class RunnerTests: XCTestCase {
}
}

private func performTvRemoteSwipeIfAvailable(direction: SwipeDirection) -> Bool {
#if os(tvOS)
switch direction {
case .up:
XCUIRemote.shared.press(.up)
case .down:
XCUIRemote.shared.press(.down)
case .left:
XCUIRemote.shared.press(.left)
case .right:
XCUIRemote.shared.press(.right)
}
return true
#else
return false
#endif
}

private func pinch(app: XCUIApplication, scale: Double, x: Double?, y: Double?) {
let target = app.windows.firstMatch.exists ? app.windows.firstMatch : app

Expand Down
4 changes: 3 additions & 1 deletion package.json
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,9 @@
"build": "rslib build",
"clean:daemon": "rm -f ~/.agent-device/daemon.json && rm -f ~/.agent-device/daemon.lock",
"build:node": "pnpm build && pnpm clean:daemon",
"build:xcuitest": "rm -rf ~/.agent-device/ios-runner/derived/device && xcodebuild build-for-testing -project ios-runner/AgentDeviceRunner/AgentDeviceRunner.xcodeproj -scheme AgentDeviceRunner -destination \"generic/platform=iOS Simulator\" -derivedDataPath ~/.agent-device/ios-runner/derived",
"build:xcuitest": "pnpm build:xcuitest:ios",
"build:xcuitest:ios": "rm -rf ~/.agent-device/ios-runner/derived/device && xcodebuild build-for-testing -project ios-runner/AgentDeviceRunner/AgentDeviceRunner.xcodeproj -scheme AgentDeviceRunner -destination \"generic/platform=iOS Simulator\" -derivedDataPath ~/.agent-device/ios-runner/derived",
"build:xcuitest:tvos": "rm -rf ~/.agent-device/ios-runner/derived/tvos && xcodebuild build-for-testing -project ios-runner/AgentDeviceRunner/AgentDeviceRunner.xcodeproj -scheme AgentDeviceRunner -destination \"generic/platform=tvOS Simulator\" -derivedDataPath ~/.agent-device/ios-runner/derived/tvos",
"build:all": "pnpm build:node && pnpm build:xcuitest",
"ad": "node bin/agent-device.mjs",
"format": "prettier --write .",
Expand Down
9 changes: 9 additions & 0 deletions skills/agent-device/SKILL.md
Original file line number Diff line number Diff line change
Expand Up @@ -68,6 +68,14 @@ agent-device session list
```

Use `boot` only as fallback when `open` cannot find/connect to a ready target.
Use `--target mobile|tv` with `--platform` (required) to pick phone/tablet vs TV targets (AndroidTV/tvOS).

TV quick reference:
- AndroidTV: `open`/`apps` use TV launcher discovery automatically.
- TV target selection works on emulators/simulators and connected physical devices (AndroidTV + AppleTV).
- tvOS: runner-driven interactions and snapshots are supported (`snapshot`, `wait`, `press`, `fill`, `get`, `scroll`, `back`, `home`, `app-switcher`, `record` and related selector flows).
- tvOS `back`/`home`/`app-switcher` map to Siri Remote actions (`menu`, `home`, double-home) in the runner.
- tvOS follows iOS simulator-only command semantics for helpers like `pinch`, `settings`, and `push`.

### Snapshot and targeting

Expand Down Expand Up @@ -109,6 +117,7 @@ agent-device batch --steps-file /tmp/batch-steps.json --json
- Use `fill` for clear-then-type semantics; use `type` for focused append typing.
- iOS `appstate` is session-scoped; Android `appstate` is live foreground state.
- iOS settings helpers are simulator-only; use `appearance light|dark|toggle` and faceid `match|nonmatch|enroll|unenroll`.
- For AndroidTV/tvOS selection, always pair `--target` with `--platform` (`ios`, `android`, or `apple` alias); target-only selection is invalid.
- `push` simulates notification delivery:
- iOS simulator uses APNs-style payload JSON.
- Android uses broadcast action + typed extras (string/boolean/number).
Expand Down
3 changes: 2 additions & 1 deletion src/cli.ts
Original file line number Diff line number Diff line change
Expand Up @@ -308,8 +308,9 @@ export async function runCli(argv: string[], deps: CliDeps = DEFAULT_CLI_DEPS):
const name = d?.name ?? d?.id ?? 'unknown';
const platform = d?.platform ?? 'unknown';
const kind = d?.kind ? ` ${d.kind}` : '';
const target = d?.target ? ` target=${d.target}` : '';
const booted = typeof d?.booted === 'boolean' ? ` booted=${d.booted}` : '';
return `${name} (${platform}${kind})${booted}`;
return `${name} (${platform}${kind}${target})${booted}`;
});
process.stdout.write(`${lines.join('\n')}\n`);
if (logTailStopper) logTailStopper();
Expand Down
34 changes: 34 additions & 0 deletions src/core/__tests__/capabilities.test.ts
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,22 @@ const androidDevice: DeviceInfo = {
kind: 'device',
};

const androidTvDevice: DeviceInfo = {
platform: 'android',
id: 'and-tv-1',
name: 'Android TV',
kind: 'device',
target: 'tv',
};

const tvOsSimulator: DeviceInfo = {
platform: 'ios',
id: 'tv-sim-1',
name: 'Apple TV',
kind: 'simulator',
target: 'tv',
};

test('iOS simulator-only commands reject iOS devices and Android', () => {
for (const cmd of ['alert', 'pinch']) {
assert.equal(isCommandSupportedOnDevice(cmd, iosSimulator), true, `${cmd} on iOS sim`);
Expand Down Expand Up @@ -84,6 +100,24 @@ test('core commands support iOS simulator, iOS device, and Android', () => {
}
});

test('Android TV uses Android capabilities for core commands', () => {
for (const cmd of ['open', 'apps', 'snapshot', 'press', 'swipe', 'back', 'home', 'scroll']) {
assert.equal(isCommandSupportedOnDevice(cmd, androidTvDevice), true, `${cmd} on Android TV`);
}
});

test('tvOS follows iOS capability matrix by device kind', () => {
for (const cmd of ['open', 'close', 'apps', 'screenshot', 'logs', 'reinstall', 'boot']) {
assert.equal(isCommandSupportedOnDevice(cmd, tvOsSimulator), true, `${cmd} on tvOS`);
}
for (const cmd of ['snapshot', 'wait', 'press', 'get', 'fill', 'scroll', 'back', 'home', 'app-switcher', 'record']) {
assert.equal(isCommandSupportedOnDevice(cmd, tvOsSimulator), true, `${cmd} on tvOS`);
}
for (const cmd of ['pinch', 'push', 'settings', 'alert']) {
assert.equal(isCommandSupportedOnDevice(cmd, tvOsSimulator), true, `${cmd} on tvOS simulator`);
}
});

test('unknown commands default to supported', () => {
assert.equal(isCommandSupportedOnDevice('some-future-cmd', iosSimulator), true);
assert.equal(isCommandSupportedOnDevice('some-future-cmd', androidDevice), true);
Expand Down
14 changes: 14 additions & 0 deletions src/core/__tests__/dispatch-target.test.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
import test from 'node:test';
import assert from 'node:assert/strict';
import { resolveTargetDevice } from '../dispatch.ts';
import { AppError } from '../../utils/errors.ts';

test('resolveTargetDevice requires platform when target selector is provided', async () => {
await assert.rejects(
() => resolveTargetDevice({ target: 'tv' }),
(error: unknown) =>
error instanceof AppError &&
error.code === 'INVALID_ARGS' &&
error.message.includes('requires --platform'),
);
});
Loading
Loading