6 changes: 6 additions & 0 deletions AGENTS.md
@@ -18,6 +18,12 @@ Minimal operating guide for AI coding agents in this repo.
 - Transform tasks into verifiable goals with clear success criteria.
 - For multi-step tasks, state a brief plan with verification checkpoints.

+## Docs & Skills
+- For every behavior or CLI surface change, evaluate whether docs/skills updates are required.
+- Update `README.md` and relevant `website/docs/**` pages when command behavior, flags, aliases, or workflows change.
+- Update relevant `skills/**/SKILL.md` guidance when usage examples or workflow recommendations change.
+- In final summaries, explicitly state whether docs/skills were updated; if not, state why no updates were needed.
+
 ## Scope
 - Solve issues with the smallest context read.
 - Keep changes scoped to one command family or module group.
10 changes: 5 additions & 5 deletions README.md
@@ -14,7 +14,7 @@ The project is in early development and considered experimental. Pull requests a

 ## Features
 - Platforms: iOS (simulator + physical device core automation) and Android (emulator + device).
-- Core commands: `open`, `back`, `home`, `app-switcher`, `press`, `long-press`, `focus`, `type`, `fill`, `scroll`, `scrollintoview`, `wait`, `alert`, `screenshot`, `close`, `reinstall`.
+- Core commands: `open`, `back`, `home`, `app-switcher`, `press`, `longpress`, `focus`, `type`, `fill`, `scroll`, `scrollintoview`, `wait`, `alert`, `screenshot`, `close`, `reinstall`.
 - Inspection commands: `snapshot` (accessibility tree), `appstate`, `apps`, `devices`.
 - Device tooling: `adb` (Android), `simctl`/`devicectl` (iOS via Xcode).
 - Minimal dependencies; TypeScript executed directly on Node 22+ (no build step).
@@ -118,7 +118,7 @@ agent-device trace stop ./trace.log
 ```

 Coordinates:
-- All coordinate-based commands (`press`, `long-press`, `swipe`, `focus`, `fill`) use device coordinates with origin at top-left.
+- All coordinate-based commands (`press`, `longpress`, `swipe`, `focus`, `fill`) use device coordinates with origin at top-left.
 - X increases to the right, Y increases downward.
 - `press` is the canonical tap command.
 - `click` is an equivalent alias and accepts the same targets (`x y`, `@ref`, selector) and flags.
@@ -136,7 +136,7 @@ agent-device swipe 540 1500 540 500 120 --count 8 --pause-ms 30 --pattern ping-p
 - `boot`, `open`, `close`, `reinstall`, `home`, `back`, `app-switcher`
 - `batch`
 - `snapshot`, `find`, `get`
-- `press` (alias: `click`), `focus`, `type`, `fill`, `long-press`, `swipe`, `scroll`, `scrollintoview`, `pinch`, `is`
+- `press` (alias: `click`), `focus`, `type`, `fill`, `longpress`, `swipe`, `scroll`, `scrollintoview`, `pinch`, `is`
 - `alert`, `wait`, `screenshot`
 - `trace start`, `trace stop`
 - `settings wifi|airplane|location on|off`
@@ -179,7 +179,7 @@ Pinch:
 Swipe timing:
 - `swipe` accepts optional `durationMs` (default `250`, range `16..10000`).
 - Android uses requested swipe duration directly.
-- iOS uses a safe normalized duration to avoid long-press side effects.
+- iOS uses a safe normalized duration to avoid longpress side effects.

 ## Skills
 Install the automation skills listed in [SKILL.md](skills/agent-device/SKILL.md).
@@ -304,7 +304,7 @@ Diagnostics files:
 - Built-in aliases include `Settings` for both platforms.

 ## iOS notes
-- Core runner commands: `snapshot`, `wait`, `click`, `fill`, `get`, `is`, `find`, `press`, `long-press`, `focus`, `type`, `scroll`, `scrollintoview`, `back`, `home`, `app-switcher`.
+- Core runner commands: `snapshot`, `wait`, `click`, `fill`, `get`, `is`, `find`, `press`, `longpress`, `focus`, `type`, `scroll`, `scrollintoview`, `back`, `home`, `app-switcher`.
 - Simulator-only commands: `alert`, `pinch`, `record`, `settings`.
 - iOS device runs require valid signing/provisioning (Automatic Signing recommended). Optional overrides: `AGENT_DEVICE_IOS_TEAM_ID`, `AGENT_DEVICE_IOS_SIGNING_IDENTITY`, `AGENT_DEVICE_IOS_PROVISIONING_PROFILE`.

5 changes: 3 additions & 2 deletions skills/agent-device/SKILL.md
@@ -126,7 +126,7 @@ agent-device press @e1 --count 5 # Repeat taps on the same target
 agent-device press @e1 --count 5 --double-tap # Use double-tap gesture per iteration
 agent-device swipe 540 1500 540 500 120
 agent-device swipe 540 1500 540 500 120 --count 8 --pause-ms 30 --pattern ping-pong
-agent-device long-press 300 500 800 # Long press (where supported)
+agent-device longpress 300 500 800 # Long press on iOS and Android
 agent-device scroll down 0.5
 agent-device pinch 2.0 # Zoom in 2x (iOS simulator only)
 agent-device pinch 0.5 200 400 # Zoom out at coordinates (iOS simulator only)
@@ -235,7 +235,8 @@ agent-device apps --platform android --user-installed
 - `press`/`click` support gesture series controls: `--count`, `--interval-ms`, `--hold-ms`, `--jitter-px`, `--double-tap`.
 - `--double-tap` cannot be combined with `--hold-ms` or `--jitter-px`.
 - `swipe` supports coordinate + timing controls and repeat patterns: `swipe x1 y1 x2 y2 [durationMs] --count --pause-ms --pattern`.
-- `swipe` timing is platform-safe: Android uses requested duration; iOS uses normalized safe timing to avoid long-press side effects.
+- `swipe` timing is platform-safe: Android uses requested duration; iOS uses normalized safe timing to avoid longpress side effects.
+- `longpress` is coordinate-based and supported on iOS and Android.
 - Pinch (`pinch <scale> [x y]`) is iOS simulator-only; scale > 1 zooms in, < 1 zooms out.
 - Snapshot refs are the core mechanism for interactive agent flows.
 - Use selectors for deterministic replay artifacts and assertions (e.g. in e2e test workflows).
15 changes: 15 additions & 0 deletions src/__tests__/cli-help.test.ts
@@ -69,6 +69,21 @@ test('help appstate prints command help and skips daemon dispatch', async () =>
   assert.match(result.stdout, /Global flags:/);
 });

+test('help longpress prints command help and skips daemon dispatch', async () => {
+  const result = await runCliCapture(['help', 'longpress']);
+  assert.equal(result.code, 0);
+  assert.equal(result.daemonCalls, 0);
+  assert.match(result.stdout, /Usage:\n agent-device longpress <x> <y> \[durationMs\]/);
+});
+
+test('help long-press resolves to longpress help and skips daemon dispatch', async () => {
+  const result = await runCliCapture(['help', 'long-press']);
+  assert.equal(result.code, 0);
+  assert.equal(result.daemonCalls, 0);
+  assert.match(result.stdout, /Usage:\n agent-device longpress <x> <y> \[durationMs\]/);
+  assert.doesNotMatch(result.stdout, /agent-device long-press/);
+});
+
 test('appstate --help prints command help and skips daemon dispatch', async () => {
   const result = await runCliCapture(['appstate', '--help']);
   assert.equal(result.code, 0);
2 changes: 1 addition & 1 deletion src/core/__tests__/capabilities.test.ts
@@ -59,7 +59,7 @@ test('core commands support iOS simulator, iOS device, and Android', () => {
 'focus',
 'get',
 'home',
-'long-press',
+'longpress',
 'open',
 'press',
 'screenshot',
2 changes: 1 addition & 1 deletion src/core/capabilities.ts
@@ -28,7 +28,7 @@ const COMMAND_CAPABILITY_MATRIX: Record<string, CommandCapability> = {
   get: { ios: { simulator: true, device: true }, android: { emulator: true, device: true, unknown: true } },
   is: { ios: { simulator: true, device: true }, android: { emulator: true, device: true, unknown: true } },
   home: { ios: { simulator: true, device: true }, android: { emulator: true, device: true, unknown: true } },
-  'long-press': { ios: { simulator: true, device: true }, android: { emulator: true, device: true, unknown: true } },
+  longpress: { ios: { simulator: true, device: true }, android: { emulator: true, device: true, unknown: true } },
   open: { ios: { simulator: true, device: true }, android: { emulator: true, device: true, unknown: true } },
   reinstall: { ios: { simulator: true, device: true }, android: { emulator: true, device: true, unknown: true } },
   press: { ios: { simulator: true, device: true }, android: { emulator: true, device: true, unknown: true } },
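Because the matrix is keyed by canonical command name, renaming the key means a lookup for the legacy spelling only succeeds if the alias has been normalized first. The following is a minimal sketch of such a lookup, not the repository's implementation: the `CommandCapability` shape is inferred from the entries in the hunk above, and `isSupported`, `Target`, and the `pinch` row are hypothetical names and data added for illustration.

```typescript
type IosKind = 'simulator' | 'device';
type AndroidKind = 'emulator' | 'device' | 'unknown';

// Assumed shape, inferred from the matrix entries shown in the diff.
interface CommandCapability {
  ios?: Partial<Record<IosKind, boolean>>;
  android?: Partial<Record<AndroidKind, boolean>>;
}

const MATRIX: Record<string, CommandCapability> = {
  // Copied from the diff.
  longpress: {
    ios: { simulator: true, device: true },
    android: { emulator: true, device: true, unknown: true },
  },
  // Illustrative row only: the docs describe pinch as iOS simulator-only.
  pinch: { ios: { simulator: true } },
};

// Hypothetical target descriptor; the real code may model this differently.
type Target =
  | { platform: 'ios'; kind: IosKind }
  | { platform: 'android'; kind: AndroidKind };

// Returns true only when the matrix explicitly marks the target as supported.
function isSupported(command: string, target: Target): boolean {
  const capability = MATRIX[command];
  if (!capability) return false; // unknown or un-normalized command name
  if (target.platform === 'ios') return capability.ios?.[target.kind] === true;
  return capability.android?.[target.kind] === true;
}
```

In this sketch, `isSupported('long-press', ...)` returns `false`, which is why the alias is resolved at parse time rather than at lookup time.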
4 changes: 2 additions & 2 deletions src/core/dispatch.ts
@@ -270,12 +270,12 @@ export async function dispatchCommand(
         pattern,
       };
     }
-    case 'long-press': {
+    case 'longpress': {
       const x = Number(positionals[0]);
       const y = Number(positionals[1]);
       const durationMs = positionals[2] ? Number(positionals[2]) : undefined;
       if (Number.isNaN(x) || Number.isNaN(y)) {
-        throw new AppError('INVALID_ARGS', 'long-press requires x y [durationMs]');
+        throw new AppError('INVALID_ARGS', 'longpress requires x y [durationMs]');
       }
       await interactor.longPress(x, y, durationMs);
       return { x, y, durationMs };
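In the hunk above, `x` and `y` are checked for `NaN` but `durationMs` is not, so unless it is validated elsewhere, an input such as `longpress 300 500 abc` would reach `interactor.longPress` with `NaN`. The sketch below shows one way the positionals could be parsed with the duration checked as well. `parseLongPressArgs` and `LongPressArgs` are hypothetical names, a plain `Error` stands in for the repository's `AppError`, and rejecting negative durations is an assumption rather than documented behavior.

```typescript
interface LongPressArgs {
  x: number;
  y: number;
  durationMs?: number;
}

// Hypothetical helper, not part of the PR: parses `longpress <x> <y> [durationMs]`.
function parseLongPressArgs(positionals: string[]): LongPressArgs {
  const x = Number(positionals[0]);
  const y = Number(positionals[1]);
  // Number(undefined) is NaN, so missing coordinates are rejected here too.
  if (!Number.isFinite(x) || !Number.isFinite(y)) {
    throw new Error('longpress requires x y [durationMs]');
  }

  const rawDuration = positionals[2];
  if (rawDuration === undefined || rawDuration === '') {
    return { x, y }; // duration omitted: let the platform default apply
  }

  const durationMs = Number(rawDuration);
  // Assumption: a duration must be a finite, non-negative number.
  if (!Number.isFinite(durationMs) || durationMs < 0) {
    throw new Error(`longpress durationMs must be a non-negative number, got "${rawDuration}"`);
  }
  return { x, y, durationMs };
}
```

Calling `parseLongPressArgs(['300', '500', '800'])` yields `{ x: 300, y: 500, durationMs: 800 }`, while a non-numeric third positional throws instead of propagating `NaN`.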
25 changes: 25 additions & 0 deletions src/utils/__tests__/args.test.ts
@@ -140,6 +140,31 @@ test('parseArgs recognizes swipe positional + pattern flags', () => {
   assert.equal(parsed.flags.pattern, 'ping-pong');
 });

+test('parseArgs recognizes longpress command', () => {
+  const parsed = parseArgs(['longpress', '300', '500', '800'], { strictFlags: true });
+  assert.equal(parsed.command, 'longpress');
+  assert.deepEqual(parsed.positionals, ['300', '500', '800']);
+});
+
+test('parseArgs supports legacy long-press alias', () => {
+  const parsed = parseArgs(['long-press', '300', '500', '800'], { strictFlags: true });
+  assert.equal(parsed.command, 'longpress');
+  assert.deepEqual(parsed.positionals, ['300', '500', '800']);
+});
+
+test('usageForCommand resolves longpress help', () => {
+  const help = usageForCommand('longpress');
+  assert.equal(help === null, false);
+  assert.match(help ?? '', /agent-device longpress <x> <y> \[durationMs\]/);
+});
+
+test('usageForCommand supports legacy long-press alias', () => {
+  const help = usageForCommand('long-press');
+  assert.equal(help === null, false);
+  assert.match(help ?? '', /agent-device longpress <x> <y> \[durationMs\]/);
+  assert.doesNotMatch(help ?? '', /agent-device long-press/);
+});
+
 test('parseArgs rejects invalid swipe pattern', () => {
   assert.throws(
     () => parseArgs(['swipe', '0', '0', '10', '10', '--pattern', 'diagonal']),
11 changes: 8 additions & 3 deletions src/utils/args.ts
@@ -43,14 +43,14 @@ export function parseArgs(argv: string[], options?: ParseArgsOptions): ParsedArg
       continue;
     }
     if (!parseFlags) {
-      if (!command) command = arg;
+      if (!command) command = normalizeCommandAlias(arg);
       else positionals.push(arg);
       continue;
     }
     const isLongFlag = arg.startsWith('--');
     const isShortFlag = arg.startsWith('-') && arg.length > 1;
     if (!isLongFlag && !isShortFlag) {
-      if (!command) command = arg;
+      if (!command) command = normalizeCommandAlias(arg);
       else positionals.push(arg);
       continue;
     }
@@ -246,5 +246,10 @@ export function usage(): string {
 }

 export function usageForCommand(command: string): string | null {
-  return buildCommandUsageText(command);
+  return buildCommandUsageText(normalizeCommandAlias(command));
 }
+
+function normalizeCommandAlias(command: string): string {
+  if (command === 'long-press') return 'longpress';
+  return command;
+}
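The PR hard-codes a single alias in `normalizeCommandAlias`. If more legacy spellings ever need to be supported, a lookup table is one possible shape. The sketch below is an illustration under that assumption, not a proposed change: `COMMAND_ALIASES` is a hypothetical name, and the only entry taken from the diff is `long-press` to `longpress`.

```typescript
// Hypothetical alias table; only this one mapping appears in the PR.
const COMMAND_ALIASES: ReadonlyMap<string, string> = new Map([
  ['long-press', 'longpress'],
]);

// Returns the canonical command name, or the input unchanged when no alias exists.
function normalizeCommandAlias(command: string): string {
  return COMMAND_ALIASES.get(command) ?? command;
}

console.log(normalizeCommandAlias('long-press')); // prints "longpress"
console.log(normalizeCommandAlias('press')); // prints "press"
```

A `Map` is used rather than a plain object so that inherited keys such as `toString` can never be mistaken for aliases.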
4 changes: 2 additions & 2 deletions src/utils/command-schema.ts
@@ -449,8 +449,8 @@ export const COMMAND_SCHEMAS: Record<string, CommandSchema> = {
     allowsExtraPositionals: true,
     allowedFlags: ['count', 'intervalMs', 'holdMs', 'jitterPx', 'doubleTap', ...SELECTOR_SNAPSHOT_FLAGS],
   },
-  'long-press': {
-    description: 'Long press (where supported)',
+  longpress: {
+    description: 'Long press by coordinates (iOS and Android)',
     positionalArgs: ['x', 'y', 'durationMs?'],
     allowedFlags: [],
   },
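The tests in this PR expect the help output to contain `agent-device longpress <x> <y> [durationMs]`, which lines up with the schema's `positionalArgs` if a trailing `?` marks an optional argument. The sketch below shows how such a usage line could be derived from a schema entry. The `?` convention is an inference from the diff, `usageLine` is a hypothetical function, and the `CommandSchema` interface here includes only the fields visible in the hunk; the repository's `buildCommandUsageText` may work differently.

```typescript
// Reduced schema shape: only the fields visible in the diff.
interface CommandSchema {
  description: string;
  positionalArgs: string[];
  allowedFlags: string[];
}

// Copied from the new side of the hunk above.
const longpressSchema: CommandSchema = {
  description: 'Long press by coordinates (iOS and Android)',
  positionalArgs: ['x', 'y', 'durationMs?'],
  allowedFlags: [],
};

// Hypothetical renderer: 'name' becomes '<name>', 'name?' becomes '[name]'.
function usageLine(command: string, schema: CommandSchema): string {
  const args = schema.positionalArgs.map((arg) =>
    arg.endsWith('?') ? `[${arg.slice(0, -1)}]` : `<${arg}>`,
  );
  return ['agent-device', command, ...args].join(' ');
}

console.log(usageLine('longpress', longpressSchema));
// prints "agent-device longpress <x> <y> [durationMs]"
```

Because the command name is passed in after alias normalization, the rendered text never mentions `long-press`, which is what the `doesNotMatch` assertions in the tests check.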
5 changes: 3 additions & 2 deletions website/docs/docs/commands.md
@@ -54,7 +54,7 @@ agent-device press 300 500 --count 12 --interval-ms 45
 agent-device press 300 500 --count 6 --hold-ms 120 --interval-ms 30 --jitter-px 2
 agent-device swipe 540 1500 540 500 120
 agent-device swipe 540 1500 540 500 120 --count 8 --pause-ms 30 --pattern ping-pong
-agent-device long-press 300 500 800
+agent-device longpress 300 500 800
 agent-device scroll down 0.5
 agent-device pinch 2.0 # zoom in 2x (iOS simulator)
 agent-device pinch 0.5 200 400 # zoom out at coordinates (iOS simulator)
@@ -63,7 +63,8 @@ agent-device pinch 0.5 200 400 # zoom out at coordinates (iOS simulator)
 `fill` clears then types. `type` does not clear.
 On Android, `fill` also verifies text and performs one clear-and-retry pass on mismatch.
 `swipe` accepts an optional `durationMs` argument (default `250ms`, range `16..10000`).
-On iOS, swipe timing uses a safe normalized duration to avoid long-press side effects.
+On iOS, swipe timing uses a safe normalized duration to avoid longpress side effects.
+`longpress` is supported on iOS and Android.
 `pinch` is iOS simulator-only.

 ## Find (semantic)
2 changes: 1 addition & 1 deletion website/docs/docs/introduction.md
@@ -20,7 +20,7 @@ If you know `agent-browser`, this is the mobile-native counterpart for iOS/Andro

 ## Platform support highlights

-- iOS core runner commands: `snapshot`, `wait`, `click`, `fill`, `get`, `is`, `find`, `press`, `long-press`, `focus`, `type`, `scroll`, `scrollintoview`, `back`, `home`, `app-switcher`, `open` (app), `close`, `screenshot`, `apps`, `appstate`.
+- iOS core runner commands: `snapshot`, `wait`, `click`, `fill`, `get`, `is`, `find`, `press`, `longpress`, `focus`, `type`, `scroll`, `scrollintoview`, `back`, `home`, `app-switcher`, `open` (app), `close`, `screenshot`, `apps`, `appstate`.
 - iOS `appstate` is session-scoped on the selected target device.
 - iOS simulator-only: `alert`, `pinch`, `record`, `reinstall`, `settings`.
 - Android support remains unchanged.