Commit 81a71a4

fix: add dblclick alias routing and simplify click-like schema
1 parent 089cf96 commit 81a71a4

8 files changed

Lines changed: 110 additions & 19 deletions

README.md

Lines changed: 10 additions & 6 deletions
Original file line number | Diff line number | Diff line change
@@ -15,7 +15,7 @@ The project is in early development and considered experimental. Pull requests a
1515
## Features
1616
- Platforms: iOS (simulator + physical device core automation) and Android (emulator + device).
1717
- Core commands: `open`, `back`, `home`, `app-switcher`, `press`, `long-press`, `focus`, `type`, `fill`, `scroll`, `scrollintoview`, `wait`, `alert`, `screenshot`, `close`, `reinstall`.
18-
- Inspection commands: `snapshot` (accessibility tree), `appstate`, `apps`, `devices`.
18+
- Inspection commands: `snapshot` (accessibility tree), `diff snapshot` (snapshot diffs), `appstate`, `apps`, `devices`.
1919
- Device tooling: `adb` (Android), `simctl`/`devicectl` (iOS via Xcode).
2020
- Minimal dependencies; TypeScript executed directly on Node 22+ (no build step).
2121

@@ -34,13 +34,14 @@ npx agent-device open SampleApp
3434
## Quick Start
3535

3636
Use refs for agent-driven exploration and normal automation flows.
37-
Use `press` as the canonical tap command; `click` is an equivalent alias.
37+
Use `press` as the canonical tap command; `click` is an equivalent alias; `dblclick` is an alias for `click --double-tap`.
3838

3939
```bash
4040
agent-device open Contacts --platform ios # creates session on iOS Simulator
4141
agent-device snapshot
4242
agent-device press @e5
4343
agent-device fill @e6 "John"
44+
agent-device diff snapshot
4445
agent-device fill @e7 "Doe"
4546
agent-device press @e3
4647
agent-device close
@@ -105,6 +106,7 @@ agent-device open SampleApp
105106
agent-device snapshot
106107
agent-device press @e7
107108
agent-device fill @e8 "hello"
109+
agent-device diff snapshot
108110
agent-device close SampleApp
109111
```
110112

@@ -122,6 +124,7 @@ Coordinates:
122124
- X increases to the right, Y increases downward.
123125
- `press` is the canonical tap command.
124126
- `click` is an equivalent alias and accepts the same targets (`x y`, `@ref`, selector) and flags.
127+
- `dblclick` is shorthand for `click --double-tap`.
125128

126129
Gesture series examples:
127130

@@ -135,8 +138,8 @@ agent-device swipe 540 1500 540 500 120 --count 8 --pause-ms 30 --pattern ping-p
135138
## Command Index
136139
- `boot`, `open`, `close`, `reinstall`, `home`, `back`, `app-switcher`
137140
- `batch`
138-
- `snapshot`, `find`, `get`
139-
- `press` (alias: `click`), `focus`, `type`, `fill`, `long-press`, `swipe`, `scroll`, `scrollintoview`, `pinch`, `is`
141+
- `snapshot`, `diff`, `find`, `get`
142+
- `press` (aliases: `click`, `dblclick`), `focus`, `type`, `fill`, `long-press`, `swipe`, `scroll`, `scrollintoview`, `pinch`, `is`
140143
- `alert`, `wait`, `screenshot`
141144
- `trace start`, `trace stop`
142145
- `settings wifi|airplane|location on|off`
@@ -149,6 +152,7 @@ Notes:
149152
- iOS snapshots use XCTest on simulators and physical devices.
150153
- Scope snapshots with `-s "<label>"` or `-s @ref`.
151154
- If XCTest returns 0 nodes (e.g., foreground app changed), agent-device fails explicitly.
155+
- `diff snapshot` compares the current snapshot against the previous snapshot in the same session and then updates the baseline.
152156

153157
Flags:
154158
- `--version, -V` print version and exit
@@ -162,7 +166,7 @@ Flags:
162166
- `--interval-ms <ms>` delay between `press` iterations
163167
- `--hold-ms <ms>` hold duration per `press` iteration
164168
- `--jitter-px <n>` deterministic coordinate jitter for `press`
165-
- `--double-tap` use a double-tap gesture per `press`/`click` iteration (cannot be combined with `--hold-ms` or `--jitter-px`)
169+
- `--double-tap` use a double-tap gesture per `press`/`click`/`dblclick` iteration (cannot be combined with `--hold-ms` or `--jitter-px`)
166170
- `--pause-ms <ms>` delay between `swipe` iterations
167171
- `--pattern one-way|ping-pong` repeat pattern for `swipe`
168172
- `--debug` (alias: `--verbose`) for debug diagnostics + daemon/runner logs
@@ -235,7 +239,7 @@ Replay update:
235239
- `replay <path>` runs deterministic replay from `.ad` scripts.
236240
- `replay -u <path>` attempts selector updates on failures and atomically rewrites the same file.
237241
- Refs are the default/core mechanism for interactive agent flows.
238-
- Update targets: `click`, `fill`, `get`, `is`, `wait`.
242+
- Update targets: `click`, `dblclick`, `fill`, `get`, `is`, `wait`.
239243
- Selector matching is a replay-update internal: replay parses `.ad` lines into actions, tries them, snapshots on failure, resolves a better selector, then rewrites that failing line.
240244

241245
Update examples:

skills/agent-device/SKILL.md

Lines changed: 6 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -29,7 +29,7 @@ npx -y agent-device
2929

3030
1. Open app or deep link: `open [app|url] [url]` (`open` handles target selection + boot/activation in the normal flow)
3131
2. Snapshot: `snapshot` to get refs from accessibility tree
32-
3. Interact using refs (`press @ref`, `fill @ref "text"`; `click` is an alias of `press`)
32+
3. Interact using refs (`press @ref`, `fill @ref "text"`; `click` is an alias of `press`, `dblclick` is an alias of `click --double-tap`)
3333
4. Re-snapshot after navigation/UI changes
3434
5. Close session when done
3535

@@ -115,7 +115,8 @@ agent-device appstate
115115
### Interactions (use @refs from snapshot)
116116

117117
```bash
118-
agent-device press @e1 # Canonical tap command (`click` is an alias)
118+
agent-device press @e1 # Canonical tap command (`click` is an alias, `dblclick` is a double-tap alias)
119+
agent-device dblclick @e1 # Equivalent to: click @e1 --double-tap
119120
agent-device focus @e2
120121
agent-device fill @e2 "text" # Clear then type (Android: verifies value and retries once on mismatch)
121122
agent-device type "text" # Type into focused field without clearing
@@ -230,9 +231,9 @@ agent-device apps --platform android --user-installed
230231

231232
## Best practices
232233

233-
- `press` is the canonical tap command; `click` is an alias with the same behavior.
234-
- `press` (and `click`) accepts `x y`, `@ref`, and selector targets.
235-
- `press`/`click` support gesture series controls: `--count`, `--interval-ms`, `--hold-ms`, `--jitter-px`, `--double-tap`.
234+
- `press` is the canonical tap command; `click` is an alias with the same behavior; `dblclick` is shorthand for `click --double-tap`.
235+
- `press`, `click`, and `dblclick` accept `x y`, `@ref`, and selector targets.
236+
- `press`/`click`/`dblclick` support gesture series controls: `--count`, `--interval-ms`, `--hold-ms`, `--jitter-px`, `--double-tap`.
236237
- `--double-tap` cannot be combined with `--hold-ms` or `--jitter-px`.
237238
- `swipe` supports coordinate + timing controls and repeat patterns: `swipe x1 y1 x2 y2 [durationMs] --count --pause-ms --pattern`.
238239
- `swipe` timing is platform-safe: Android uses requested duration; iOS uses normalized safe timing to avoid long-press side effects.

src/cli.ts

Lines changed: 7 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
import { parseArgs, toDaemonFlags, usage, usageForCommand } from './utils/args.ts';
22
import { asAppError, AppError, normalizeError } from './utils/errors.ts';
3-
import { formatSnapshotText, printHumanError, printJson } from './utils/output.ts';
3+
import { formatSnapshotDiffText, formatSnapshotText, printHumanError, printJson } from './utils/output.ts';
44
import { readVersion } from './utils/version.ts';
55
import { pathToFileURL } from 'node:url';
66
import { sendToDaemon } from './daemon-client.ts';
@@ -189,6 +189,11 @@ export async function runCli(argv: string[], deps: CliDeps = DEFAULT_CLI_DEPS):
189189
if (logTailStopper) logTailStopper();
190190
return;
191191
}
192+
if (command === 'diff' && positionals[0]?.toLowerCase() === 'snapshot') {
193+
process.stdout.write(formatSnapshotDiffText((response.data ?? {}) as Record<string, unknown>));
194+
if (logTailStopper) logTailStopper();
195+
return;
196+
}
192197
if (command === 'get') {
193198
const sub = positionals[0];
194199
if (sub === 'text') {
@@ -235,7 +240,7 @@ export async function runCli(argv: string[], deps: CliDeps = DEFAULT_CLI_DEPS):
235240
if (logTailStopper) logTailStopper();
236241
return;
237242
}
238-
if (command === 'click' || command === 'press') {
243+
if (command === 'click' || command === 'press' || command === 'dblclick') {
239244
const ref = (response.data as any)?.ref ?? '';
240245
const x = (response.data as any)?.x;
241246
const y = (response.data as any)?.y;
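
Note on `diff snapshot`: the README hunk above says the command compares the current snapshot against the previous one in the same session and then updates the baseline. This commit does not show that comparison logic, so the following is only a minimal sketch of the compare-then-update idea. `SnapshotBaseline`, `SnapshotDiff`, and the line-based representation are hypothetical names and assumptions for illustration, not the project's API.

```typescript
// Hypothetical sketch: none of these names come from the commit.
// A snapshot is modelled as a list of text lines, which is an assumption.
type SnapshotLines = readonly string[];

interface SnapshotDiff {
  added: string[];
  removed: string[];
}

class SnapshotBaseline {
  private previous: SnapshotLines | undefined;

  // Compare `current` against the stored baseline, then replace the baseline
  // so the next call diffs against `current`.
  diff(current: SnapshotLines): SnapshotDiff {
    const before = new Set(this.previous ?? []);
    const after = new Set(current);
    const result: SnapshotDiff = {
      added: current.filter((line) => !before.has(line)),
      removed: (this.previous ?? []).filter((line) => !after.has(line)),
    };
    this.previous = [...current];
    return result;
  }
}

const baseline = new SnapshotBaseline();
const first = baseline.diff(['@e1 Button "Save"']);
const second = baseline.diff(['@e1 Button "Save"', '@e2 Text "Saved"']);
const third = baseline.diff(['@e2 Text "Saved"']);
```

In this sketch the first call has no baseline, so every line counts as added; how the real command treats a missing baseline is not visible in this diff. The set-based comparison also ignores duplicate lines and ordering, which a real accessibility-tree diff would likely need to handle.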

src/daemon.ts

Lines changed: 9 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -203,8 +203,15 @@ function finalizeDaemonResponse(response: DaemonResponse): DaemonResponse {
203203
}
204204

205205
function normalizeAliasedCommands(req: DaemonRequest): DaemonRequest {
206-
if (req.command !== 'click') return req;
207-
return { ...req, command: 'press' };
206+
if (req.command === 'dblclick') {
207+
return {
208+
...req,
209+
command: 'press',
210+
flags: { ...(req.flags ?? {}), doubleTap: true },
211+
};
212+
}
213+
if (req.command === 'click') return { ...req, command: 'press' };
214+
return req;
208215
}
209216

210217
function writeInfo(port: number): void {
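
Note on alias routing: the hunk above rewrites `dblclick` to `press` with `doubleTap: true`, while the parser test later in this commit asserts that `parseArgs` leaves `flags.doubleTap` undefined for `dblclick`. The flag is therefore injected at the daemon boundary, not by the parser. The sketch below reproduces the function body from the diff against a reduced request type; `RequestSketch` is a hypothetical stand-in, and the real `DaemonRequest` carries more fields (for example `token` and `session`).

```typescript
// Reduced stand-in for DaemonRequest (assumption: only these fields matter here).
interface RequestSketch {
  command: string;
  positionals: string[];
  flags?: Record<string, unknown>;
}

// Same logic as the new normalizeAliasedCommands in src/daemon.ts.
function normalizeAliasedCommands(req: RequestSketch): RequestSketch {
  if (req.command === 'dblclick') {
    return {
      ...req,
      command: 'press',
      // Existing flags are preserved; doubleTap is forced on.
      flags: { ...(req.flags ?? {}), doubleTap: true },
    };
  }
  if (req.command === 'click') return { ...req, command: 'press' };
  return req;
}

// What the parser would hand over for `dblclick @e5 --count 2`.
const parsed: RequestSketch = { command: 'dblclick', positionals: ['@e5'], flags: { count: 2 } };
const routed = normalizeAliasedCommands(parsed);

// A plain click is renamed but gains no flags.
const clickRouted = normalizeAliasedCommands({ command: 'click', positionals: ['@e5'] });
```

Because the spread places `doubleTap: true` last, an explicit `doubleTap: false` on a `dblclick` request would be overridden.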

src/daemon/handlers/__tests__/session.test.ts

Lines changed: 32 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1083,6 +1083,38 @@ test('replay parses press series flags and passes them to invoke', async () => {
10831083
assert.equal(invoked[0]?.flags?.doubleTap, true);
10841084
});
10851085

1086+
test('replay parses dblclick alias and passes click-series flags to invoke', async () => {
1087+
const sessionStore = makeSessionStore();
1088+
const replayRoot = fs.mkdtempSync(path.join(os.tmpdir(), 'agent-device-replay-dblclick-series-'));
1089+
const replayPath = path.join(replayRoot, 'dblclick-series.ad');
1090+
fs.writeFileSync(replayPath, 'dblclick @e5 --count 2\n');
1091+
1092+
const invoked: DaemonRequest[] = [];
1093+
const response = await handleSessionCommands({
1094+
req: {
1095+
token: 't',
1096+
session: 'default',
1097+
command: 'replay',
1098+
positionals: [replayPath],
1099+
flags: {},
1100+
},
1101+
sessionName: 'default',
1102+
logPath: path.join(os.tmpdir(), 'daemon.log'),
1103+
sessionStore,
1104+
invoke: async (req) => {
1105+
invoked.push(req);
1106+
return { ok: true, data: {} };
1107+
},
1108+
});
1109+
1110+
assert.ok(response);
1111+
assert.equal(response?.ok, true);
1112+
assert.equal(invoked.length, 1);
1113+
assert.equal(invoked[0]?.command, 'dblclick');
1114+
assert.deepEqual(invoked[0]?.positionals, ['@e5']);
1115+
assert.equal(invoked[0]?.flags?.count, 2);
1116+
});
1117+
10861118
test('replay inherits parent device selectors for each invoked step', async () => {
10871119
const sessionStore = makeSessionStore();
10881120
const replayRoot = fs.mkdtempSync(path.join(os.tmpdir(), 'agent-device-replay-parent-selectors-'));

src/daemon/script-utils.ts

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -14,8 +14,8 @@ const SWIPE_NUMERIC_FLAG_MAP = new Map<string, 'count' | 'pauseMs'>([
1414
['--pause-ms', 'pauseMs'],
1515
]);
1616

17-
export function isClickLikeCommand(command: string): command is 'click' | 'press' {
18-
return command === 'click' || command === 'press';
17+
export function isClickLikeCommand(command: string): command is 'click' | 'press' | 'dblclick' {
18+
return command === 'click' || command === 'press' || command === 'dblclick';
1919
}
2020

2121
export function formatScriptArg(value: string): string {

src/utils/__tests__/args.test.ts

Lines changed: 17 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -110,6 +110,13 @@ test('parseArgs recognizes click series flags', () => {
110110
assert.equal(parsed.flags.intervalMs, 10);
111111
});
112112

113+
test('parseArgs treats dblclick alias as parser-level command without implicit defaults', () => {
114+
const parsed = parseArgs(['dblclick', '@e5'], { strictFlags: true });
115+
assert.equal(parsed.command, 'dblclick');
116+
assert.deepEqual(parsed.positionals, ['@e5']);
117+
assert.equal(parsed.flags.doubleTap, undefined);
118+
});
119+
113120
test('parseArgs recognizes double-tap flag for repeated press', () => {
114121
const parsed = parseArgs(['press', '201', '545', '--count', '5', '--double-tap'], { strictFlags: true });
115122
assert.equal(parsed.command, 'press');
@@ -149,6 +156,7 @@ test('parseArgs rejects invalid swipe pattern', () => {
149156

150157
test('usage includes --relaunch flag', () => {
151158
assert.match(usage(), /--relaunch/);
159+
assert.match(usage(), /dblclick <x y\|@ref\|selector>/);
152160
assert.match(usage(), /--save-script \[path\]/);
153161
assert.match(usage(), /pinch <scale> \[x\] \[y\]/);
154162
assert.doesNotMatch(usage(), /--metadata/);
@@ -202,6 +210,14 @@ test('snapshot command accepts command-specific flags', () => {
202210
assert.equal(parsed.flags.snapshotScope, 'Login');
203211
});
204212

213+
test('diff snapshot command accepts snapshot flags', () => {
214+
const parsed = parseArgs(['diff', 'snapshot', '--depth', '2', '--raw'], { strictFlags: true });
215+
assert.equal(parsed.command, 'diff');
216+
assert.deepEqual(parsed.positionals, ['snapshot']);
217+
assert.equal(parsed.flags.snapshotDepth, 2);
218+
assert.equal(parsed.flags.snapshotRaw, true);
219+
});
220+
205221
test('unknown short flags are rejected', () => {
206222
assert.throws(
207223
() => parseArgs(['press', '10', '20', '-x'], { strictFlags: true }),
@@ -266,6 +282,7 @@ test('invalid range errors are deterministic', () => {
266282
test('usage includes swipe and press series options', () => {
267283
const help = usage();
268284
assert.match(help, /swipe <x1> <y1> <x2> <y2>/);
285+
assert.match(help, /diff snapshot/);
269286
assert.match(help, /--pattern one-way\|ping-pong/);
270287
assert.match(help, /--interval-ms/);
271288
assert.match(help, /settings <wifi\|airplane\|location\|faceid>/);

src/utils/command-schema.ts

Lines changed: 27 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -71,12 +71,23 @@ const SNAPSHOT_FLAGS = [
7171
'snapshotRaw',
7272
] as const satisfies readonly FlagKey[];
7373

74+
const DIFF_SNAPSHOT_FLAGS = [...SNAPSHOT_FLAGS] as const satisfies readonly FlagKey[];
75+
7476
const SELECTOR_SNAPSHOT_FLAGS = [
7577
'snapshotDepth',
7678
'snapshotScope',
7779
'snapshotRaw',
7880
] as const satisfies readonly FlagKey[];
7981

82+
const CLICK_LIKE_FLAGS = [
83+
'count',
84+
'intervalMs',
85+
'holdMs',
86+
'jitterPx',
87+
'doubleTap',
88+
...SELECTOR_SNAPSHOT_FLAGS,
89+
] as const satisfies readonly FlagKey[];
90+
8091
const FIND_SNAPSHOT_FLAGS = ['snapshotDepth', 'snapshotRaw'] as const satisfies readonly FlagKey[];
8192

8293
export const FLAG_DEFINITIONS: readonly FlagDefinition[] = [
@@ -370,6 +381,12 @@ export const COMMAND_SCHEMAS: Record<string, CommandSchema> = {
370381
positionalArgs: [],
371382
allowedFlags: [...SNAPSHOT_FLAGS],
372383
},
384+
diff: {
385+
usageOverride: 'diff snapshot',
386+
description: 'Compare current snapshot against previous session snapshot',
387+
positionalArgs: ['kind'],
388+
allowedFlags: [...DIFF_SNAPSHOT_FLAGS],
389+
},
373390
devices: {
374391
description: 'List available devices',
375392
positionalArgs: [],
@@ -421,7 +438,15 @@ export const COMMAND_SCHEMAS: Record<string, CommandSchema> = {
421438
description: 'Tap/click by coordinates, snapshot ref, or selector',
422439
positionalArgs: ['target'],
423440
allowsExtraPositionals: true,
424-
allowedFlags: ['count', 'intervalMs', 'holdMs', 'jitterPx', 'doubleTap', ...SELECTOR_SNAPSHOT_FLAGS],
441+
allowedFlags: [...CLICK_LIKE_FLAGS],
442+
},
443+
dblclick: {
444+
usageOverride: 'dblclick <x y|@ref|selector>',
445+
description: 'Alias for click --double-tap',
446+
positionalArgs: ['target'],
447+
allowsExtraPositionals: true,
448+
allowedFlags: [...CLICK_LIKE_FLAGS],
449+
skipCapabilityCheck: true,
425450
},
426451
get: {
427452
usageOverride: 'get text|attrs <@ref|selector>',
@@ -447,7 +472,7 @@ export const COMMAND_SCHEMAS: Record<string, CommandSchema> = {
447472
description: 'Tap/press by coordinates, snapshot ref, or selector (supports repeated series)',
448473
positionalArgs: ['targetOrX', 'y?'],
449474
allowsExtraPositionals: true,
450-
allowedFlags: ['count', 'intervalMs', 'holdMs', 'jitterPx', 'doubleTap', ...SELECTOR_SNAPSHOT_FLAGS],
475+
allowedFlags: [...CLICK_LIKE_FLAGS],
451476
},
452477
'long-press': {
453478
description: 'Long press (where supported)',
