Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
51 changes: 51 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -45,6 +45,52 @@ agent-device click @e3
agent-device close
```

## Fast batching (JSON steps)

Use `batch` to execute multiple commands in a single daemon request.

CLI examples:

```bash
agent-device batch \
--session sim \
--platform ios \
--udid 00008150-001849640CF8401C \
--steps-file /tmp/batch-steps.json \
--json
```

Small inline payloads are also supported:

```bash
agent-device batch --steps '[{"command":"open","positionals":["settings"]},{"command":"wait","positionals":["100"]}]'
```

Batch payload format:

```json
[
{ "command": "open", "positionals": ["settings"], "flags": {} },
{ "command": "wait", "positionals": ["label=\"Privacy & Security\"", "3000"], "flags": {} },
{ "command": "click", "positionals": ["label=\"Privacy & Security\""], "flags": {} },
{ "command": "get", "positionals": ["text", "label=\"Tracking\""], "flags": {} }
]
```

Batch response includes:

- `total`, `executed`, `totalDurationMs`
- per-step `results[]` with `durationMs`
- failure context with failing `step` and `partialResults`

Agent usage guidelines:

- Keep each batch to one screen-local workflow.
- Add sync guards (`wait`, `is exists`) after mutating steps (`open`, `click`, `fill`, `swipe`).
- Treat refs/snapshot assumptions as stale after UI mutations.
- Prefer `--steps-file` over inline JSON for reliability.
- Keep batches moderate (about 5-20 steps) and stop on first error.

## CLI Usage

```bash
Expand Down Expand Up @@ -84,6 +130,7 @@ agent-device swipe 540 1500 540 500 120 --count 8 --pause-ms 30 --pattern ping-p

## Command Index
- `boot`, `open`, `close`, `reinstall`, `home`, `back`, `app-switcher`
- `batch`
- `snapshot`, `find`, `get`
- `click`, `focus`, `type`, `fill`, `press`, `long-press`, `swipe`, `scroll`, `scrollintoview`, `pinch`, `is`
- `alert`, `wait`, `screenshot`
Expand Down Expand Up @@ -114,6 +161,10 @@ Flags:
- `--pattern one-way|ping-pong` repeat pattern for `swipe`
- `--verbose` for daemon and runner logs
- `--json` for structured output
- `--steps <json>` batch: JSON array of steps
- `--steps-file <path>` batch: read step JSON from file
- `--on-error stop` batch: stop when a step fails
- `--max-steps <n>` batch: max allowed steps per request

Pinch:
- `pinch` is supported on iOS simulators.
Expand Down
45 changes: 45 additions & 0 deletions skills/agent-device/SKILL.md
Original file line number Diff line number Diff line change
Expand Up @@ -155,6 +155,50 @@ agent-device replay -u ./session.ad # Update selector drift and rewrite .ad sc
`--save-script` path is a file path; parent directories are created automatically.
For ambiguous bare values, use `--save-script=workflow.ad` or `./workflow.ad`.

### Fast batching (JSON steps)

Use `batch` when an agent already has a known short sequence and wants fewer orchestration round trips.

```bash
agent-device batch \
--session sim \
--platform ios \
--udid 00008150-001849640CF8401C \
--steps-file /tmp/batch-steps.json \
--json
```

Inline JSON works for small payloads:

```bash
agent-device batch --steps '[{"command":"open","positionals":["settings"]},{"command":"wait","positionals":["100"]}]'
```

Step format:

```json
[
{ "command": "open", "positionals": ["settings"], "flags": {} },
{ "command": "wait", "positionals": ["label=\"Privacy & Security\"", "3000"], "flags": {} },
{ "command": "click", "positionals": ["label=\"Privacy & Security\""], "flags": {} },
{ "command": "get", "positionals": ["text", "label=\"Tracking\""], "flags": {} }
]
```

Batch best practices:

- Batch one screen-local flow at a time.
- Add sync guards (`wait`, `is exists`) after mutating steps (`open`, `click`, `fill`, `swipe`).
- Treat prior refs/snapshot assumptions as stale after UI mutations.
- Prefer `--steps-file` over inline JSON.
- Keep batches moderate (about 5-20 steps).
- Use failure context (`step`, `partialResults`) to replan from the failed step.

Stale accessibility tree note:

- Rapid mutations can outrun accessibility tree updates.
- Mitigate with explicit waits and phase splitting (navigate, verify/extract, cleanup).

### Trace logs (XCTest)

```bash
Expand Down Expand Up @@ -208,3 +252,4 @@ agent-device apps --platform android --user-installed
- [references/permissions.md](references/permissions.md)
- [references/video-recording.md](references/video-recording.md)
- [references/coordinate-system.md](references/coordinate-system.md)
- [references/batching.md](references/batching.md)
79 changes: 79 additions & 0 deletions skills/agent-device/references/batching.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,79 @@
# Batching

## When to use batch

- The agent already knows a short sequence of commands.
- Steps belong to one logical screen flow.
- You want one result object with per-step timing and failure context.

## When not to use batch

- Flows are unrelated and should be retried independently.
- The workflow is highly dynamic and requires replanning after each step.
- You need human approvals between steps.

## CLI patterns

From file:

```bash
agent-device batch --session sim --platform ios --steps-file /tmp/batch-steps.json --json
```

Inline (small payloads only):

```bash
agent-device batch --steps '[{"command":"open","positionals":["settings"]}]'
```

## Step payload contract

```json
[
{ "command": "open", "positionals": ["settings"], "flags": {} },
{ "command": "wait", "positionals": ["label=\"Privacy & Security\"", "3000"], "flags": {} },
{ "command": "click", "positionals": ["label=\"Privacy & Security\""], "flags": {} },
{ "command": "get", "positionals": ["text", "label=\"Tracking\""], "flags": {} }
]
```

Rules:

- `positionals` optional, defaults to `[]`.
- `flags` optional, defaults to `{}`.
- nested `batch` and `replay` are rejected.
- stop-on-first-error is the supported mode (`--on-error stop`).

## Response handling

Success includes:

- `total`, `executed`, `totalDurationMs`
- `results[]` entries with `step`, `command`, `durationMs`, and optional `data`

Failure includes:

- `details.step`
- `details.command`
- `details.executed`
- `details.partialResults`

Use these fields to replan from the first failing step.

## Common error categories and agent actions

- `INVALID_ARGS`: payload/step shape issue; fix payload and retry.
- `SESSION_NOT_FOUND`: open or select the correct session, then retry.
- `UNSUPPORTED_OPERATION`: switch command/target to supported operation.
- `AMBIGUOUS_MATCH`: refine selector/locator, then retry failed step.
- `COMMAND_FAILED`: add sync guard (`wait`, `is exists`) and retry from failed step.

## Reliability guardrails

- Add sync guards after mutating steps.
- Assume snapshot/ref drift after navigation.
- Keep batch size moderate (about 5-20 steps).
- Split long workflows into phases:
1. navigate
2. verify/extract
3. cleanup
117 changes: 117 additions & 0 deletions src/__tests__/cli-batch.test.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,117 @@
import test from 'node:test';
import assert from 'node:assert/strict';
import fs from 'node:fs';
import os from 'node:os';
import path from 'node:path';
import { runCli } from '../cli.ts';
import type { DaemonRequest, DaemonResponse } from '../daemon-client.ts';

class ExitSignal extends Error {
public readonly code: number;

constructor(code: number) {
super(`EXIT_${code}`);
this.code = code;
}
}

type RunResult = {
code: number | null;
stdout: string;
stderr: string;
calls: Omit<DaemonRequest, 'token'>[];
};

async function runCliCapture(
argv: string[],
): Promise<RunResult> {
let stdout = '';
let stderr = '';
let code: number | null = null;
const calls: Array<Omit<DaemonRequest, 'token'>> = [];

const originalExit = process.exit;
const originalStdoutWrite = process.stdout.write.bind(process.stdout);
const originalStderrWrite = process.stderr.write.bind(process.stderr);

(process as any).exit = ((nextCode?: number) => {
throw new ExitSignal(nextCode ?? 0);
}) as typeof process.exit;
(process.stdout as any).write = ((chunk: unknown) => {
stdout += String(chunk);
return true;
}) as typeof process.stdout.write;
(process.stderr as any).write = ((chunk: unknown) => {
stderr += String(chunk);
return true;
}) as typeof process.stderr.write;

const sendToDaemon = async (req: Omit<DaemonRequest, 'token'>): Promise<DaemonResponse> => {
calls.push(req);
return { ok: true, data: { total: 1, executed: 1, totalDurationMs: 1 } };
};

try {
await runCli(argv, { sendToDaemon });
} catch (error) {
if (error instanceof ExitSignal) code = error.code;
else throw error;
} finally {
process.exit = originalExit;
process.stdout.write = originalStdoutWrite;
process.stderr.write = originalStderrWrite;
}

return { code, stdout, stderr, calls };
}

test('batch --steps parses JSON and forwards batchSteps only', async () => {
const result = await runCliCapture([
'batch',
'--session',
'sim',
'--platform',
'ios',
'--steps',
'[{"command":"open","positionals":["settings"]}]',
'--json',
]);
assert.equal(result.code, null);
assert.equal(result.calls.length, 1);
const req = result.calls[0];
assert.equal(req.command, 'batch');
assert.equal(req.session, 'sim');
assert.equal(req.flags?.platform, 'ios');
assert.ok(Array.isArray(req.flags?.batchSteps));
assert.equal((req.flags?.batchSteps ?? [])[0]?.command, 'open');
assert.equal(Object.hasOwn(req.flags ?? {}, 'steps'), false);
});

test('batch --steps-file parses file payload', async () => {
const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'agent-device-batch-'));
const stepsPath = path.join(tmpDir, 'steps.json');
fs.writeFileSync(stepsPath, JSON.stringify([{ command: 'wait', positionals: ['100'] }]), 'utf8');
const result = await runCliCapture(['batch', '--steps-file', stepsPath, '--json']);
assert.equal(result.code, null);
assert.equal(result.calls.length, 1);
const req = result.calls[0];
assert.equal(req.command, 'batch');
assert.equal((req.flags?.batchSteps ?? [])[0]?.command, 'wait');
});

test('batch --steps-file returns clear error for missing file', async () => {
const result = await runCliCapture(['batch', '--steps-file', '/tmp/definitely-missing-batch-steps.json']);
assert.equal(result.code, 1);
assert.equal(result.calls.length, 0);
assert.match(result.stderr, /Failed to read --steps-file/);
});

test('batch --steps-file rejects invalid JSON payload', async () => {
const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'agent-device-batch-invalid-'));
const stepsPath = path.join(tmpDir, 'steps.json');
fs.writeFileSync(stepsPath, '{"command":"open"', 'utf8');
const result = await runCliCapture(['batch', '--steps-file', stepsPath]);
assert.equal(result.code, 1);
assert.equal(result.calls.length, 0);
assert.match(result.stderr, /Batch steps must be valid JSON/);
});
53 changes: 53 additions & 0 deletions src/cli.ts
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,8 @@ import { sendToDaemon } from './daemon-client.ts';
import fs from 'node:fs';
import os from 'node:os';
import path from 'node:path';
import type { BatchStep } from './core/dispatch.ts';
import { parseBatchStepsJson } from './core/batch.ts';

type CliDeps = {
sendToDaemon: typeof sendToDaemon;
Expand Down Expand Up @@ -59,6 +61,33 @@ export async function runCli(argv: string[], deps: CliDeps = DEFAULT_CLI_DEPS):
const sessionName = flags.session ?? process.env.AGENT_DEVICE_SESSION ?? 'default';
const logTailStopper = flags.verbose && !flags.json ? startDaemonLogTail() : null;
try {
if (command === 'batch') {
if (positionals.length > 0) {
throw new AppError('INVALID_ARGS', 'batch does not accept positional arguments.');
}
const batchSteps = readBatchSteps(flags);
const batchFlags = { ...daemonFlags, batchSteps };
delete (batchFlags as Record<string, unknown>).steps;
delete (batchFlags as Record<string, unknown>).stepsFile;

const response = await deps.sendToDaemon({
session: sessionName,
command: 'batch',
positionals,
flags: batchFlags,
});
if (!response.ok) {
throw new AppError(response.error.code as any, response.error.message, response.error.details);
}
if (flags.json) {
printJson({ success: true, data: response.data ?? {} });
} else {
renderBatchSummary(response.data ?? {});
}
if (logTailStopper) logTailStopper();
return;
}

if (command === 'session') {
const sub = positionals[0] ?? 'list';
if (sub !== 'list') {
Expand Down Expand Up @@ -252,6 +281,30 @@ export async function runCli(argv: string[], deps: CliDeps = DEFAULT_CLI_DEPS):
}
}

function renderBatchSummary(data: Record<string, unknown>): void {
const total = typeof data.total === 'number' ? data.total : 0;
const executed = typeof data.executed === 'number' ? data.executed : 0;
const durationMs = typeof data.totalDurationMs === 'number' ? data.totalDurationMs : undefined;
process.stdout.write(
`Batch completed: ${executed}/${total} steps${durationMs !== undefined ? ` in ${durationMs}ms` : ''}\n`,
);
}

function readBatchSteps(flags: ReturnType<typeof parseArgs>['flags']): BatchStep[] {
let raw = '';
if (flags.steps) {
raw = flags.steps;
} else if (flags.stepsFile) {
try {
raw = fs.readFileSync(flags.stepsFile, 'utf8');
} catch (error) {
const message = error instanceof Error ? error.message : String(error);
throw new AppError('INVALID_ARGS', `Failed to read --steps-file ${flags.stepsFile}: ${message}`);
}
}
return parseBatchStepsJson(raw);
}

function isDaemonStartupFailure(error: AppError): boolean {
if (error.code !== 'COMMAND_FAILED') return false;
if (error.details?.kind === 'daemon_startup_failed') return true;
Expand Down
Loading
Loading