Commit 94477dd

feat: add daemon-native JSON batch execution (#69)
* feat: add daemon-native JSON batch command for agent workflows
* docs: add agent batching guides across readme website and skills
* chore: remove batch stdin source and related docs
* fix: harden batch step normalization and flag sanitization
* refactor: simplify batch execution and validation reuse
* docs: rename batching guidance and links
1 parent c1c50af commit 94477dd

16 files changed

Lines changed: 964 additions & 2 deletions

README.md

Lines changed: 51 additions & 0 deletions
@@ -45,6 +45,52 @@ agent-device click @e3
agent-device close
```

## Fast batching (JSON steps)

Use `batch` to execute multiple commands in a single daemon request.

CLI examples:

```bash
agent-device batch \
  --session sim \
  --platform ios \
  --udid 00008150-001849640CF8401C \
  --steps-file /tmp/batch-steps.json \
  --json
```

Small inline payloads are also supported:

```bash
agent-device batch --steps '[{"command":"open","positionals":["settings"]},{"command":"wait","positionals":["100"]}]'
```

Batch payload format:

```json
[
  { "command": "open", "positionals": ["settings"], "flags": {} },
  { "command": "wait", "positionals": ["label=\"Privacy & Security\"", "3000"], "flags": {} },
  { "command": "click", "positionals": ["label=\"Privacy & Security\""], "flags": {} },
  { "command": "get", "positionals": ["text", "label=\"Tracking\""], "flags": {} }
]
```

Batch response includes:

- `total`, `executed`, `totalDurationMs`
- per-step `results[]` with `durationMs`
- failure context with failing `step` and `partialResults`

Agent usage guidelines:

- Keep each batch to one screen-local workflow.
- Add sync guards (`wait`, `is exists`) after mutating steps (`open`, `click`, `fill`, `swipe`).
- Treat refs/snapshot assumptions as stale after UI mutations.
- Prefer `--steps-file` over inline JSON for reliability.
- Keep batches moderate (about 5-20 steps) and stop on first error.
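
The sync-guard guideline can be checked mechanically before a batch is sent. The sketch below is only an illustration: `BatchStepInput`, `MUTATING`, `isSyncGuard`, and `findUnguardedSteps` are hypothetical names that are not part of agent-device, and it assumes `is exists` is written as command `is` with `exists` as the first positional.

```typescript
// Hypothetical step shape mirroring the batch payload format.
type BatchStepInput = {
  command: string;
  positionals?: string[];
  flags?: Record<string, unknown>;
};

// Commands the guidelines describe as mutating.
const MUTATING = new Set(["open", "click", "fill", "swipe"]);

// Assumption: `is exists` is expressed as command "is" with "exists" as its first positional.
function isSyncGuard(step: BatchStepInput): boolean {
  if (step.command === "wait") return true;
  return step.command === "is" && step.positionals?.[0] === "exists";
}

// Returns zero-based indexes of mutating steps that are not immediately followed by a guard.
function findUnguardedSteps(steps: BatchStepInput[]): number[] {
  const unguarded: number[] = [];
  steps.forEach((step, index) => {
    if (!MUTATING.has(step.command)) return;
    const next = steps[index + 1];
    if (next === undefined || !isSyncGuard(next)) unguarded.push(index);
  });
  return unguarded;
}

const steps: BatchStepInput[] = [
  { command: "open", positionals: ["settings"] },
  { command: "wait", positionals: ['label="Privacy & Security"', "3000"] },
  { command: "click", positionals: ['label="Privacy & Security"'] },
  { command: "get", positionals: ["text", 'label="Tracking"'] },
];

// The `click` at index 2 is followed by `get`, so it is reported as unguarded.
console.log(findUnguardedSteps(steps));
```

A planner could run a check like this and insert a `wait` after each reported index before writing the steps file.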

## CLI Usage

```bash
@@ -84,6 +130,7 @@ agent-device swipe 540 1500 540 500 120 --count 8 --pause-ms 30 --pattern ping-p

## Command Index
- `boot`, `open`, `close`, `reinstall`, `home`, `back`, `app-switcher`
- `batch`
- `snapshot`, `find`, `get`
- `click`, `focus`, `type`, `fill`, `press`, `long-press`, `swipe`, `scroll`, `scrollintoview`, `pinch`, `is`
- `alert`, `wait`, `screenshot`
@@ -114,6 +161,10 @@ Flags:
- `--pattern one-way|ping-pong` repeat pattern for `swipe`
- `--verbose` for daemon and runner logs
- `--json` for structured output
- `--steps <json>` batch: JSON array of steps
- `--steps-file <path>` batch: read step JSON from file
- `--on-error stop` batch: stop when a step fails
- `--max-steps <n>` batch: max allowed steps per request

Pinch:
- `pinch` is supported on iOS simulators.

skills/agent-device/SKILL.md

Lines changed: 45 additions & 0 deletions
@@ -155,6 +155,50 @@ agent-device replay -u ./session.ad # Update selector drift and rewrite .ad sc
`--save-script` path is a file path; parent directories are created automatically.
For ambiguous bare values, use `--save-script=workflow.ad` or `./workflow.ad`.

### Fast batching (JSON steps)

Use `batch` when an agent already has a known short sequence and wants fewer orchestration round trips.

```bash
agent-device batch \
  --session sim \
  --platform ios \
  --udid 00008150-001849640CF8401C \
  --steps-file /tmp/batch-steps.json \
  --json
```

Inline JSON works for small payloads:

```bash
agent-device batch --steps '[{"command":"open","positionals":["settings"]},{"command":"wait","positionals":["100"]}]'
```

Step format:

```json
[
  { "command": "open", "positionals": ["settings"], "flags": {} },
  { "command": "wait", "positionals": ["label=\"Privacy & Security\"", "3000"], "flags": {} },
  { "command": "click", "positionals": ["label=\"Privacy & Security\""], "flags": {} },
  { "command": "get", "positionals": ["text", "label=\"Tracking\""], "flags": {} }
]
```

Batch best practices:

- Batch one screen-local flow at a time.
- Add sync guards (`wait`, `is exists`) after mutating steps (`open`, `click`, `fill`, `swipe`).
- Treat prior refs/snapshot assumptions as stale after UI mutations.
- Prefer `--steps-file` over inline JSON.
- Keep batches moderate (about 5-20 steps).
- Use failure context (`step`, `partialResults`) to replan from the failed step.
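
Preferring `--steps-file` means the agent serializes its plan to disk and passes only a path on the command line, which sidesteps shell quoting of nested JSON. A minimal sketch, assuming a Node.js agent: `prepareBatchArgs` and the temp-directory prefix are hypothetical names, while the `batch`, `--session`, `--steps-file`, and `--json` arguments are the ones documented above. The sketch only builds the argument list and does not invoke the CLI.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical step shape mirroring the documented step format.
type BatchStepInput = {
  command: string;
  positionals?: string[];
  flags?: Record<string, unknown>;
};

// Writes the steps to a fresh temp directory and returns argv for `agent-device batch`.
function prepareBatchArgs(steps: BatchStepInput[], session: string): string[] {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), "batch-steps-"));
  const stepsPath = path.join(dir, "steps.json");
  fs.writeFileSync(stepsPath, JSON.stringify(steps, null, 2), "utf8");
  return ["batch", "--session", session, "--steps-file", stepsPath, "--json"];
}

const args = prepareBatchArgs(
  [
    { command: "open", positionals: ["settings"] },
    { command: "wait", positionals: ['label="Privacy & Security"', "3000"] },
  ],
  "sim",
);

// Prints the command line an agent would run; the temp path differs on every call.
console.log(["agent-device", ...args].join(" "));
```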

Stale accessibility tree note:

- Rapid mutations can outrun accessibility tree updates.
- Mitigate with explicit waits and phase splitting (navigate, verify/extract, cleanup).

### Trace logs (XCTest)

```bash
@@ -208,3 +252,4 @@ agent-device apps --platform android --user-installed
- [references/permissions.md](references/permissions.md)
- [references/video-recording.md](references/video-recording.md)
- [references/coordinate-system.md](references/coordinate-system.md)
- [references/batching.md](references/batching.md)

skills/agent-device/references/batching.md

Lines changed: 79 additions & 0 deletions
@@ -0,0 +1,79 @@
# Batching

## When to use batch

- The agent already knows a short sequence of commands.
- Steps belong to one logical screen flow.
- You want one result object with per-step timing and failure context.

## When not to use batch

- Flows are unrelated and should be retried independently.
- The workflow is highly dynamic and requires replanning after each step.
- You need human approvals between steps.

## CLI patterns

From file:

```bash
agent-device batch --session sim --platform ios --steps-file /tmp/batch-steps.json --json
```

Inline (small payloads only):

```bash
agent-device batch --steps '[{"command":"open","positionals":["settings"]}]'
```

## Step payload contract

```json
[
  { "command": "open", "positionals": ["settings"], "flags": {} },
  { "command": "wait", "positionals": ["label=\"Privacy & Security\"", "3000"], "flags": {} },
  { "command": "click", "positionals": ["label=\"Privacy & Security\""], "flags": {} },
  { "command": "get", "positionals": ["text", "label=\"Tracking\""], "flags": {} }
]
```

Rules:

- `positionals` optional, defaults to `[]`.
- `flags` optional, defaults to `{}`.
- nested `batch` and `replay` are rejected.
- stop-on-first-error is the supported mode (`--on-error stop`).
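
The rules above can be mirrored on the agent side so that a malformed plan is caught before it reaches the daemon. This is a sketch under stated assumptions, not the project's own validator: `normalizeSteps`, `NormalizedStep`, and the error messages are hypothetical, and the daemon's actual checks may be stricter.

```typescript
// Hypothetical input shape: only `command` is required.
type BatchStepInput = {
  command: string;
  positionals?: string[];
  flags?: Record<string, unknown>;
};

// After normalization every field is present.
type NormalizedStep = Required<BatchStepInput>;

// Commands that may not appear inside a batch, per the rules above.
const REJECTED = new Set(["batch", "replay"]);

// Applies the documented defaults and rejects nested batch/replay steps.
// Error messages are placeholders, not the CLI's real wording.
function normalizeSteps(raw: unknown): NormalizedStep[] {
  if (!Array.isArray(raw)) throw new Error("steps must be a JSON array");
  return raw.map((entry: unknown, index: number): NormalizedStep => {
    if (typeof entry !== "object" || entry === null) {
      throw new Error(`step ${index + 1} must be an object`);
    }
    const step = entry as Partial<BatchStepInput>;
    if (typeof step.command !== "string" || step.command.length === 0) {
      throw new Error(`step ${index + 1} is missing a command`);
    }
    if (REJECTED.has(step.command)) {
      throw new Error(`step ${index + 1}: nested "${step.command}" is not allowed`);
    }
    return {
      command: step.command,
      positionals: step.positionals ?? [],
      flags: step.flags ?? {},
    };
  });
}

const normalized = normalizeSteps(JSON.parse('[{"command":"open","positionals":["settings"]},{"command":"home"}]'));
// Both steps now carry explicit `positionals` and `flags`.
console.log(JSON.stringify(normalized));
```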

## Response handling

Success includes:

- `total`, `executed`, `totalDurationMs`
- `results[]` entries with `step`, `command`, `durationMs`, and optional `data`

Failure includes:

- `details.step`
- `details.command`
- `details.executed`
- `details.partialResults`

Use these fields to replan from the first failing step.
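
One way to turn the failure fields into a retry plan is sketched below. The field names come from the lists above, but their exact types and semantics are assumptions: in particular, the sketch assumes `partialResults` contains exactly the steps that completed before the failure, so its length is the zero-based index of the failing step. `stepsFromFailure` and `PlannedStep` are hypothetical names, and the sample values are illustrative rather than captured from a real run.

```typescript
// Types inferred from the documented field names; the precise shapes are assumptions.
type StepResult = { step: number; command: string; durationMs: number; data?: unknown };
type BatchFailureDetails = {
  step: number;
  command: string;
  executed: number;
  partialResults: StepResult[];
};
type PlannedStep = { command: string; positionals?: string[]; flags?: Record<string, unknown> };

// Assumption: partialResults.length equals the zero-based index of the failing step.
// The command comparison is a sanity check on that assumption.
function stepsFromFailure(plan: PlannedStep[], details: BatchFailureDetails): PlannedStep[] {
  const failedIndex = details.partialResults.length;
  const failed = plan[failedIndex];
  if (failed === undefined || failed.command !== details.command) {
    throw new Error("failure details do not line up with the submitted plan");
  }
  return plan.slice(failedIndex);
}

const plan: PlannedStep[] = [
  { command: "open", positionals: ["settings"] },
  { command: "wait", positionals: ['label="Privacy & Security"', "3000"] },
  { command: "click", positionals: ['label="Privacy & Security"'] },
  { command: "get", positionals: ["text", 'label="Tracking"'] },
];

// Illustrative failure payload, not real daemon output.
const details: BatchFailureDetails = {
  step: 3,
  command: "click",
  executed: 2,
  partialResults: [
    { step: 1, command: "open", durationMs: 850 },
    { step: 2, command: "wait", durationMs: 120 },
  ],
};

const retry = stepsFromFailure(plan, details);
// The retry plan starts at the failed `click` and keeps the trailing `get`.
console.log(retry.map((s) => s.command));
```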

## Common error categories and agent actions

- `INVALID_ARGS`: payload/step shape issue; fix payload and retry.
- `SESSION_NOT_FOUND`: open or select the correct session, then retry.
- `UNSUPPORTED_OPERATION`: switch command/target to supported operation.
- `AMBIGUOUS_MATCH`: refine selector/locator, then retry failed step.
- `COMMAND_FAILED`: add sync guard (`wait`, `is exists`) and retry from failed step.
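
The mapping above lends itself to a small lookup table in the agent. In this sketch the error codes are the ones listed, while the `AgentAction` labels, the `nextAction` helper, and the `escalate` fallback for unlisted codes are hypothetical choices made for illustration.

```typescript
// Hypothetical action labels an agent planner might use.
type AgentAction =
  | "fix-payload"
  | "select-session"
  | "switch-command"
  | "refine-selector"
  | "add-sync-guard"
  | "escalate";

// Error codes taken from the list above, each mapped to its suggested action.
const ACTIONS = new Map<string, AgentAction>([
  ["INVALID_ARGS", "fix-payload"],
  ["SESSION_NOT_FOUND", "select-session"],
  ["UNSUPPORTED_OPERATION", "switch-command"],
  ["AMBIGUOUS_MATCH", "refine-selector"],
  ["COMMAND_FAILED", "add-sync-guard"],
]);

// Unknown codes fall back to "escalate" (an assumption; the guide does not cover them).
function nextAction(code: string): AgentAction {
  return ACTIONS.get(code) ?? "escalate";
}

console.log(nextAction("AMBIGUOUS_MATCH")); // prints "refine-selector"
console.log(nextAction("SOMETHING_ELSE")); // prints "escalate"
```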

## Reliability guardrails

- Add sync guards after mutating steps.
- Assume snapshot/ref drift after navigation.
- Keep batch size moderate (about 5-20 steps).
- Split long workflows into phases:
  1. navigate
  2. verify/extract
  3. cleanup
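
The size and phase guardrails can be combined into a simple planner that never mixes phases in one batch and caps each batch at a step limit. This is a sketch only: `Phase`, `PlannedBatch`, and `planBatches` are hypothetical names, and the default cap of 20 is taken from the upper end of the suggested range rather than from any enforced limit.

```typescript
// Hypothetical shapes for a phased workflow.
type BatchStepInput = {
  command: string;
  positionals?: string[];
  flags?: Record<string, unknown>;
};
type Phase = { name: string; steps: BatchStepInput[] };
type PlannedBatch = { phase: string; steps: BatchStepInput[] };

// Splits each phase into batches of at most `maxSteps`, keeping phases separate.
function planBatches(phases: Phase[], maxSteps = 20): PlannedBatch[] {
  if (!Number.isInteger(maxSteps) || maxSteps < 1) {
    throw new RangeError("maxSteps must be a positive integer");
  }
  const batches: PlannedBatch[] = [];
  for (const phase of phases) {
    for (let start = 0; start < phase.steps.length; start += maxSteps) {
      batches.push({ phase: phase.name, steps: phase.steps.slice(start, start + maxSteps) });
    }
  }
  return batches;
}

// 25 navigation steps become two batches (20 + 5); the other phases fit in one each.
const phases: Phase[] = [
  { name: "navigate", steps: Array.from({ length: 25 }, () => ({ command: "wait", positionals: ["100"] })) },
  { name: "verify/extract", steps: [{ command: "get", positionals: ["text", 'label="Tracking"'] }] },
  { name: "cleanup", steps: [{ command: "close" }] },
];

const planned = planBatches(phases);
console.log(planned.map((b) => `${b.phase}: ${b.steps.length}`));
```

A fixed-size split can separate a mutating step from its sync guard, so a real planner would likely prefer to cut at guard boundaries.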

src/__tests__/cli-batch.test.ts

Lines changed: 117 additions & 0 deletions
@@ -0,0 +1,117 @@
import test from 'node:test';
import assert from 'node:assert/strict';
import fs from 'node:fs';
import os from 'node:os';
import path from 'node:path';
import { runCli } from '../cli.ts';
import type { DaemonRequest, DaemonResponse } from '../daemon-client.ts';

class ExitSignal extends Error {
  public readonly code: number;

  constructor(code: number) {
    super(`EXIT_${code}`);
    this.code = code;
  }
}

type RunResult = {
  code: number | null;
  stdout: string;
  stderr: string;
  calls: Omit<DaemonRequest, 'token'>[];
};

async function runCliCapture(
  argv: string[],
): Promise<RunResult> {
  let stdout = '';
  let stderr = '';
  let code: number | null = null;
  const calls: Array<Omit<DaemonRequest, 'token'>> = [];

  const originalExit = process.exit;
  const originalStdoutWrite = process.stdout.write.bind(process.stdout);
  const originalStderrWrite = process.stderr.write.bind(process.stderr);

  (process as any).exit = ((nextCode?: number) => {
    throw new ExitSignal(nextCode ?? 0);
  }) as typeof process.exit;
  (process.stdout as any).write = ((chunk: unknown) => {
    stdout += String(chunk);
    return true;
  }) as typeof process.stdout.write;
  (process.stderr as any).write = ((chunk: unknown) => {
    stderr += String(chunk);
    return true;
  }) as typeof process.stderr.write;

  const sendToDaemon = async (req: Omit<DaemonRequest, 'token'>): Promise<DaemonResponse> => {
    calls.push(req);
    return { ok: true, data: { total: 1, executed: 1, totalDurationMs: 1 } };
  };

  try {
    await runCli(argv, { sendToDaemon });
  } catch (error) {
    if (error instanceof ExitSignal) code = error.code;
    else throw error;
  } finally {
    process.exit = originalExit;
    process.stdout.write = originalStdoutWrite;
    process.stderr.write = originalStderrWrite;
  }

  return { code, stdout, stderr, calls };
}

test('batch --steps parses JSON and forwards batchSteps only', async () => {
  const result = await runCliCapture([
    'batch',
    '--session',
    'sim',
    '--platform',
    'ios',
    '--steps',
    '[{"command":"open","positionals":["settings"]}]',
    '--json',
  ]);
  assert.equal(result.code, null);
  assert.equal(result.calls.length, 1);
  const req = result.calls[0];
  assert.equal(req.command, 'batch');
  assert.equal(req.session, 'sim');
  assert.equal(req.flags?.platform, 'ios');
  assert.ok(Array.isArray(req.flags?.batchSteps));
  assert.equal((req.flags?.batchSteps ?? [])[0]?.command, 'open');
  assert.equal(Object.hasOwn(req.flags ?? {}, 'steps'), false);
});

test('batch --steps-file parses file payload', async () => {
  const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'agent-device-batch-'));
  const stepsPath = path.join(tmpDir, 'steps.json');
  fs.writeFileSync(stepsPath, JSON.stringify([{ command: 'wait', positionals: ['100'] }]), 'utf8');
  const result = await runCliCapture(['batch', '--steps-file', stepsPath, '--json']);
  assert.equal(result.code, null);
  assert.equal(result.calls.length, 1);
  const req = result.calls[0];
  assert.equal(req.command, 'batch');
  assert.equal((req.flags?.batchSteps ?? [])[0]?.command, 'wait');
});

test('batch --steps-file returns clear error for missing file', async () => {
  const result = await runCliCapture(['batch', '--steps-file', '/tmp/definitely-missing-batch-steps.json']);
  assert.equal(result.code, 1);
  assert.equal(result.calls.length, 0);
  assert.match(result.stderr, /Failed to read --steps-file/);
});

test('batch --steps-file rejects invalid JSON payload', async () => {
  const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'agent-device-batch-invalid-'));
  const stepsPath = path.join(tmpDir, 'steps.json');
  fs.writeFileSync(stepsPath, '{"command":"open"', 'utf8');
  const result = await runCliCapture(['batch', '--steps-file', stepsPath]);
  assert.equal(result.code, 1);
  assert.equal(result.calls.length, 0);
  assert.match(result.stderr, /Batch steps must be valid JSON/);
});

src/cli.ts

Lines changed: 53 additions & 0 deletions
@@ -7,6 +7,8 @@ import { sendToDaemon } from './daemon-client.ts';
import fs from 'node:fs';
import os from 'node:os';
import path from 'node:path';
import type { BatchStep } from './core/dispatch.ts';
import { parseBatchStepsJson } from './core/batch.ts';

type CliDeps = {
  sendToDaemon: typeof sendToDaemon;
@@ -59,6 +61,33 @@ export async function runCli(argv: string[], deps: CliDeps = DEFAULT_CLI_DEPS):
  const sessionName = flags.session ?? process.env.AGENT_DEVICE_SESSION ?? 'default';
  const logTailStopper = flags.verbose && !flags.json ? startDaemonLogTail() : null;
  try {
    if (command === 'batch') {
      if (positionals.length > 0) {
        throw new AppError('INVALID_ARGS', 'batch does not accept positional arguments.');
      }
      const batchSteps = readBatchSteps(flags);
      const batchFlags = { ...daemonFlags, batchSteps };
      delete (batchFlags as Record<string, unknown>).steps;
      delete (batchFlags as Record<string, unknown>).stepsFile;

      const response = await deps.sendToDaemon({
        session: sessionName,
        command: 'batch',
        positionals,
        flags: batchFlags,
      });
      if (!response.ok) {
        throw new AppError(response.error.code as any, response.error.message, response.error.details);
      }
      if (flags.json) {
        printJson({ success: true, data: response.data ?? {} });
      } else {
        renderBatchSummary(response.data ?? {});
      }
      if (logTailStopper) logTailStopper();
      return;
    }

    if (command === 'session') {
      const sub = positionals[0] ?? 'list';
      if (sub !== 'list') {
@@ -252,6 +281,30 @@ export async function runCli(argv: string[], deps: CliDeps = DEFAULT_CLI_DEPS):
  }
}

function renderBatchSummary(data: Record<string, unknown>): void {
  const total = typeof data.total === 'number' ? data.total : 0;
  const executed = typeof data.executed === 'number' ? data.executed : 0;
  const durationMs = typeof data.totalDurationMs === 'number' ? data.totalDurationMs : undefined;
  process.stdout.write(
    `Batch completed: ${executed}/${total} steps${durationMs !== undefined ? ` in ${durationMs}ms` : ''}\n`,
  );
}

function readBatchSteps(flags: ReturnType<typeof parseArgs>['flags']): BatchStep[] {
  let raw = '';
  if (flags.steps) {
    raw = flags.steps;
  } else if (flags.stepsFile) {
    try {
      raw = fs.readFileSync(flags.stepsFile, 'utf8');
    } catch (error) {
      const message = error instanceof Error ? error.message : String(error);
      throw new AppError('INVALID_ARGS', `Failed to read --steps-file ${flags.stepsFile}: ${message}`);
    }
  }
  return parseBatchStepsJson(raw);
}

function isDaemonStartupFailure(error: AppError): boolean {
  if (error.code !== 'COMMAND_FAILED') return false;
  if (error.details?.kind === 'daemon_startup_failed') return true;
