Commit 427093d

thymikee and claude authored
fix: return deterministic matched-target metadata for find click responses (#178)
* fix: return deterministic matched-target metadata for find click responses (#170)

  On success, build the response from the matched snapshot node rather than passing through whatever the underlying click/press handler returns. Response now includes: ref, locator, query, and x/y derived from the matched element's rect — all stable across runs.

  Also makes `dispatch` injectable in `handleFindCommands` (matching the pattern in `handleInteractionCommands`) and adds tests that assert the deterministic shape and verify that non-deterministic platform runner data does not bleed into the response.

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: document find click response shape in selectors and skill references

  Add response shape documentation for `find "<query>" click --json` to:
  - website/docs/docs/selectors.md (new "Response shape (click)" section)
  - skills/agent-device/SKILL.md (guardrail note)
  - skills/agent-device/references/snapshot-refs.md (new "find click response" section)

  Follows the fix in #178 that made the response deterministic.

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
1 parent 8d4c914 commit 427093d

5 files changed

Lines changed: 157 additions & 7 deletions
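
As a rough illustration of the behavior the commit message describes, the sketch below builds the click response from the matched node alone. The `Rect`, `SnapshotNode`, and `MatchData` types and the `buildMatchData` helper are hypothetical names introduced here; `centerOfRect` exists in the repository, but its body is not part of this diff, so the midpoint formula (with no rounding) is an assumption inferred from the values asserted in `find.test.ts`.

```typescript
// Hypothetical shapes for illustration; the real types live in the repository.
interface Rect { x: number; y: number; width: number; height: number }
interface SnapshotNode { label?: string; rect?: Rect }
interface MatchData { ref: string; locator: string; query: string; x?: number; y?: number }

// Assumed implementation: the midpoint of the rect, without rounding.
function centerOfRect(rect: Rect): { x: number; y: number } {
  return { x: rect.x + rect.width / 2, y: rect.y + rect.height / 2 };
}

// Build the response from the matched node only; runner output is never merged in.
function buildMatchData(ref: string, locator: string, query: string, node: SnapshotNode): MatchData {
  const data: MatchData = { ref, locator, query };
  if (node.rect) {
    const center = centerOfRect(node.rect);
    data.x = center.x;
    data.y = center.y;
  }
  return data;
}

// Same node geometry as INCREMENT_NODE in the tests of this commit.
const example = buildMatchData('@e1', 'any', 'Increment', {
  label: 'Increment',
  rect: { x: 50, y: 0, width: 100, height: 100 },
});
console.log(example); // x is 100 and y is 50, matching the test assertions
```

Note that `x` and `y` are only set when the node carries a `rect`, which mirrors the conditional in the `find.ts` hunk of this commit.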

skills/agent-device/SKILL.md

Lines changed: 1 addition & 0 deletions
@@ -170,6 +170,7 @@ agent-device batch --steps-file /tmp/batch-steps.json --json
 - Re-snapshot after UI mutations (navigation/modal/list changes).
 - Prefer `snapshot -i`; scope/depth only when needed.
 - Use refs for discovery, selectors for replay/assertions.
+- `find "<query>" click --json` returns `{ ref, locator, query, x, y }` — all derived from the matched snapshot node. Do not rely on these fields from raw `press`/`click` responses for observability; use `find` instead.
 - Use `fill` for clear-then-type semantics; use `type` for focused append typing.
 - Use `install` for in-place app upgrades (keep app data when platform permits), and `reinstall` for deterministic fresh-state runs.
 - App binary format support for `install`/`reinstall`: Android `.apk`/`.aab`, iOS `.app`/`.ipa`.

skills/agent-device/references/snapshot-refs.md

Lines changed: 10 additions & 0 deletions
@@ -76,6 +76,16 @@ Efficient pattern:
 
 - If refs are unstable after UI transitions, switch to selector-based targeting and stop investing in ref-only flows.
 
+## find click response
+
+`find "<query>" click --json` returns deterministic matched-target metadata:
+
+```json
+{ "ref": "@e3", "locator": "any", "query": "Increment", "x": 195, "y": 422 }
+```
+
+Fields come from the matched snapshot node, not the platform runner. Use these for observability and replay quality — they are stable across runs for the same UI state.
+
 ## Replay note
 
 - Prefer selector-based actions in recorded `.ad` replays.

src/daemon/handlers/__tests__/find.test.ts

Lines changed: 113 additions & 1 deletion
@@ -1,7 +1,41 @@
 import test from 'node:test';
 import assert from 'node:assert/strict';
-import { parseFindArgs } from '../find.ts';
+import fs from 'node:fs';
+import os from 'node:os';
+import path from 'node:path';
+import { parseFindArgs, handleFindCommands } from '../find.ts';
 import { AppError } from '../../../utils/errors.ts';
+import { SessionStore } from '../../session-store.ts';
+import type { SessionState } from '../../types.ts';
+import type { DaemonRequest, DaemonResponse } from '../../types.ts';
+
+function makeSessionStore(): SessionStore {
+  const root = fs.mkdtempSync(path.join(os.tmpdir(), 'agent-device-find-handler-'));
+  return new SessionStore(path.join(root, 'sessions'));
+}
+
+function makeSession(name: string): SessionState {
+  return {
+    name,
+    device: {
+      platform: 'ios',
+      id: 'sim-1',
+      name: 'iPhone 17 Pro',
+      kind: 'simulator',
+      booted: true,
+    },
+    createdAt: Date.now(),
+    actions: [],
+  };
+}
+
+const INCREMENT_NODE = {
+  type: 'Button',
+  label: 'Increment',
+  hittable: true,
+  rect: { x: 50, y: 0, width: 100, height: 100 },
+  depth: 0,
+};
 
 test('parseFindArgs defaults to click with any locator', () => {
   const parsed = parseFindArgs(['Login']);
@@ -97,3 +131,81 @@ test('parseFindArgs with bare locator yields empty query', () => {
   assert.equal(parsed.query, '');
   assert.equal(parsed.action, 'click');
 });
+
+test('handleFindCommands click returns deterministic matched-target metadata', async () => {
+  const sessionStore = makeSessionStore();
+  const sessionName = 'default';
+  sessionStore.set(sessionName, makeSession(sessionName));
+
+  const invokeCalls: DaemonRequest[] = [];
+  const response = await handleFindCommands({
+    req: {
+      token: 't',
+      session: sessionName,
+      command: 'find',
+      positionals: ['Increment', 'click'],
+      flags: {},
+    },
+    sessionName,
+    logPath: '/tmp/test.log',
+    sessionStore,
+    invoke: async (req) => {
+      invokeCalls.push(req);
+      // Simulate runner returning non-deterministic platform data that should not bleed through
+      return { ok: true, data: { platformSpecificRef: 'XCUIElementTypeApplication', x: 0, y: 0 } };
+    },
+    dispatch: async (_device, command) => {
+      if (command === 'snapshot') {
+        return { nodes: [INCREMENT_NODE] };
+      }
+      return {};
+    },
+  });
+
+  assert.ok(response, 'expected a response');
+  assert.ok(response.ok, 'expected success');
+  const data = response.data as Record<string, unknown>;
+
+  // Deterministic matched-target metadata
+  assert.equal(data.ref, '@e1', 'ref must match the resolved snapshot node');
+  assert.equal(data.locator, 'any', 'locator must reflect the find strategy');
+  assert.equal(data.query, 'Increment', 'query must reflect the search term');
+  assert.equal(data.x, 100, 'x must be derived from the matched node rect center');
+  assert.equal(data.y, 50, 'y must be derived from the matched node rect center');
+
+  // Non-deterministic platform data must not leak through
+  assert.equal(data.platformSpecificRef, undefined, 'platform runner data must not appear in response');
+
+  // invoke was called with the resolved ref
+  assert.equal(invokeCalls.length, 1);
+  assert.equal(invokeCalls[0].positionals?.[0], '@e1');
+});
+
+test('handleFindCommands click with explicit label locator returns locator in metadata', async () => {
+  const sessionStore = makeSessionStore();
+  const sessionName = 'default';
+  sessionStore.set(sessionName, makeSession(sessionName));
+
+  const response = await handleFindCommands({
+    req: {
+      token: 't',
+      session: sessionName,
+      command: 'find',
+      positionals: ['label', 'Increment', 'click'],
+      flags: {},
+    },
+    sessionName,
+    logPath: '/tmp/test.log',
+    sessionStore,
+    invoke: async () => ({ ok: true, data: {} }),
+    dispatch: async (_device, command) => {
+      if (command === 'snapshot') return { nodes: [INCREMENT_NODE] };
+      return {};
+    },
+  });
+
+  assert.ok(response?.ok);
+  const data = response!.data as Record<string, unknown>;
+  assert.equal(data.locator, 'label');
+  assert.equal(data.query, 'Increment');
+});

src/daemon/handlers/find.ts

Lines changed: 14 additions & 6 deletions
@@ -19,8 +19,10 @@ export async function handleFindCommands(params: {
   logPath: string;
   sessionStore: SessionStore;
   invoke: (req: DaemonRequest) => Promise<DaemonResponse>;
+  dispatch?: typeof dispatchCommand;
 }): Promise<DaemonResponse | null> {
   const { req, sessionName, logPath, sessionStore, invoke } = params;
+  const dispatch = params.dispatch ?? dispatchCommand;
   const command = req.command;
   if (command !== 'find') return null;
 
@@ -59,7 +61,7 @@ export async function handleFindCommands(params: {
     if (lastNodes && now - lastSnapshotAt < 750) {
       return { nodes: lastNodes };
     }
-    const data = (await dispatchCommand(device, 'snapshot', [], req.flags?.out, {
+    const data = (await dispatch(device, 'snapshot', [], req.flags?.out, {
       ...contextFromFlags(
         logPath,
         {
@@ -187,15 +189,21 @@ export async function handleFindCommands(params: {
       flags: actionFlags,
     });
     if (!response.ok) return response;
+    const matchCoords = resolvedNode.rect ? centerOfRect(resolvedNode.rect) : null;
+    const matchData: Record<string, unknown> = { ref, locator, query };
+    if (matchCoords) {
+      matchData.x = matchCoords.x;
+      matchData.y = matchCoords.y;
+    }
     if (session) {
       sessionStore.recordAction(session, {
         command,
         positionals: req.positionals ?? [],
         flags: req.flags ?? {},
-        result: { ref, action: 'click' },
+        result: { ref, action: 'click', locator, query },
       });
     }
-    return response;
+    return { ok: true, data: matchData };
   }
   if (action === 'fill') {
     if (!value) {
@@ -224,7 +232,7 @@ export async function handleFindCommands(params: {
     if (!coords) {
       return { ok: false, error: { code: 'COMMAND_FAILED', message: 'matched element has no bounds' } };
     }
-    const response = await dispatchCommand(device, 'focus', [String(coords.x), String(coords.y)], req.flags?.out, {
+    const response = await dispatch(device, 'focus', [String(coords.x), String(coords.y)], req.flags?.out, {
       ...contextFromFlags(logPath, req.flags, session?.appBundleId, session?.trace?.outPath),
     });
     if (session) {
@@ -245,10 +253,10 @@ export async function handleFindCommands(params: {
     if (!coords) {
       return { ok: false, error: { code: 'COMMAND_FAILED', message: 'matched element has no bounds' } };
     }
-    await dispatchCommand(device, 'focus', [String(coords.x), String(coords.y)], req.flags?.out, {
+    await dispatch(device, 'focus', [String(coords.x), String(coords.y)], req.flags?.out, {
       ...contextFromFlags(logPath, req.flags, session?.appBundleId, session?.trace?.outPath),
     });
-    const response = await dispatchCommand(device, 'type', [value], req.flags?.out, {
+    const response = await dispatch(device, 'type', [value], req.flags?.out, {
       ...contextFromFlags(logPath, req.flags, session?.appBundleId, session?.trace?.outPath),
     });
     if (session) {
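
The `dispatch?: typeof dispatchCommand` parameter and the `params.dispatch ?? dispatchCommand` fallback in the hunks above form a small dependency-injection pattern: production callers omit the parameter, and tests pass a stub. A minimal standalone sketch of that idea follows. The `Dispatch` type, `realDispatch`, `stubDispatch`, and `snapshotNodeCount` are hypothetical names with a simplified signature, not the repository's actual `dispatchCommand` API.

```typescript
// Simplified, hypothetical signature; the real dispatchCommand takes more arguments.
type Dispatch = (command: string, args: string[]) => Promise<Record<string, unknown>>;

// Stand-in for the real dispatcher, which would need a device to talk to.
const realDispatch: Dispatch = async (command) => {
  throw new Error(`no device available for "${command}"`);
};

// The optional parameter falls back to the real dispatcher when omitted.
async function snapshotNodeCount(params: { dispatch?: Dispatch }): Promise<number> {
  const dispatch = params.dispatch ?? realDispatch;
  const data = await dispatch('snapshot', []);
  const nodes = Array.isArray(data.nodes) ? data.nodes : [];
  return nodes.length;
}

// A test can inject a stub that returns canned snapshot data.
const stubDispatch: Dispatch = async (command) =>
  command === 'snapshot' ? { nodes: [{ label: 'Increment' }] } : {};

snapshotNodeCount({ dispatch: stubDispatch }).then((count) => console.log(count)); // logs 1
```

Keeping the parameter optional means existing call sites need no change, which is presumably why the commit chose it over a required argument.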

website/docs/docs/selectors.md

Lines changed: 19 additions & 0 deletions
@@ -20,3 +20,22 @@ Tips:
 - Use `find ... wait <timeoutMs>` to wait for UI to appear.
 - Combine with scoped snapshots using `snapshot -s "<label>"` for speed.
 - [Android] If a matched node is not hittable, agent-device will click/focus the nearest hittable ancestor.
+
+## Response shape (click)
+
+`find "<query>" click --json` returns deterministic matched-target metadata derived from the resolved snapshot node:
+
+```json
+{
+  "ref": "@e3",
+  "locator": "any",
+  "query": "Sign In",
+  "x": 195,
+  "y": 422
+}
+```
+
+- `ref` — snapshot ref of the matched (or nearest hittable ancestor) element.
+- `locator` — find strategy used (`any`, `text`, `label`, `value`, `role`, `id`).
+- `query` — the search term as provided.
+- `x`, `y` — tap coordinates derived from the matched element's rect center.
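
A consumer of the documented shape might validate it along the following lines. This is only a sketch: `FindClickData` and `parseFindClickData` are hypothetical names, and it assumes the JSON object shown in the docs is what the caller receives; the commit does not show whether the CLI wraps it in an outer envelope. Because the handler in this commit sets `x` and `y` only when the matched node has a `rect`, the sketch treats both as optional.

```typescript
// Hypothetical consumer-side type for the documented response shape.
interface FindClickData { ref: string; locator: string; query: string; x?: number; y?: number }

// Validate untrusted JSON before relying on its fields.
function parseFindClickData(json: string): FindClickData {
  const value: unknown = JSON.parse(json);
  if (typeof value !== 'object' || value === null) {
    throw new Error('expected a JSON object');
  }
  const { ref, locator, query, x, y } = value as Record<string, unknown>;
  if (typeof ref !== 'string' || typeof locator !== 'string' || typeof query !== 'string') {
    throw new Error('missing ref, locator, or query');
  }
  const data: FindClickData = { ref, locator, query };
  // Coordinates are copied only when both are present and numeric.
  if (typeof x === 'number' && typeof y === 'number') {
    data.x = x;
    data.y = y;
  }
  return data;
}

const sample = '{ "ref": "@e3", "locator": "any", "query": "Sign In", "x": 195, "y": 422 }';
const parsed = parseFindClickData(sample);
console.log(parsed.ref, parsed.x, parsed.y);
```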
