feat: find/get digest response-views + batch-step elision — Phase 4 #955

Merged: thymikee merged 2 commits into main from phase4-find-get-batch-digest on Jun 30, 2026

@thymikee

Member

What

Completes the two remaining Phase 4 agent-cost grafts (plans/perfect-shape.md §5.4):

  1. find / get digest response-views — a token-cheap digest view registered for both selector reads.
  2. Typed batch-step elision — intermediate batch steps collapse to digest while the final step returns at the requested level.

Both are strictly opt-in: they only activate when a non-default responseLevel (digest/full) is requested. The default wire shape is byte-identical to today.

Why

North-star goal #2 (fast + cheap for AI agents): leveled/digest payloads cut tokens, and a multi-step batch collapses its intermediate noise. find/get are the last high-token selector reads without a digest (they carry a full matched node on the wire); batch is the canonical N-round-trips-into-1 path where intermediate snapshot/find/get steps dominate the response.

How

find / get digest (src/daemon/response-views.ts)

find and get share a wire shape (ref/selector + text for a text read, + node for an attrs read, plus found/waitedMs/coordinate signals), so a single selectorReadView is registered under both command names (mirrors the existing snapshotView/screenshotView pattern):

  • text read → keeps ref/selector + text, drops the redundant verbose node (the text is the answer).
  • attrs read → keeps a compacted node: semantic attributes only (role/type/subrole/label/value/identifier/enabled/selected/focused/hittable); the token sink — geometry (rect), tree indices (index/parentIndex/depth), and process/app plumbing (pid/bundleId/appName/windowTitle/surface/…) — is dropped.
  • exists / wait / click → keeps the cheap actionable signals (found/waitedMs/locator/query/x/y).
  • default and full return today's shape unchanged (same reference).

(Note: on the success path find matches at most one element — ambiguity is an AMBIGUOUS_MATCH error, not success data — so there is no count/unique field to surface; the digest keeps the single matched ref + label/text.)
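The attrs-read compaction described above can be sketched roughly as follows. This is a minimal illustration, not the code in src/daemon/response-views.ts: the `SnapshotNode` type, the `compactNode` helper, and the sample values are hypothetical, and only the list of kept attribute names comes from the description.

```typescript
// Hypothetical node shape; the real snapshot node type in the repo may differ.
type SnapshotNode = Record<string, unknown>;

// Semantic attributes the digest keeps, as listed in the PR description.
const SEMANTIC_KEYS = [
  "role", "type", "subrole", "label", "value",
  "identifier", "enabled", "selected", "focused", "hittable",
] as const;

// Copy only the semantic attributes that are actually present on the node.
// Geometry, tree indices, and process/app plumbing are simply never copied.
function compactNode(node: SnapshotNode): SnapshotNode {
  const compact: SnapshotNode = {};
  for (const key of SEMANTIC_KEYS) {
    if (node[key] !== undefined) compact[key] = node[key];
  }
  return compact;
}

// Made-up sample node, for illustration only.
const verboseNode: SnapshotNode = {
  role: "button",
  label: "Sign in",
  enabled: true,
  rect: { x: 0, y: 0, width: 120, height: 44 },
  index: 12,
  parentIndex: 3,
  depth: 4,
  pid: 4242,
  bundleId: "com.example.app",
};

// Only role, label, and enabled survive.
console.log(compactNode(verboseNode));
```

An allowlist of kept keys (rather than a denylist of dropped ones) means any new verbose field added to the node later stays out of the digest by default.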

Batch-step elision (src/core/batch.ts)

When req.meta.responseLevel is a non-default level, runBatchStep overrides the per-step responseLevel to digest for every step except the last, which keeps the requested level. The existing per-command views (snapshot/screenshot/find/get) then collapse those intermediate steps automatically.
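A rough sketch of the per-step override, assuming a simplified `RequestMeta` type and hypothetical signatures for `batchStepMeta` and `isNonDefaultResponseLevel` (the names appear in this PR, but their real parameters in src/core/batch.ts may differ):

```typescript
type ResponseLevel = "default" | "digest" | "full";

// Simplified, assumed meta shape; the real request meta carries more fields.
interface RequestMeta {
  responseLevel?: ResponseLevel;
  requestId?: string;
}

function isNonDefaultResponseLevel(
  level: ResponseLevel | undefined,
): level is "digest" | "full" {
  return level === "digest" || level === "full";
}

// Returns the meta to use for one batch step.
function batchStepMeta(
  meta: RequestMeta | undefined,
  stepIndex: number,
  stepCount: number,
): RequestMeta | undefined {
  // Default or absent level: hand back the very same reference.
  if (meta === undefined || !isNonDefaultResponseLevel(meta.responseLevel)) {
    return meta;
  }
  // The final step keeps the level the caller asked for.
  if (stepIndex === stepCount - 1) return meta;
  // Every intermediate step is forced to digest.
  return { ...meta, responseLevel: "digest" };
}

const requested: RequestMeta = { responseLevel: "full", requestId: "r1" };
const perStepLevels = [0, 1, 2].map(
  (i) => batchStepMeta(requested, i, 3)?.responseLevel,
);
console.log(perStepLevels); // [ 'digest', 'digest', 'full' ]
```

Returning the original reference on the default path, instead of a copy, is what lets a test assert that nothing about the per-step meta changed.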

Byte-identical-default safety argument

  • Views: applyResponseLevelView (request-router) only invokes a registered view when responseLevel is digest/full; for default/absent it returns the response untouched. selectorReadView itself early-returns the original data reference for any non-digest level. So find/get default output is unchanged.
  • Batch: batchStepMeta returns the same req.meta reference unless a non-default level was requested (gated on isNonDefaultResponseLevel). With no responseLevel (or default) every step is invoked with identical meta — byte-for-byte today's behavior. A unit test asserts the default path passes [undefined, undefined, undefined] through.
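The view-gating half of this argument can be illustrated with a small sketch. The registry (`responseViews`), the `ResponseView` signature, and the toy view below are assumptions made for illustration; only the name `applyResponseLevelView` and its described behavior come from this PR.

```typescript
type ResponseLevel = "default" | "digest" | "full";
type Payload = Record<string, unknown>;
type ResponseView = (data: Payload, level: "digest" | "full") => Payload;

// Hypothetical registry keyed by command name.
const responseViews = new Map<string, ResponseView>();

function applyResponseLevelView(
  command: string,
  data: Payload,
  level?: ResponseLevel,
): Payload {
  // Default or absent level: the registry is never consulted and the
  // caller gets the same object back, so the wire shape cannot change.
  if (level !== "digest" && level !== "full") return data;
  const view = responseViews.get(command);
  return view ? view(data, level) : data;
}

// Toy view for illustration only: digest drops a verbose `node`.
const toyView: ResponseView = (data, level) => {
  if (level !== "digest") return data; // full: untouched, same reference
  const slim: Payload = { ...data };
  delete slim.node;
  return slim;
};
responseViews.set("find", toyView);
responseViews.set("get", toyView);

// Made-up response payload.
const sampleResponse: Payload = {
  ref: "e12",
  text: "Sign in",
  node: { role: "button" },
};

console.log(applyResponseLevelView("find", sampleResponse) === sampleResponse);
console.log(applyResponseLevelView("find", sampleResponse, "digest"));
```

Because the gate sits in front of the registry lookup, a bug inside any individual view cannot affect default-level callers.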

Verification (local)

  • Typecheck tsc -p tsconfig.json: pass
  • Lint oxlint . --deny-warnings: pass
  • Format oxfmt --check: pass
  • Build rslib build: pass
  • Fallow audit --base origin/main: pass (No issues in changed files)
  • Tests (changed files + selector/router/batch consumers): 63 pass

Add opt-in leveled response views for the find and get selector reads and
elide intermediate batch steps to digest, completing the two remaining
Phase 4 agent-cost grafts. All additions activate only when a non-default
responseLevel (digest/full) is requested; the default wire shape is
byte-identical to today (Maestro .ad recompare safe).

- response-views: register a shared selectorReadView under find and get.
  A text read keeps ref/selector + text and drops the redundant verbose
  node; an attrs read keeps a compacted node (semantic attributes only,
  geometry/index/process plumbing dropped); exists/wait/click keep their
  cheap actionable signals. default/full return today's shape unchanged.
- batch: when a non-default level is requested, intermediate steps are
  forced to digest while the final step keeps the requested level. With no
  responseLevel the per-step meta is passed through unchanged.
- tests mirror the existing response-views / response-level suites.
@github-actions

github-actions Bot commented Jun 30, 2026


Size Report

| Metric | Base | Current | Diff |
| --- | --- | --- | --- |
| JS raw | 1.4 MB | 1.4 MB | +532 B |
| JS gzip | 455.9 kB | 456.1 kB | +203 B |
| npm tarball | 557.7 kB | 557.9 kB | +174 B |
| npm unpacked | 2.0 MB | 2.0 MB | +532 B |

Startup median (7 runs, lower is better):

| Scenario | Base | Current | Diff |
| --- | --- | --- | --- |
| CLI --version | 27.1 ms | 27.1 ms | +0.1 ms |
| CLI --help | 48.0 ms | 47.9 ms | -0.1 ms |

Top changed chunks:

| Chunk | Raw diff | Gzip diff |
| --- | --- | --- |
| dist/src/9722.js | +375 B | +115 B |

@thymikee

Member Author

I don’t think this is ready yet. The new find response view is registered at the command level, but find is not only a selector read: find fill, find focus, and find type return the underlying interaction response, and those responses can carry cheap actionable fields such as warning from Android dialog recovery or direct-selector fallback paths. The digest allowlist currently keeps only found/ref/selector/text/waitedMs/locator/query/x/y, so --level digest can silently drop warnings from a mutating find action.

Please either preserve these cheap diagnostic/action signals in the selector-read digest (at least warning, and likely other existing success/message fields that are not token sinks) or gate the view so it only collapses the read-only/click shapes it explicitly handles. Default shape is protected, but opt-in digest still needs to keep agent-critical warnings.

Review feedback on #955: `find` is registered command-wide, but
`find fill/focus/type` return the underlying INTERACTION response, which can
carry cheap, agent-critical signals (notably `warning` from Android
blocking-dialog recovery, plus `message`). The previous allowlist-based digest
silently dropped those under --level digest.

The only token sink in a find/get result is the verbose matched snapshot
`node`, which appears solely on a selector READ (text/attrs). The view is now
conservative: it acts ONLY on a result carrying such a node and otherwise
returns the data UNCHANGED, so node-less shapes (exists/wait/click and the
fill/focus/type interaction responses) are never narrowed. For a text read the
redundant node is dropped; for an attrs read the node is compacted; in both
cases every other cheap field (e.g. `warning`) is preserved verbatim.

Adds a regression test asserting a `find fill` response carrying a `warning`
is returned unchanged under digest.
@thymikee

Member Author

Good catch — fixed in dcedd0e.

The bug was that find is registered command-wide, but find fill/focus/type return the underlying interaction response, and my allowlist-based digest silently dropped cheap, agent-critical signals from those — notably warning (Android blocking-dialog recovery / direct-selector fallback) and message.

Fix: the view is now conservative. The only token sink in a find/get result is the verbose matched snapshot node, which appears solely on a selector READ (text/attrs). The view now acts only on a result that carries such a node and otherwise returns the data unchanged — so node-less shapes (exists/wait/click and the fill/focus/type interaction responses) are never narrowed.

  • text read → drop the redundant node, keep every other cheap field (incl. warning);
  • attrs read → compact the node to its semantic attributes only, keep every other cheap field;
  • everything else (no node) → returned unchanged (same reference).
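A minimal sketch of the node-presence discriminator described in this comment. It is not the merged implementation: the payload type, the `isRecord` helper, the sample payloads, and the choice to treat a string `text` field as the marker of a text read are all assumptions made for illustration.

```typescript
type ResponseLevel = "default" | "digest" | "full";
type Payload = Record<string, unknown>;

const SEMANTIC_KEYS = [
  "role", "type", "subrole", "label", "value",
  "identifier", "enabled", "selected", "focused", "hittable",
] as const;

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function selectorReadView(data: Payload, level: ResponseLevel): Payload {
  if (level !== "digest") return data; // default/full: same reference
  const node = data.node;
  // No snapshot node: exists/wait/click or a fill/focus/type interaction
  // response. Never narrowed, so fields such as `warning` always survive.
  if (!isRecord(node)) return data;

  const rest: Payload = { ...data };
  delete rest.node;

  // Assumption: a string `text` marks a text read, where the text is the answer.
  if (typeof data.text === "string") return rest;

  // Attrs read: keep the node, reduced to its semantic attributes.
  const compact: Payload = {};
  for (const key of SEMANTIC_KEYS) {
    if (node[key] !== undefined) compact[key] = node[key];
  }
  return { ...rest, node: compact };
}

// Made-up payloads for illustration.
const fillResponse: Payload = {
  ref: "e7",
  warning: "dismissed a blocking dialog before filling",
};
const textRead: Payload = {
  ref: "e7",
  text: "Sign in",
  warning: "direct-selector fallback",
  node: { role: "button", rect: { x: 0, y: 0 } },
};

console.log(selectorReadView(fillResponse, "digest") === fillResponse);
console.log(selectorReadView(textRead, "digest"));
```

Switching from an allowlist of top-level fields to "remove or shrink only the node" inverts the failure mode: an unknown field is kept rather than dropped.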

I confirmed against src/contracts/interaction.ts and the touch/interaction handlers that interaction responses build responseData from backendResult/x/y/text/message/warning/ref and never carry a snapshot node, so the node-presence discriminator cleanly separates reads from interactions.

Added a regression test: a find fill success response carrying a warning is returned unchanged under --level digest (asserts same reference + deep-equal). Existing tests stay green.

Local gates re-run: tsc, oxlint --deny-warnings, oxfmt --check, rslib build, fallow audit (no issues in changed files), and vitest (response-views + router-response-level + batch) all pass — 32 tests.

@thymikee

Member Author

Re-reviewed dcedd0e after the warning-loss fix. The response view now only acts on selector-read payloads carrying a snapshot node; node-less interaction shapes pass through unchanged, and text/attrs keep non-node cheap fields. Batch response-level routing still preserves default meta and elides only intermediate non-default steps. Checks are green. No remaining blockers; marking ready-for-human.

thymikee added the ready-for-human label on Jun 30, 2026
thymikee merged commit afcf79a into main on Jun 30, 2026
21 checks passed
thymikee deleted the phase4-find-get-batch-digest branch on June 30, 2026 11:55
@github-actions

PR Preview Action v1.8.1
Preview removed because the pull request was closed.
2026-06-30 11:55 UTC

Labels

ready-for-human Valid work that needs human implementation, judgment, or maintainer merge

1 participant