| 1 | +# ADR 0004: iOS Snapshot Backend Strategy |
| 2 | + |
| 3 | +## Status |
| 4 | + |
| 5 | +Accepted |
| 6 | + |
| 7 | +## Context |
| 8 | + |
| 9 | +Agent Device exposes iOS UI state through snapshots produced by the long-lived XCTest runner. The |
| 10 | +runner has two different snapshot needs: |
| 11 | + |
| 12 | +- rich diagnostics and selector disambiguation, where a recursive XCTest snapshot is useful because |
| 13 | + it preserves hierarchy, static text, wrappers, scroll containers, and ancestry; |
| 14 | +- agent-facing compact interactive context, where the important contract is fast, bounded discovery |
| 15 | + of visible controls and stable refs for the next action. |
| 16 | + |
| 17 | +These needs should not share one capture strategy blindly. Recursive `XCUIElement.snapshot()` is |
| 18 | +rich, but some real simulator app trees can make XCTest fail with `kAXErrorIllegalArgument` while |
| 19 | +the same app remains visually usable and can be inspected by lower-level simulator accessibility |
| 20 | +services. Bluesky is the current known example: Argent's `ax-service` can describe the screen, but |
| 21 | +XCTest recursive snapshots and typed `XCUIElementQuery` enumeration can degrade to no useful child |
| 22 | +nodes. |
| 23 | + |
| 24 | +This is different from presentation filtering. The daemon's snapshot presentation can hide noisy |
| 25 | +or inaccessible nodes, but it cannot recover nodes that XCTest never returns. More filters, |
| 26 | +Maestro-specific heuristics, or retries in the daemon would only make this failure slower and less |
| 27 | +predictable. |
| 28 | + |
| 29 | +## Decision |
| 30 | + |
| 31 | +Keep XCTest as the default iOS automation runner and split iOS snapshot capture into explicit |
| 32 | +strategies: |
| 33 | + |
| 34 | +- **Full tree strategy**: use recursive XCTest snapshots for normal/full snapshots, raw snapshots, |
| 35 | + diagnostics, and cases that need hierarchy. If XCTest reports a real AX serialization failure, |
| 36 | + preserve that error instead of pretending the UI is empty. |
| 37 | +- **Compact interactive strategy**: for `snapshot -i -c`, use a bounded flat XCTest query strategy |
| 38 | + that avoids recursive root snapshots and app/window property reads. It should prefer fast, |
| 39 | + one-screen actionability over hierarchy fidelity and should return a sparse root quickly when |
| 40 | + XCTest cannot enumerate controls. |
| 41 | +- **Future simulator AX-service strategy**: treat Bluesky-class failures as evidence that XCTest is |
| 42 | + not a complete semantic snapshot backend. A robust semantic fix should add a host-side simulator |
| 43 | + accessibility backend, similar in role to `idb` accessibility commands or Argent's `ax-service`, |
| 44 | + and normalize its output into the same `SnapshotNode` model. That backend can be simulator-only; |
| 45 | + physical devices can continue using XCTest unless a supported lower-level API exists. |
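
The split described in the list above can be pictured as a small routing function. The following is a minimal sketch only: `SnapshotStrategy`, `SnapshotRequest`, and `chooseStrategy` are hypothetical names, not identifiers from the Agent Device codebase, and the handling of `-i` or `-c` used on their own, and of raw combined with `-i -c`, is an assumption because the ADR only names the `snapshot -i -c` combination.

```typescript
// Hypothetical sketch: none of these names come from the Agent Device codebase.
type SnapshotStrategy =
  | "full-tree"             // recursive XCTest snapshot
  | "compact-interactive"   // bounded flat XCTest query
  | "simulator-ax-service"; // future backend; this sketch never selects it

interface SnapshotRequest {
  interactiveOnly: boolean; // assumed to map to `-i`
  compact: boolean;         // assumed to map to `-c`
  raw: boolean;             // raw snapshots always need the full tree
}

function chooseStrategy(req: SnapshotRequest): SnapshotStrategy {
  // Assumption: raw output wins over the compact flags because it needs hierarchy.
  if (req.raw) {
    return "full-tree";
  }
  // Only the combined `-i -c` request is routed to the flat strategy.
  if (req.interactiveOnly && req.compact) {
    return "compact-interactive";
  }
  // Assumption: everything else (including `-i` or `-c` alone) stays on the full tree.
  return "full-tree";
}

console.log(chooseStrategy({ interactiveOnly: true, compact: true, raw: false }));
```

Keeping the choice in one place makes the ownership question explicit: a new behavior either belongs to one of these branches or needs a new strategy.
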
| 46 | + |
| 47 | +The daemon should make degraded compact output observable. If an iOS compact interactive snapshot |
| 48 | +contains only the synthetic application root, surface a warning so agents know the snapshot is |
| 49 | +bounded fallback output rather than proof that the screen has no controls. |
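
As an illustration of the warning rule, the check could look roughly like the sketch below. The `SnapshotNode` shape here is a simplified stand-in for the real model, and the `"Application"` type string, `annotateCompactSnapshot`, and the warning text are all assumptions made for the example.

```typescript
// Simplified stand-in for the real `SnapshotNode` model; field names are assumptions.
interface SnapshotNode {
  type: string;
  label?: string;
  children: SnapshotNode[];
}

interface CompactSnapshotResult {
  root: SnapshotNode;
  warnings: string[];
}

// Hypothetical wording; the ADR does not specify the message text.
const DEGRADED_COMPACT_WARNING =
  "Compact interactive snapshot contains only the application root; " +
  "this is bounded fallback output, not proof that the screen has no controls.";

function annotateCompactSnapshot(root: SnapshotNode): CompactSnapshotResult {
  // A lone application root means XCTest enumerated no controls.
  const onlySyntheticRoot =
    root.type === "Application" && root.children.length === 0;
  return {
    root,
    warnings: onlySyntheticRoot ? [DEGRADED_COMPACT_WARNING] : [],
  };
}

const degraded = annotateCompactSnapshot({ type: "Application", children: [] });
console.log(degraded.warnings.length); // 1
```

The snapshot itself is returned unchanged; only the warning channel distinguishes a degraded capture from a screen that really has no controls.
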
| 50 | + |
| 51 | +## Regression Notes |
| 52 | + |
| 53 | +PR #639 made XCTest AX serialization failures explicit instead of swallowing them as empty |
| 54 | +snapshots. That was the correct diagnostic change, but it exposed apps whose accessibility trees |
| 55 | +XCTest cannot serialize. |
| 56 | + |
| 57 | +The first compact fallback still performed several XCTest reads (`app.label`, `app.identifier`, |
| 58 | +`app.frame`, window frame lookup) before enumerating flat controls. On broken trees those reads can |
| 59 | +hit the same AX failure path, which made `snapshot -i -c` much slower than the plain snapshot in |
| 60 | +some apps. PR #700 changed compact interactive snapshots to enter the flat strategy immediately and |
| 61 | +avoid those app/window reads. |
| 62 | + |
| 63 | +## Consequences |
| 64 | + |
| 65 | +Compact interactive snapshots are allowed to be less complete than full snapshots, but they must be |
| 66 | +bounded and honest. They should never block for the full daemon snapshot timeout because one app has |
| 67 | +a pathological AX tree. |
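
The ADR achieves this bound mainly by avoiding slow reads inside the runner. As a complementary illustration, a caller could also cap how long it waits for a compact capture. The sketch below is hypothetical: `withBudget` is not a function from the codebase, and the budget value is left to the caller because the ADR does not state one.

```typescript
// Hypothetical helper: resolves with the capture result, or with `fallback`
// once `budgetMs` has elapsed. Errors from the capture are not swallowed.
interface BudgetedResult<T> {
  value: T;
  timedOut: boolean;
}

async function withBudget<T>(
  capture: Promise<T>,
  budgetMs: number,
  fallback: T,
): Promise<BudgetedResult<T>> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<BudgetedResult<T>>((resolve) => {
    timer = setTimeout(() => resolve({ value: fallback, timedOut: true }), budgetMs);
  });
  try {
    // Whichever settles first wins; a rejected capture rejects the race.
    return await Promise.race([
      capture.then((value) => ({ value, timedOut: false })),
      timeout,
    ]);
  } finally {
    // Clear the timer so a fast capture does not keep the process alive.
    if (timer !== undefined) {
      clearTimeout(timer);
    }
  }
}
```

This helper only stops waiting; it does not cancel the underlying capture. The `timedOut` flag gives the caller a place to attach a degraded-output warning, so a sparse result stays honest about why it is sparse.
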
| 68 | + |
| 69 | +Full snapshots remain the right tool when hierarchy matters. They may still fail loudly on |
| 70 | +XCTest-broken trees; that failure is useful because retrying the same recursive capture is unlikely |
| 71 | +to reveal a different tree. |
| 72 | + |
| 73 | +A future AX-service backend is the correct place to regain Bluesky-class semantic coverage. It |
| 74 | +should be added as a platform backend with its own lifecycle, protocol, normalization, timing |
| 75 | +metrics, and fallback rules, not as another special case inside the XCTest runner. |
| 76 | + |
| 77 | +When adding new iOS snapshot behavior, maintainers should first decide which strategy owns it. If a |
| 78 | +change tries to make compact snapshots rich by reintroducing recursive snapshots, or tries to make |
| 79 | +full snapshots fast by hiding XCTest failures, it is probably crossing strategy boundaries. |