iOS: HealthKit cross-process sheet fix + O(N) hierarchy walk #3313

Open
Leland-Takamine wants to merge 6 commits into main from sg-healthkit-integration

@Leland-Takamine
Contributor

Summary

iOS cross-process system sheets (HealthKit, share sheet, photo picker, etc.) are stitched into the host app's snapshot, but the foreign process reports frames in its own window's local coordinates rather than screen coordinates. As a result, Maestro's coordinate-based taps landed in the wrong place; most visibly, the HealthKit authorization sheet was untappable.

What changed

Cross-process frame correction in the iOS XCTest runner. ViewHierarchyHandler now walks the XCUIElementSnapshot tree directly and, at each cross-process window boundary, accumulates a coordinate-system offset that's inherited through the subtree. Descendant frames are rewritten to screen coordinates before being shipped to the host.

Boundary detection requires three signals to align before the correction is applied:

  • windowContextID transition between parent and descendant (both non-zero)
  • The descendant subtree contains a snapshot marked isRemote = 1
  • visibleFrame is finite

Requiring the remote signal in addition to the windowContextID transition guards against benign in-process boundaries (e.g. UITextEffectsWindow) where a non-zero visibleFrame delta is ordinary clipping rather than a coordinate-system mismatch.
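A minimal sketch of the three-signal check, for reviewers who want to see the shape of the heuristic in isolation. `SnapshotNode`, `containsRemote`, and `isCrossProcessBoundary` are hypothetical names for illustration, not the runner's actual types; the real code reads these values from XCUIElementSnapshot, partly via KVC.

```swift
import Foundation

// Hypothetical stand-in for the fields the walker reads from a snapshot.
struct SnapshotNode {
    var windowContextID: Int
    var isRemote: Bool
    var frame: CGRect
    var visibleFrame: CGRect
    var children: [SnapshotNode] = []
}

// Signal 2: the node or any of its descendants is marked remote.
func containsRemote(_ node: SnapshotNode) -> Bool {
    node.isRemote || node.children.contains(where: containsRemote)
}

// Signal 3: every component of the rect is a finite number.
func isFinite(_ rect: CGRect) -> Bool {
    rect.origin.x.isFinite && rect.origin.y.isFinite
        && rect.size.width.isFinite && rect.size.height.isFinite
}

// All three signals must align before a correction is applied.
func isCrossProcessBoundary(parent: SnapshotNode, child: SnapshotNode) -> Bool {
    // Signal 1: windowContextID changes and both sides are non-zero.
    let contextChanged = parent.windowContextID != 0
        && child.windowContextID != 0
        && parent.windowContextID != child.windowContextID
    // The subtree scan only runs once the cheap context check has passed.
    return contextChanged && containsRemote(child) && isFinite(child.visibleFrame)
}
```

In this sketch a window such as UITextEffectsWindow changes windowContextID but has no remote descendants, so it is not treated as a boundary.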

The offset is visibleFrame.origin − frame.origin, computed at the boundary node and added to the frame origin of every node in its subtree.
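The inherited offset can be sketched as a single pre-order walk. `FrameNode`, `isBoundary`, and `collectScreenFrames` are hypothetical names; `isBoundary` stands in for the three-signal check, and applying the offset to the boundary node itself (not only its descendants) is an assumption of this sketch.

```swift
import Foundation

// Hypothetical node type; `isBoundary` stands in for the boundary heuristic.
struct FrameNode {
    var frame: CGRect
    var visibleFrame: CGRect
    var isBoundary: Bool = false
    var children: [FrameNode] = []
}

// Appends screen-space frames in pre-order. Each node is visited exactly once.
func collectScreenFrames(_ node: FrameNode, inherited: CGPoint, into out: inout [CGRect]) {
    var offset = inherited
    if node.isBoundary {
        // visibleFrame.origin − frame.origin, added on top of any inherited offset.
        offset.x += node.visibleFrame.origin.x - node.frame.origin.x
        offset.y += node.visibleFrame.origin.y - node.frame.origin.y
    }
    out.append(CGRect(x: node.frame.origin.x + offset.x,
                      y: node.frame.origin.y + offset.y,
                      width: node.frame.size.width,
                      height: node.frame.size.height))
    for child in node.children {
        collectScreenFrames(child, inherited: offset, into: &out)
    }
}
```

For example, a sheet whose window-local frame origin is (0, 0) but whose visibleFrame origin is (45, 200) shifts a button at local (10, 20) to screen (55, 220).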

Attributes are read from the snapshot's own properties (label, frame, identifier, sizes, etc.) rather than via snapshot.dictionaryRepresentation, because that call eagerly serializes the entire subtree and was making the walk O(N·D) when applied per-node. The two private attributes (windowContextID, displayID) go through KVC, matching the pattern already used for isRemote / visibleFrame.
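To make the cost difference concrete, here is a toy model that counts node visits. It does not call any XCTest API: `serializeSubtree` is a hypothetical stand-in for an eager whole-subtree serialization, and `TreeNode` is an illustrative type.

```swift
// Hypothetical tree type used only to count visits.
struct TreeNode {
    var children: [TreeNode] = []
}

// Models an eager serialization: touches every node in the subtree.
func serializeSubtree(_ node: TreeNode, visits: inout Int) {
    visits += 1
    for child in node.children { serializeSubtree(child, visits: &visits) }
}

// Old shape: serialize the whole subtree at every recursion level.
func walkSerializingEachNode(_ node: TreeNode, visits: inout Int) {
    serializeSubtree(node, visits: &visits)
    for child in node.children { walkSerializingEachNode(child, visits: &visits) }
}

// New shape: read only the node's own attributes, then recurse.
func walkDirect(_ node: TreeNode, visits: inout Int) {
    visits += 1
    for child in node.children { walkDirect(child, visits: &visits) }
}

// Builds a single-child chain, the worst case for depth.
func chain(length: Int) -> TreeNode {
    precondition(length >= 1)
    var node = TreeNode()
    for _ in 1..<length { node = TreeNode(children: [node]) }
    return node
}

var perNodeVisits = 0
var directVisits = 0
let deepTree = chain(length: 100)
walkSerializingEachNode(deepTree, visits: &perNodeVisits)
walkDirect(deepTree, visits: &directVisits)
print(perNodeVisits, directVisits) // 5050 100
```

On a 100-node chain the per-node serialization touches 100 + 99 + … + 1 = 5050 nodes, while the direct walk touches 100.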

Demo app additions

  • New Health Access button on the demo app home screen exercises the HealthKit authorization sheet via a MethodChannel to the iOS side, which calls HKHealthStore.requestAuthorization(toShare:read:) against the standard quantity types.
  • New Maestro flow e2e/demo_app/.maestro/issues/fail_health_access.yaml reproduces the bug: launches the app, opens the HealthKit sheet, taps "Turn On All", asserts the toggles flipped, taps "Allow". Tagged failing-when-implemented originally; passes once the fix is applied.

Perf

Direct /viewHierarchy benchmark against a real-world application with a very large view hierarchy (~1.3 MB JSON), iOS 26.1 sim, 50 samples (p50 / p90 / p99 ms):

Build                  p50 (ms)   p90 (ms)   p99 (ms)
main                   1750       1759       1766
final (this branch)    1717       1731       1742

Parity with main while correctly handling cross-process sheets.
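For anyone reproducing the numbers, a minimal nearest-rank percentile helper is sketched below. The PR does not state which percentile method was used, so nearest-rank is an assumption, and `percentile` is a hypothetical helper name.

```swift
// Nearest-rank percentile: the smallest sample such that at least p% of
// samples are less than or equal to it. Returns nil for empty input or
// for p outside (0, 100].
func percentile(_ samples: [Double], _ p: Double) -> Double? {
    guard !samples.isEmpty, p > 0, p <= 100 else { return nil }
    let sorted = samples.sorted()
    // 1-based rank, rounded up, then converted to a 0-based index.
    let rank = Int((p * Double(sorted.count) / 100).rounded(.up))
    return sorted[max(rank, 1) - 1]
}
```

With 50 samples, p50 picks the 25th sorted value, p90 the 45th, and p99 the 50th, so under this method p99 is simply the slowest sample.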

simon-gilmurray and others added 6 commits May 21, 2026 12:37
When a host app snapshot merges a remote window (e.g. HealthKit authorization), descendant frames stay in window-local coords while taps use screen coords. Traverse the XCTest snapshot, detect windowContextID boundaries whose subtree contains remote elements, and apply visibleFrame − frame origin as an inherited offset. AXElement accepts an optional frame override from the walker.

Update the failing HealthKit e2e flow to assert dismissal via "Turn On All" rather than ambiguous "Health Access" text.

Rebuild checked-in Simulator driver zips.

Co-authored-by: Cursor <cursoragent@cursor.com>
The cross-process boundary fix called snapshot.dictionaryRepresentation
at every recursion level. dictionaryRepresentation eagerly serializes
the entire subtree per call, so the walk was O(N*D) (worst case O(N^2))
on deep trees. Benchmarked against a HealthKit screen on iOS 26.1 sim:
p50 /viewHierarchy latency was 2673ms vs 1750ms on main (+52.7%).

Walk the snapshot tree directly and read each node's attributes from
the snapshot's own properties (label, frame, identifier, etc.) plus
KVC for the two non-public ones (windowContextID, displayID). The
cross-process heuristic is unchanged.

Restores per-call latency to main parity (p50 1717ms).
2 participants