| 1 | +# ADR 0002: Persistent Platform Helper Sessions |
| 2 | + |
| 3 | +## Status |
| 4 | + |
| 5 | +Accepted |
| 6 | + |
| 7 | +## Context |
| 8 | + |
| 9 | +Some platform automation backends are expensive to start but cheap to reuse. iOS already uses a |
| 10 | +long-lived XCTest runner session with an HTTP transport. That model avoids paying `xcodebuild`, |
| 11 | +runner boot, and XCTest readiness costs for every command, while still allowing the daemon to |
| 12 | +invalidate the runner when the device, app, bundle, or runner process changes. |
| 13 | + |
| 14 | +Android snapshot capture initially used a one-shot instrumentation helper. Every snapshot launched |
| 15 | +`adb shell am instrument`, connected `UiAutomation`, captured the tree, emitted XML, and exited. |
| 16 | +Recent Android snapshot optimizations reduced XML size, idle waiting, extra file I/O, and hidden |
| 17 | +content hint work, but a throwaway prototype still showed that process/session startup dominates |
| 18 | +steady-state latency: |
| 19 | + |
| 20 | +- launcher snapshot: one-shot p50 `227ms`, persistent socket p50 `5.8ms` |
| 21 | +- React Navigation playground snapshot: one-shot p50 `265.7ms`, persistent socket p50 `16.5ms` |
| 22 | + |
| 23 | +The same pressure can appear on new platform adapters. HarmonyOS or other device backends may have |
| 24 | +host tools, test runners, accessibility services, or bridge processes with the same shape: expensive |
| 25 | +startup, cheap repeated commands, and a need for strict invalidation. |
| 26 | + |
| 27 | +## Decision |
| 28 | + |
| 29 | +Use persistent platform helper sessions when a backend has high startup cost and a reusable |
| 30 | +automation context. |
| 31 | + |
| 32 | +A helper session is an optimization layer owned by the daemon, not a replacement for command |
| 33 | +correctness. It may keep processes, sockets, runner state, accessibility service flags, or device |
| 34 | +forwards warm. It must still execute each command against fresh platform state unless a separate |
| 35 | +cache contract has explicit invalidation. |
| 36 | + |
| 37 | +The session pattern is: |
| 38 | + |
| 39 | +- start lazily on the first command that benefits from reuse |
| 40 | +- bind the session to a device identity and helper/runner identity |
| 41 | +- communicate through a small validated protocol with request ids and version metadata |
| 42 | +- reuse the session while the identity and protocol remain valid |
| 43 | +- invalidate on device disconnect, helper reinstall/version change, process exit, socket/protocol |
| 44 | + failure, app/session identity change, or capture options that affect command semantics |
| 45 | +- fall back to the existing one-shot path for the current command when reuse fails |
| 46 | +- make shutdown best effort and make stale sessions disposable |
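
The session pattern above can be sketched as a small daemon-side manager. This is a minimal illustration, not the project's implementation: `SessionIdentity`, `HelperSession`, `SessionManager`, and the injected `start_session` / `run_one_shot` callables are all hypothetical names, and a real identity would likely carry more fields (app, capture options, runner process).

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class SessionIdentity:
    """Everything a session is bound to; any change forces a new session."""
    device_id: str
    helper_version: str
    protocol_version: int


class HelperSession:
    """Hypothetical warm helper: holds a transport and serves fresh requests."""

    def __init__(self, identity: SessionIdentity, send: Callable[[str], str]):
        self.identity = identity
        self._send = send
        self.alive = True

    def request(self, command: str) -> str:
        if not self.alive:
            raise ConnectionError("session already closed")
        return self._send(command)

    def close(self) -> None:
        # Shutdown is best effort: mark the session dead and never raise.
        self.alive = False


class SessionManager:
    """Daemon-side owner of at most one helper session per device."""

    def __init__(self, start_session, run_one_shot):
        self._start_session = start_session  # (identity) -> HelperSession
        self._run_one_shot = run_one_shot    # (identity, command) -> str
        self._sessions: Dict[str, HelperSession] = {}

    def invalidate(self, device_id: str) -> None:
        # Stale sessions are disposable: drop the entry, then close it.
        session = self._sessions.pop(device_id, None)
        if session is not None:
            session.close()

    def run(self, identity: SessionIdentity, command: str) -> str:
        session = self._sessions.get(identity.device_id)
        if session is not None and session.identity != identity:
            # Helper or protocol identity changed: do not reuse.
            self.invalidate(identity.device_id)
            session = None
        try:
            if session is None:
                # Lazy start on the first command that benefits from reuse.
                session = self._start_session(identity)
                self._sessions[identity.device_id] = session
            return session.request(command)
        except (ConnectionError, TimeoutError, ValueError):
            # Reuse failed: invalidate, then serve this command one-shot.
            self.invalidate(identity.device_id)
            return self._run_one_shot(identity, command)
```

Keeping the one-shot runner as an injected callable means the fallback path stays the existing, already-tested code rather than a second implementation.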
| 47 | + |
| 48 | +For Android snapshots, productize a persistent helper mode that keeps `UiAutomation` alive and |
| 49 | +serves fresh snapshot requests over an `adb forward` socket. Do not add snapshot result caching as |
| 50 | +part of that first step. The first reliable win is infrastructure reuse, not data reuse. |
| 51 | + |
| 52 | +For iOS, keep the XCTest runner session as the reference implementation for lifecycle and |
| 53 | +invalidation behavior. Android does not need to copy iOS internals, but it should reuse the same |
| 54 | +daemon-side ideas: per-device session manager, readiness checks, structured protocol errors, |
| 55 | +fallback/invalidation, and request-scoped observability. |
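
As an illustration of "structured protocol errors" combined with request ids and version metadata, the sketch below validates a response envelope before trusting it. The JSON shape (`v`, `id`, `result`), the error codes, and `PROTOCOL_VERSION = 1` are assumptions made for this example; the ADR does not specify a wire format.

```python
import json

PROTOCOL_VERSION = 1  # assumed value for illustration


class ProtocolError(Exception):
    """Structured error: a stable machine-readable code plus a message."""

    def __init__(self, code: str, message: str):
        super().__init__(f"{code}: {message}")
        self.code = code


def encode_request(request_id: int, command: str) -> str:
    """Serialize a request with version metadata and a request id."""
    return json.dumps({"v": PROTOCOL_VERSION, "id": request_id, "command": command})


def decode_response(raw: str, expected_id: int) -> object:
    """Return the result payload, or raise ProtocolError if the envelope is invalid."""
    try:
        message = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ProtocolError("malformed_response", str(exc)) from exc
    if not isinstance(message, dict):
        raise ProtocolError("malformed_response", "expected a JSON object")
    if message.get("v") != PROTOCOL_VERSION:
        raise ProtocolError("version_mismatch", f"helper sent v={message.get('v')!r}")
    if message.get("id") != expected_id:
        # A stale or misrouted reply must never be attributed to this request.
        raise ProtocolError("id_mismatch", f"expected id {expected_id}")
    if "result" not in message:
        raise ProtocolError("malformed_response", "missing 'result' field")
    return message["result"]
```

A session manager can treat any `ProtocolError` as grounds for invalidation and use `code` as the reason it reports.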
| 56 | + |
| 57 | +For future platforms such as HarmonyOS, prefer designing adapters around this same helper-session |
| 58 | +contract when their native automation layer is runner-like. Avoid embedding platform-specific |
| 59 | +startup assumptions directly in command handlers. |
| 60 | + |
| 61 | +## Alternatives Considered |
| 62 | + |
| 63 | +- Keep one-shot helpers only: simplest and robust, but the Android measurements above show it
| 64 | +  leaves a roughly 16x to 39x steady-state snapshot latency improvement (p50) on the table.
| 65 | +- Cache snapshots in the daemon: faster for repeated reads, but unsafe after mutations, animations, |
| 66 | + navigation, system dialogs, or app process changes unless a mutation generation contract exists. |
| 67 | + Cache infrastructure can be added later; it should not be mixed with helper-session reuse. |
| 68 | +- Promote an abstract cross-platform runner immediately: tempting, but premature. iOS XCTest, |
| 69 | + Android instrumentation, macOS helper, Linux AT-SPI, and future HarmonyOS backends have different |
| 70 | + startup and transport mechanics. Share the daemon lifecycle contract first, then extract common |
| 71 | + code only where repetition appears. |
| 72 | +- Replace Android instrumentation with a normal app service: potentially useful, but Android |
| 73 | + `UiAutomation` access is instrumentation-owned. A persistent instrumentation process keeps the |
| 74 | + required privilege model while removing repeated process startup. |
| 75 | + |
| 76 | +## Consequences |
| 77 | + |
| 78 | +Persistent helper sessions should be measured before being productized. A prototype or benchmark |
| 79 | +should show meaningful wall-clock improvement on a realistic app state, not just a trivial screen. |
| 80 | + |
| 81 | +Session managers need more lifecycle tests than one-shot helpers: startup, ready protocol, reuse, |
| 82 | +timeout, malformed response, helper version mismatch, device disconnect, install invalidation, |
| 83 | +shutdown, and one-shot fallback. |
| 84 | + |
| 85 | +Observability should report whether a command used a persistent session, started one, reused one, |
| 86 | +invalidated one, or fell back to one-shot. This keeps CI and user bug reports diagnosable when a |
| 87 | +fast path fails. |
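
One way to make those outcomes reportable is a per-request trace that records session events in order. This is a sketch under assumptions: `SessionEvent`, `RequestTrace`, and the log field names are hypothetical and not taken from the daemon.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class SessionEvent(Enum):
    STARTED = "started"
    REUSED = "reused"
    INVALIDATED = "invalidated"
    FELL_BACK = "fell_back_to_one_shot"


@dataclass
class RequestTrace:
    """Request-scoped record of what happened on the persistent path."""
    request_id: str
    command: str
    events: List[SessionEvent] = field(default_factory=list)

    def record(self, event: SessionEvent) -> None:
        self.events.append(event)

    @property
    def used_persistent_session(self) -> bool:
        # True only if a session served the command and no fallback occurred.
        served = SessionEvent.STARTED in self.events or SessionEvent.REUSED in self.events
        return served and SessionEvent.FELL_BACK not in self.events

    def to_log_fields(self) -> Dict[str, object]:
        """Flatten the trace into fields suitable for a structured log line."""
        return {
            "request_id": self.request_id,
            "command": self.command,
            "persistent": self.used_persistent_session,
            "session_events": [event.value for event in self.events],
        }
```

Recording an ordered list rather than a single status keeps compound cases visible, such as a command that invalidated a stale session and then fell back to one-shot.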
| 88 | + |
| 89 | +Persistent sessions should not make direct interactive commands unexpectedly slow. Use short |
| 90 | +connect/request timeouts for the persistent path, then fall back to the existing one-shot timeout |
| 91 | +budget. |
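
The timeout split can be sketched as follows. The two timeout values, the newline-delimited framing, and the function names are assumptions for illustration; `run_one_shot` stands in for the existing one-shot path.

```python
import socket

PERSISTENT_TIMEOUT_S = 0.5  # assumed: short, so a dead session fails fast
ONE_SHOT_TIMEOUT_S = 10.0   # assumed: stands in for the existing one-shot budget


def request_over_socket(port: int, payload: bytes, timeout_s: float) -> bytes:
    """Send one newline-terminated request to a local port and read one reply line."""
    # The timeout applies to each socket operation, not to the whole exchange.
    with socket.create_connection(("127.0.0.1", port), timeout=timeout_s) as conn:
        conn.sendall(payload + b"\n")
        buffer = b""
        while not buffer.endswith(b"\n"):
            chunk = conn.recv(65536)
            if not chunk:
                raise ConnectionError("helper closed the connection before replying")
            buffer += chunk
        return buffer[:-1]


def snapshot(port: int, run_one_shot) -> bytes:
    """Try the persistent path briefly, then give one-shot its full budget."""
    try:
        return request_over_socket(port, b"snapshot", PERSISTENT_TIMEOUT_S)
    except OSError:
        # Covers refused connections, timeouts, and early disconnects.
        return run_one_shot(ONE_SHOT_TIMEOUT_S)
```

With these example values, a broken session costs roughly the short timeout for each socket operation that stalls before the one-shot path starts, and the one-shot path still runs with its own full timeout.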
| 92 | + |
| 93 | +The daemon remains the owner of session lifecycle. Platform modules may expose helper-session |
| 94 | +operations, but command handlers should not directly manage long-lived helper processes or raw host |
| 95 | +tool state. |
| 96 | + |
| 97 | +This ADR does not require every backend to implement a persistent session. It defines the preferred
| 98 | +shape when the backend has the same startup/reuse economics that the iOS XCTest runner and Android
| 99 | +snapshots now demonstrate.