Commit 1cfb649

docs: record persistent helper session architecture
1 parent 54c5f8f commit 1cfb649

1 file changed

Lines changed: 99 additions & 0 deletions

# ADR 0002: Persistent Platform Helper Sessions

## Status

Accepted

## Context

Some platform automation backends are expensive to start but cheap to reuse. iOS already uses a
long-lived XCTest runner session with an HTTP transport. That model avoids paying `xcodebuild`,
runner boot, and XCTest readiness costs for every command, while still allowing the daemon to
invalidate the runner when the device, app, bundle, or runner process changes.

Android snapshot capture initially used a one-shot instrumentation helper. Every snapshot launched
`adb shell am instrument`, connected `UiAutomation`, captured the tree, emitted XML, and exited.
Recent Android snapshot optimizations reduced XML size, idle waiting, extra file I/O, and hidden
content hint work, but a throwaway prototype still showed that process/session startup dominates
steady-state latency:

- launcher snapshot: one-shot p50 `227ms`, persistent socket p50 `5.8ms`
- React Navigation playground snapshot: one-shot p50 `265.7ms`, persistent socket p50 `16.5ms`
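
To put the two measurements above in relative terms, the following short calculation derives the speedup and the time saved per snapshot. The numbers are copied from the bullets; the dictionary keys and the `speedup` helper are illustrative names, not part of the project.

```python
# p50 latencies in milliseconds, copied from the measurements above:
# (one-shot, persistent socket). The keys are illustrative labels.
measurements = {
    "launcher": (227.0, 5.8),
    "react_navigation_playground": (265.7, 16.5),
}


def speedup(one_shot_ms: float, persistent_ms: float) -> float:
    """Return how many times faster the persistent path is at p50."""
    return one_shot_ms / persistent_ms


for name, (one_shot_ms, persistent_ms) in measurements.items():
    saved_ms = one_shot_ms - persistent_ms
    ratio = speedup(one_shot_ms, persistent_ms)
    print(f"{name}: {ratio:.1f}x faster, {saved_ms:.1f}ms saved per snapshot")
```

That works out to roughly 39x for the launcher and roughly 16x for the React Navigation playground, so the absolute saving (about 220 to 250ms per snapshot) is similar in both cases while the ratio shrinks as the tree gets larger.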

The same pressure can appear on new platform adapters. HarmonyOS or other device backends may have
host tools, test runners, accessibility services, or bridge processes with the same shape: expensive
startup, cheap repeated commands, and a need for strict invalidation.

## Decision

Use persistent platform helper sessions when a backend has high startup cost and a reusable
automation context.

A helper session is an optimization layer owned by the daemon, not a replacement for command
correctness. It may keep processes, sockets, runner state, accessibility service flags, or device
forwards warm. It must still execute each command against fresh platform state unless a separate
cache contract has explicit invalidation.

The session pattern is:

- start lazily on the first command that benefits from reuse
- bind the session to a device identity and helper/runner identity
- communicate through a small validated protocol with request ids and version metadata
- reuse the session while the identity and protocol remain valid
- invalidate on device disconnect, helper reinstall/version change, process exit, socket/protocol
  failure, app/session identity change, or capture options that affect command semantics
- fall back to the existing one-shot path for the current command when reuse fails
- make shutdown best effort and make stale sessions disposable
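
The session pattern above can be sketched as a small per-device manager. This is a minimal illustration under stated assumptions, not the daemon's actual code: `SessionIdentity`, `HelperSession`, `SessionManager`, and the `start_session` / `one_shot` callables are all hypothetical names, and a real implementation would bind to more identity fields and catch narrower error types.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class SessionIdentity:
    """What a session is bound to (hypothetical fields); any change invalidates it."""
    device_id: str
    helper_version: str


class HelperSession:
    """Hypothetical warm helper wrapping a process or socket the daemon keeps alive."""

    def __init__(self, identity: SessionIdentity, request: Callable[[str], str]):
        self.identity = identity
        self.alive = True
        self._request = request

    def send(self, command: str) -> str:
        return self._request(command)

    def close(self) -> None:
        self.alive = False


class SessionManager:
    def __init__(
        self,
        start_session: Callable[[SessionIdentity], HelperSession],
        one_shot: Callable[[SessionIdentity, str], str],
    ):
        self._start_session = start_session
        self._one_shot = one_shot
        self._sessions: Dict[str, HelperSession] = {}

    def invalidate(self, device_id: str) -> None:
        """Drop a session; shutdown is best effort, stale sessions are disposable."""
        session = self._sessions.pop(device_id, None)
        if session is not None:
            try:
                session.close()
            except Exception:
                pass

    def run(self, identity: SessionIdentity, command: str) -> str:
        session = self._sessions.get(identity.device_id)
        # Reuse only while the bound identity still matches and the helper is alive.
        if session is not None and (session.identity != identity or not session.alive):
            self.invalidate(identity.device_id)
            session = None
        try:
            if session is None:
                # Lazy start: pay the startup cost on the first command that benefits.
                session = self._start_session(identity)
                self._sessions[identity.device_id] = session
            return session.send(command)
        except Exception:
            # Sketch only: a real daemon would catch transport/protocol errors here.
            # Invalidate, then serve the current command through the one-shot path.
            self.invalidate(identity.device_id)
            return self._one_shot(identity, command)
```

Keeping the fallback inside `run` means callers always get a result for the current command, and the next command simply attempts a fresh lazy start.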

For Android snapshots, productize a persistent helper mode that keeps `UiAutomation` alive and
serves fresh snapshot requests over an `adb forward` socket. Do not add snapshot result caching as
part of that first step. The first reliable win is infrastructure reuse, not data reuse.

For iOS, keep the XCTest runner session as the reference implementation for lifecycle and
invalidation behavior. Android does not need to copy iOS internals, but it should reuse the same
daemon-side ideas: per-device session manager, readiness checks, structured protocol errors,
fallback/invalidation, and request-scoped observability.
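
One way to picture a "small validated protocol with request ids and version metadata" that yields structured protocol errors is a JSON message with a version field and an echoed request id. The wire format below is purely an assumption for illustration: the field names (`v`, `id`, `command`, `result`), `PROTOCOL_VERSION`, and the error codes are hypothetical and not taken from the project.

```python
import json
from typing import Any

PROTOCOL_VERSION = 1  # hypothetical version number


class ProtocolError(Exception):
    """Structured error: a stable code the daemon can branch on, plus a message."""

    def __init__(self, code: str, message: str):
        super().__init__(f"{code}: {message}")
        self.code = code


def encode_request(request_id: str, command: str) -> str:
    """Serialize a request with version metadata and a request id."""
    return json.dumps({"v": PROTOCOL_VERSION, "id": request_id, "command": command})


def decode_response(raw: str, expected_id: str) -> Any:
    """Validate a helper response before trusting its payload."""
    try:
        message = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ProtocolError("malformed", str(exc)) from exc
    if not isinstance(message, dict):
        raise ProtocolError("malformed", "response is not a JSON object")
    if message.get("v") != PROTOCOL_VERSION:
        raise ProtocolError("version_mismatch", f"got version {message.get('v')!r}")
    if message.get("id") != expected_id:
        raise ProtocolError("id_mismatch", f"expected {expected_id!r}")
    if "result" not in message:
        raise ProtocolError("malformed", "missing result field")
    return message["result"]
```

Any `ProtocolError` would be a signal to invalidate the session and fall back to one-shot for the current command, since a helper that answers with the wrong version or request id can no longer be trusted.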
56+
57+
For future platforms such as HarmonyOS, prefer designing adapters around this same helper-session
58+
contract when their native automation layer is runner-like. Avoid embedding platform-specific
59+
startup assumptions directly in command handlers.
60+
61+
## Alternatives Considered
62+
63+
- Keep one-shot helpers only: simplest and robust, but Android measurements show it leaves an order
64+
of magnitude of steady-state snapshot performance on the table.
65+
- Cache snapshots in the daemon: faster for repeated reads, but unsafe after mutations, animations,
66+
navigation, system dialogs, or app process changes unless a mutation generation contract exists.
67+
Cache infrastructure can be added later; it should not be mixed with helper-session reuse.
68+
- Promote an abstract cross-platform runner immediately: tempting, but premature. iOS XCTest,
69+
Android instrumentation, macOS helper, Linux AT-SPI, and future HarmonyOS backends have different
70+
startup and transport mechanics. Share the daemon lifecycle contract first, then extract common
71+
code only where repetition appears.
72+
- Replace Android instrumentation with a normal app service: potentially useful, but Android
73+
`UiAutomation` access is instrumentation-owned. A persistent instrumentation process keeps the
74+
required privilege model while removing repeated process startup.
75+
76+
## Consequences
77+
78+
Persistent helper sessions should be measured before being productized. A prototype or benchmark
79+
should show meaningful wall-clock improvement on a realistic app state, not just a trivial screen.
80+
81+
Session managers need more lifecycle tests than one-shot helpers: startup, ready protocol, reuse,
82+
timeout, malformed response, helper version mismatch, device disconnect, install invalidation,
83+
shutdown, and one-shot fallback.
84+
85+
Observability should report whether a command used a persistent session, started one, reused one,
86+
invalidated one, or fell back to one-shot. This keeps CI and user bug reports diagnosable when a
87+
fast path fails.
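
The observability requirement above could be modeled as a per-request trace that records each session event. This is a sketch under assumptions: `SessionEvent`, `CommandTrace`, and the log field names are hypothetical, and only the five reported states come from the text.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class SessionEvent(Enum):
    STARTED = "started"
    REUSED = "reused"
    INVALIDATED = "invalidated"
    FELL_BACK = "fell_back_to_one_shot"


@dataclass
class CommandTrace:
    """Request-scoped record of what the helper-session layer did (hypothetical)."""
    request_id: str
    command: str
    events: List[SessionEvent] = field(default_factory=list)

    def record(self, event: SessionEvent) -> None:
        self.events.append(event)

    @property
    def used_persistent_session(self) -> bool:
        # The persistent path served the command only if a session was started
        # or reused and the command did not end up on the one-shot path.
        served = SessionEvent.STARTED in self.events or SessionEvent.REUSED in self.events
        return served and SessionEvent.FELL_BACK not in self.events

    def to_log_fields(self) -> Dict[str, object]:
        return {
            "request_id": self.request_id,
            "command": self.command,
            "used_persistent_session": self.used_persistent_session,
            "helper_session_events": [event.value for event in self.events],
        }
```

A list of events is used instead of a single status because one command can legitimately hit several states, for example reusing a session, invalidating it on a protocol failure, and then falling back to one-shot.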

Persistent sessions should not make direct interactive commands unexpectedly slow. Use short
connect/request timeouts for the persistent path, then fall back to the existing one-shot timeout
budget.
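
A minimal sketch of that timeout split, assuming the persistent helper is reachable on a local TCP port (for example through a host-side forward). The timeout values, the function name, and the framing (write the request, half-close, read until EOF) are illustrative assumptions and not the project's actual protocol.

```python
import socket
from typing import Callable, Tuple

# Hypothetical budgets: short for the fast path, generous for the one-shot path.
PERSISTENT_CONNECT_TIMEOUT_S = 0.25
PERSISTENT_REQUEST_TIMEOUT_S = 1.0
ONE_SHOT_TIMEOUT_S = 30.0


def request_with_fallback(
    host: str,
    port: int,
    payload: bytes,
    one_shot: Callable[..., bytes],
) -> Tuple[bytes, str]:
    """Try the persistent socket briefly, then fall back to the one-shot path."""
    try:
        with socket.create_connection(
            (host, port), timeout=PERSISTENT_CONNECT_TIMEOUT_S
        ) as conn:
            conn.settimeout(PERSISTENT_REQUEST_TIMEOUT_S)
            conn.sendall(payload)
            conn.shutdown(socket.SHUT_WR)  # assumed framing: half-close ends the request
            chunks = []
            while True:
                chunk = conn.recv(65536)
                if not chunk:
                    break
                chunks.append(chunk)
            return b"".join(chunks), "persistent"
    except OSError:
        # Covers refused connections, resets, and socket timeouts.
        # The one-shot path keeps its own, longer timeout budget.
        return one_shot(timeout_s=ONE_SHOT_TIMEOUT_S), "one-shot"
```

With these example values a dead helper delays a command by at most the connect timeout plus one request timeout per `recv` wait before the one-shot path takes over, which keeps the worst case close to the pre-existing one-shot latency.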
92+
93+
The daemon remains the owner of session lifecycle. Platform modules may expose helper-session
94+
operations, but command handlers should not directly manage long-lived helper processes or raw host
95+
tool state.
96+
97+
This ADR does not require every backend to implement a persistent session. It defines the preferred
98+
shape when the backend has the same startup/reuse economics that iOS and Android snapshots now
99+
demonstrate.
