Skip to content

Commit ad6268d

Browse files
committed
docs: plan ios runner protocol optimizations
1 parent 16b8c7d commit ad6268d

4 files changed

Lines changed: 147 additions & 10 deletions

File tree

Lines changed: 144 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,144 @@
1+
# iOS runner protocol optimization plan
2+
3+
Issue #656 is now split into protocol infrastructure plus follow-up optimizations. The lifecycle
4+
protocol makes commands identifiable, but the performance wins come from changing when the daemon
5+
uses `uptime`, retries, invalidates sessions, and asks the runner for lifecycle status.
6+
7+
## Work slices
8+
9+
### 1. Status-before-invalidate recovery
10+
11+
Status: in progress on `codex/ios-runner-status-recovery`.
12+
13+
Goal: when a command has been sent and the HTTP response is lost, ask the runner for
14+
`status(statusCommandId)` before invalidating the session or surfacing an ambiguous transport
15+
failure.
16+
17+
Acceptance criteria:
18+
19+
- Post-send retryable transport failures issue one bounded `status` probe with the original
20+
`commandId` before session invalidation.
21+
- `completed` with retained small response JSON returns the recovered command result without
22+
invalidating or resending the command.
23+
- `failed` returns the runner failure code/message/hint instead of a generic transport failure.
24+
- `notAccepted`, status timeout, or status transport failure preserves the existing invalidation
25+
behavior.
26+
- Read-only commands whose response was not retained keep the existing retry behavior.
27+
- Status recovery probes are short-budget and do not consume the full command timeout.
28+
29+
iOS simulator validation:
30+
31+
- Unit: `pnpm exec vitest run src/platforms/ios/__tests__/runner-command-retry.test.ts`.
32+
- Unit bundle: `pnpm exec vitest run src/platforms/ios/__tests__/runner-client.test.ts src/platforms/ios/__tests__/runner-session.test.ts src/platforms/ios/__tests__/runner-command-retry.test.ts src/platforms/ios/__tests__/runner-provider.test.ts`.
33+
- Build: `pnpm build:xcuitest`.
34+
- Manual sim smoke after build:
35+
- `pnpm build`
36+
- `pnpm clean:daemon`
37+
- run a simple iOS simulator session against Settings with `open`, `snapshot -i`, one selector
38+
interaction, and `close`.
39+
- confirm there is no visible behavior change and diagnostics show no unexpected session
40+
invalidation.
41+
42+
### 2. Adaptive `uptime` preflight policy
43+
44+
Goal: stop paying eager `uptime` before low-risk mutating commands when the runner has recently
45+
completed a command, relying on status-before-invalidate recovery for the rare ambiguous transport
46+
failure.
47+
48+
Acceptance criteria:
49+
50+
- Existing first-command/startup readiness behavior is preserved.
51+
- Existing failed-preflight stale-session recovery is preserved.
52+
- Repeated hot interactions skip `uptime` when the runner has a recent successful response.
53+
- Commands that still need conservative readiness checks remain preflighted until measured.
54+
- A transport failure after skipping preflight runs status recovery before invalidation.
55+
- Diagnostics expose whether a command used, skipped, or recovered from a readiness preflight.
56+
57+
iOS simulator validation:
58+
59+
- Start a fresh simulator session and run one interaction: verify the first mutating command still
60+
preflights.
61+
- Run a hot loop of repeated selector interactions against the same visible control: verify only
62+
the first command pays `uptime`, subsequent commands emit `ios_runner_readiness_preflight_skipped`,
63+
and the UI still responds correctly.
64+
- Compare median command latency for a hot interaction loop before and after the change. A useful
65+
threshold is at least one fewer runner request per hot command and no increase in failure rate.
66+
67+
### 3. Status-visible transport path
68+
69+
Goal: make `accepted` and `started` states practically observable while a command is still running.
70+
The Swift journal already records these states, but the runner currently serializes connection
71+
handling, so a concurrent status request can be blocked behind the command it is querying.
72+
73+
Acceptance criteria:
74+
75+
- `status` can be answered while another runner command is waiting on main-thread XCTest work.
76+
- The status path remains journal-only and does not touch app activation, XCTest dispatch, or
77+
command retry logic.
78+
- Long-running command status can report `accepted` or `started` before the command reaches a
79+
terminal state.
80+
- Existing command execution remains serial where mutation ordering matters.
81+
82+
iOS simulator validation:
83+
84+
- Run a deliberately long runner command in one request.
85+
- While it is in flight, query `status(statusCommandId)` from another request.
86+
- Verify status returns before the long command completes and reports `accepted` or `started`.
87+
- Verify normal command ordering is unchanged for back-to-back mutating commands.
88+
89+
### 4. Session invalidation reduction
90+
91+
Goal: avoid tearing down otherwise healthy runner sessions when lifecycle status proves the command
92+
completed or failed cleanly.
93+
94+
Acceptance criteria:
95+
96+
- Completed/failed lifecycle status suppresses invalidation for ambiguous post-send transport
97+
errors when the runner remains reachable.
98+
- Unknown status states still invalidate to preserve current safety.
99+
- Diagnostics record why invalidation was skipped or retained.
100+
- No command is replayed after an observed mutating `accepted`, `started`, `completed`, or `failed`
101+
state.
102+
103+
iOS simulator validation:
104+
105+
- Inject or simulate a lost response after a command completes.
106+
- Verify status recovery prevents runner restart.
107+
- Run the next command in the same session and verify it succeeds without re-launching xcodebuild.
108+
109+
### 5. Response retention tuning
110+
111+
Goal: retain enough small command results for useful recovery without making the runner retain large
112+
snapshots or binary-like payloads.
113+
114+
Acceptance criteria:
115+
116+
- Small scalar responses can be recovered from `lifecycleResponseJson`.
117+
- Snapshot node trees and screenshots are not serialized or retained in the journal.
118+
- The journal memory cap remains bounded by entry count and response JSON size.
119+
- Retention policy is documented in tests or runner fixtures so future commands do not accidentally
120+
store large payloads.
121+
122+
iOS simulator validation:
123+
124+
- Run small-result commands and verify status can recover retained JSON.
125+
- Run snapshot-heavy commands and verify status reports terminal state without retained response JSON.
126+
- Confirm the runner remains responsive after repeated snapshots.
127+
128+
## Suggested ordering
129+
130+
1. Land status-before-invalidate recovery first. It is the safety net needed before reducing
131+
defensive preflights.
132+
2. Add diagnostics/metrics for preflight use, skipped preflights, status recovery, and invalidation
133+
reason. This can happen alongside slice 1 or 2.
134+
3. Reduce `uptime` for hot interaction loops with a conservative command allowlist.
135+
4. Make the status transport path observable during long-running commands.
136+
5. Broaden the preflight policy only after simulator measurements show stable behavior.
137+
138+
## Side-by-side work
139+
140+
- Status recovery and diagnostics can be developed together or separately.
141+
- Transport status visibility can proceed independently once the protocol is on `main`.
142+
- Adaptive `uptime` should wait for status recovery, because it relies on the same recovery path for
143+
ambiguous post-send failures.
144+
- Response retention tuning can proceed independently as long as it preserves the current caps.

src/compat/maestro/__tests__/runtime-assertions.test.ts

Lines changed: 2 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -123,9 +123,7 @@ test('invokeMaestroAssertVisible does not dismiss React Native overlays during n
123123
});
124124

125125
assert.equal(response.ok, false);
126-
assert.deepEqual(calls, [
127-
['wait', ['Ready', '60000']],
128-
]);
126+
assert.deepEqual(calls, [['wait', ['Ready', '60000']]]);
129127
});
130128

131129
test('invokeMaestroAssertVisible uses snapshot resolution for short iOS assertions', async () => {
@@ -267,9 +265,7 @@ test('invokeMaestroAssertVisible fails fast when a RedBox has no dismiss target'
267265
if (!response.ok) {
268266
assert.match(response.error.message, /React Native overlay is covering app content/);
269267
}
270-
assert.deepEqual(calls, [
271-
['snapshot', []],
272-
]);
268+
assert.deepEqual(calls, [['snapshot', []]]);
273269
});
274270

275271
test('invokeMaestroAssertNotVisible passes after a slow hidden sample exhausts the timeout', async () => {

src/compat/maestro/runtime-assertions.ts

Lines changed: 1 addition & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -151,9 +151,7 @@ function handleFailedVisibleSample(
151151
args: MaestroVisibilityAssertionArgs,
152152
sample: Exclude<MaestroVisibilitySample, { visible: true }>,
153153
startedAt: number,
154-
):
155-
| { kind: 'continue' }
156-
| { kind: 'return'; response: DaemonResponse } {
154+
): { kind: 'continue' } | { kind: 'return'; response: DaemonResponse } {
157155
if (isReactNativeOverlayBlockingAssertion(sample.response)) {
158156
return { kind: 'return', response: sample.response };
159157
}

src/compat/maestro/runtime-interactions.ts

Lines changed: 0 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -481,7 +481,6 @@ async function clickMaestroSnapshotTarget(
481481
};
482482
}
483483

484-
485484
async function invokeMaestroFuzzyTapOn(
486485
params: MaestroTapOnParams,
487486
query: string,

0 commit comments

Comments
 (0)