…d (Phase 3 step c prep)
Replaces the manual "run with --debug, hand-count the runner phases" check with
an automated, committed assertion so the Phase 3 step (c) runner relocation (and
future runner refactors) can prove byte-identical runner request behavior.
- src/daemon/runner-request-count.ts: pure, unit-testable counter. Parses the
daemon --debug diagnostics ndjson and counts the iOS-runner round-trip phases,
plus baseline parse/compare logic. Owns RUNNER_ROUND_TRIP_PHASES as the single
source of truth, now imported by request-router.ts (was a local const) so the
in-process cost graft and the external counter never drift.
- src/daemon/__tests__/runner-request-count.test.ts: 13 unit tests over synthetic
ndjson fixtures (tolerant parse, counting, baseline parse/compare). Run in the
normal unit suite; no hardware.
- scripts/runner-request-count/: assertion harness (run.ts) + committed baseline
(expected-counts.json). Drives the existing smoke-ios replay scenario with
--debug in an isolated --state-dir, counts runner round-trips from daemon.log,
and asserts against the baseline. --update regenerates the baseline. Infra
hiccups are inconclusive (don't fail); only a real count drift fails.
- .github/workflows/ios.yml: new "Assert iOS runner request count" step in the
smoke-ios job, reusing the booted simulator.
- package.json: `validate:runner-count` script. .fallowrc.json: harness entry.
The baseline ships unarmed (established=false); the harness records observed
counts (printed + uploaded as a test/artifacts artifact) without failing, so the
maintainer arms it once from a real CI run.
Why
Phase 3 step (c) (unwinding macOS out of `platforms/ios` into an `apple/` family; see `plans/apple-platform-consolidation.md` / `plans/phase3-platform-plugin-progress.md` / ADR-0009) relocates the shared Apple XCTest runner and must prove the iOS runner request count is unchanged before/after. Today that check is manual: a human runs commands with `--debug`, reads the per-request ndjson, and hand-counts the runner phases. This PR makes it an automated, committed assertion.

There was no existing standalone harness for this: the request count only existed in-process (the `--cost` graft's `runnerRoundTrips`). This PR closes that gap; it does not duplicate anything.

What it counts (and where the daemon emits it)
The daemon already defines the canonical phase set; this PR makes it the single source of truth shared by the in-process cost graft and the new external counter:

- `ios_runner_command_send`: emitted in `src/platforms/ios/runner-session.ts:629` (`sendRunnerCommandAfterPreflight`, via `withDiagnosticTimer`). This is the command round-trip itself.
- `ios_runner_readiness_preflight`: emitted in `src/platforms/ios/runner-session.ts:663` (`runRunnerReadinessPreflight`). This is the pre-command `uptime` probe (a real network round-trip).

The `..._skipped` / `..._recovered` markers do not hit the runner and are excluded (matching the comment + `RUNNER_ROUND_TRIP_PHASES` previously at `src/daemon/request-router.ts`, now moved to the counter module and imported back). Each `--debug` request appends one JSON object per line to `<state-dir>/daemon.log` (`src/utils/diagnostics.ts` `emitDiagnostic`).

What this PR adds
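For orientation before the itemized changes, the counting rule described above can be sketched roughly as follows. This is an illustration only, not the code in `src/daemon/runner-request-count.ts`: the function name `countRoundTrips`, the `phase` field name, and the sample log lines are assumptions. Only the two phase names and the `[agent-device][diag]` prefix come from this description.

```typescript
// Illustrative sketch; not the module shipped in this PR.
// The two phase names are taken from the PR description.
const ROUND_TRIP_PHASES: ReadonlySet<string> = new Set([
  'ios_runner_command_send',
  'ios_runner_readiness_preflight',
]);

const DIAG_PREFIX = '[agent-device][diag]';

type PhaseCounts = { perPhase: Record<string, number>; total: number };

// ASSUMPTION: each diagnostic record carries its phase name in a `phase` field.
function countRoundTrips(log: string, phaseField = 'phase'): PhaseCounts {
  const perPhase: Record<string, number> = {};
  let total = 0;
  for (const rawLine of log.split('\n')) {
    let line = rawLine.trim();
    // Strip the stderr prefix if the line was captured from stderr.
    if (line.startsWith(DIAG_PREFIX)) line = line.slice(DIAG_PREFIX.length).trim();
    // Blank lines and plain daemon-log lines are not JSON objects; skip them.
    if (!line.startsWith('{')) continue;
    let record: unknown;
    try {
      record = JSON.parse(line);
    } catch {
      continue; // malformed JSON is tolerated, not fatal
    }
    if (typeof record !== 'object' || record === null) continue;
    const phase = (record as Record<string, unknown>)[phaseField];
    // Only phases that actually reach the runner are counted.
    if (typeof phase !== 'string' || !ROUND_TRIP_PHASES.has(phase)) continue;
    perPhase[phase] = (perPhase[phase] ?? 0) + 1;
    total += 1;
  }
  return { perPhase, total };
}

// Synthetic log: 1 preflight + 2 command_send + a non-counted marker + noise.
const sampleLog = [
  'daemon started',
  '{"phase":"ios_runner_readiness_preflight"}',
  '[agent-device][diag] {"phase":"ios_runner_command_send"}',
  '{"phase":"ios_runner_command_send"}',
  '{"phase":"hypothetical_skipped_marker"}', // made-up name; excluded from the count
  '{not json',
  '',
].join('\n');

const sampleCounts = countRoundTrips(sampleLog);
console.log(sampleCounts.total); // 3
```

Keeping the phase set in one exported constant is what lets the in-process cost graft and an external counter like this agree by construction.
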
1. Pure counter: `src/daemon/runner-request-count.ts`
   - `RUNNER_ROUND_TRIP_PHASES` (single source of truth; `request-router.ts` now imports it instead of defining a local copy, with byte-identical behavior).
   - `parseDiagnosticNdjson` (tolerant: skips plain daemon-log lines and blank/malformed lines, strips the `[agent-device][diag]` stderr prefix), `countRunnerRequests`, and pure baseline parse/build/compare helpers. No I/O, no hardware.

2. Unit tests: `src/daemon/__tests__/runner-request-count.test.ts` (13 tests)
   - Synthetic ndjson fixtures covering tolerant parsing, counting (the `request-router-cost.test.ts` mix: 1 preflight + 2 command_send + skipped + unrelated → 3), baseline validation, and drift diffs.
   - Run in the normal unit suite (`vitest --project unit`).

3. Assertion harness: `scripts/runner-request-count/run.ts` + `expected-counts.json`
   - Drives the existing smoke-ios replay scenario (`test/integration/replays/ios/simulator/01-settings.ad`); no new app flow.
   - Uses an isolated `--state-dir`; runs `prepare ios-runner` without `--debug` (so prepare diagnostics don't pollute the count), truncates `daemon.log`, runs the scenario with `--debug --retries 0` (single deterministic attempt), counts runner round-trips from `daemon.log`, and asserts against the committed baseline.
   - `--update` / `--save` regenerates the baseline.
   - `npm`/`pnpm` script: `validate:runner-count`.

4. CI wiring: `.github/workflows/ios.yml`
   - New `Assert iOS runner request count` step in the `smoke-ios` job, reusing the booted simulator (`pnpm clean:daemon` first to release the smoke daemon's UDID lease, then the harness runs its own isolated daemon).

Expected-count baseline (storage + regeneration)
Committed at `scripts/runner-request-count/expected-counts.json`, keyed by scenario, with per-phase counts + total. Regenerate with `node --experimental-strip-types scripts/runner-request-count/run.ts --udid <UDID> --update` (or `pnpm validate:runner-count --udid <UDID> --update`) on a host with a booted simulator + prepared runner.

It ships unarmed (`established: false`) because real counts can only be captured on a simulator (not available locally). While unarmed, the harness records observed counts (printed to the step log + written to `test/artifacts/runner-request-count/expected-counts.observed.json`, which the existing `Upload iOS artifacts` step uploads) and does not fail. To arm the gate: after this lands, read the observed counts from the first `smoke-ios` run and commit them as the baseline (set `established: true`), or run `--update` in CI. Once armed, any count drift fails the step loudly.

Flakiness mitigation
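- Single deterministic attempt (`--retries 0`): retries would double the count.
- Isolated daemon (`--state-dir`) + `clean:daemon` to avoid runner-lease contention with the smoke daemon.
- Infra hiccups are inconclusive (don't fail); only a real count drift fails. `--strict` flips inconclusive to a hard failure for local debugging.

Local vs. CI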
Validated locally (no simulator): full unit suite (2884 tests, incl. the 13 new), `request-router-cost.test.ts` (const move sanity), typecheck, `oxlint --deny-warnings`, `oxfmt --check`, `rslib build`, `fallow audit` (clean on changed files), workflow YAML parse, and the harness end-to-end on the no-udid / `--help` / `--strict` / baseline-parse paths.

Validated only in CI (needs a booted sim + test runner): the actual simulator scenario run, `daemon.log` ndjson capture, and the real counts. That leg runs in the `smoke-ios` job.

Scope
Observability/CI tooling only: no runner/leaf behavior changes; the `--debug` ndjson contract is consumed, not modified. The one `src/` runtime touch is moving `RUNNER_ROUND_TRIP_PHASES` into the counter module and importing it back into `request-router.ts` (behaviorless).
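
To make the armed/unarmed behavior from "Expected-count baseline" concrete, here is a minimal sketch of a baseline comparison. It is a sketch under stated assumptions, not the harness code: the `Baseline` and `Verdict` shapes, the `compareToBaseline` name, the scenario key, and every number are hypothetical. Only the `established` flag, the per-scenario keying, and "per-phase counts + total" come from this description. Treating a missing scenario entry as drift is a choice made for the sketch.

```typescript
// Illustrative sketch; shapes and names are assumptions, not the real file format.
type ScenarioCounts = { perPhase: Record<string, number>; total: number };
type Baseline = { established: boolean; scenarios: Record<string, ScenarioCounts> };
type Verdict =
  | { status: 'recorded' } // unarmed: observe only, never fail
  | { status: 'match' }
  | { status: 'drift'; diffs: string[] };

function compareToBaseline(
  baseline: Baseline,
  scenario: string,
  observed: ScenarioCounts,
): Verdict {
  // An unarmed baseline records observations without gating.
  if (!baseline.established) return { status: 'recorded' };
  const expected = baseline.scenarios[scenario];
  if (expected === undefined) {
    return { status: 'drift', diffs: [`no baseline entry for scenario "${scenario}"`] };
  }
  const diffs: string[] = [];
  // Compare the union of phases so added and removed phases both surface.
  const phases = Array.from(
    new Set([...Object.keys(expected.perPhase), ...Object.keys(observed.perPhase)]),
  ).sort();
  for (const phase of phases) {
    const want = expected.perPhase[phase] ?? 0;
    const got = observed.perPhase[phase] ?? 0;
    if (want !== got) diffs.push(`${phase}: expected ${want}, observed ${got}`);
  }
  if (expected.total !== observed.total) {
    diffs.push(`total: expected ${expected.total}, observed ${observed.total}`);
  }
  return diffs.length === 0 ? { status: 'match' } : { status: 'drift', diffs };
}

// Made-up numbers for illustration only; they are not real CI counts.
const exampleCounts: ScenarioCounts = {
  perPhase: { ios_runner_command_send: 2, ios_runner_readiness_preflight: 1 },
  total: 3,
};
const unarmed: Baseline = { established: false, scenarios: {} };
const armed: Baseline = { established: true, scenarios: { 'example-scenario': exampleCounts } };
const drifted: ScenarioCounts = {
  perPhase: { ios_runner_command_send: 4, ios_runner_readiness_preflight: 1 },
  total: 5,
};

const unarmedVerdict = compareToBaseline(unarmed, 'example-scenario', exampleCounts);
const armedVerdict = compareToBaseline(armed, 'example-scenario', exampleCounts);
const driftVerdict = compareToBaseline(armed, 'example-scenario', drifted);
console.log(unarmedVerdict.status, armedVerdict.status, driftVerdict.status);
// recorded match drift
```

Returning a verdict value instead of throwing keeps the comparison pure, so the caller decides whether `drift` fails the step and how an inconclusive run is reported.
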