fix(device-record): refuse multi-sim ambiguity, thread deviceId through script (closes #173 sub-issue 1)#177
Merged
Conversation
…gh script Closes #173 (sub-issue 1). Symptom: device_record action=start silently picked one of multiple booted simulators non-deterministically, costing the user a 175s recording when it captured iPhone 15 Pro's home screen while iPhone 17 Pro was being driven by Maestro. The shell script's `xcrun simctl io booted recordVideo` and `adb shell screenrecord` both use implicit device resolution that breaks under multi-device conditions. Fix shape: 1. New `deviceId` arg on device_record start (iOS UDID or Android serial). 2. Pre-flight resolver in the TS handler: - 0 candidates → existing NO_DEVICE path - 1 candidate, no deviceId → auto-select; mark autoSelected: true - explicit deviceId (any candidate count) → must match a candidate or refuse with code=DEVICE_AMBIGUOUS listing all candidates. An explicit selection is authoritative — a stale or typo'd UDID is a failure mode, not a hint. - >1 candidates, no deviceId → DEVICE_AMBIGUOUS with the candidate list so the caller knows the valid identifiers (the GH #173 surface) 3. Echo `deviceId` and `autoSelected` in the start response so callers can verify they got the device they meant. 4. record_proof.sh accepts `--udid <UDID>` (iOS) and `--serial <SERIAL>` (Android) flags, threads them into `xcrun simctl io <UDID>` and `adb -s <SERIAL>`. Falls back to `booted`/auto when no flag passed (single-device case, backward-compatible). 5. Android stop path now reads the persisted serial sidecar and uses `adb -s <serial>` for pkill/pull/rm. Without this, multi-device stop would hit "ambiguous device" or operate on the wrong target — codex-pair caught the gap during review. 6. Defensive: on every Android start, unconditionally write OR clear the serial sidecar. Leaving a stale sidecar from a prior recording would misroute the next stop to a now-disconnected device — also codex-pair. Implementation details: - parseAllBootedIosDevices: returns ALL booted simulators (not just first) - parseAllAdbDevices: returns ALL connected devices including physical (parseAdbDevicesEmu in device-screenshot-raw.ts only matches emulator-*; for device_record's ambiguity check we need physical devices too) - resolveTargetDevice: pure function exported for unit tests - Backward-compatible: single-device callers see no change (autoSelected, no flag forwarded to the script) Tests (1481/1481 passing, +10 net new): - parseAllBootedIosDevices: returns all booted, skips Shutdown - parseAllBootedIosDevices: empty/malformed JSON → [] - parseAllAdbDevices: returns emulators AND physical devices (key distinction from device-screenshot-raw's emu-only parser) - parseAllAdbDevices: skips header and empty lines - resolveTargetDevice: 1 candidate, no deviceId → auto-select - resolveTargetDevice: 1 candidate + mismatched deviceId → AMBIGUOUS (the design fix codex-pair pushed back on — explicit-selection-is- authoritative) - resolveTargetDevice: 1 candidate + matching deviceId → use it (not auto-selected) - resolveTargetDevice: >1 candidates + matching deviceId → use it - resolveTargetDevice: >1 candidates + no deviceId → AMBIGUOUS with full candidate list (THE GH #173 fix surface) - resolveTargetDevice: >1 candidates + non-matching deviceId → AMBIGUOUS (typo surfaces fast) Out of scope (not introduced by this PR; flagged by codex-pair as pre-existing issues in record_proof.sh): - /tmp PID file used for `kill` without strict validation (HIGH) - /tmp venv path predictable (HIGH) - Literal `$` instead of `$$` in raw_file/device_path names — typo'd unique suffix (MED) These should be addressed in a separate hardening pass. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Closes the first sub-issue in #173 —
device_recordaction=start silently picking the wrong simulator when more than one is booted, costing the reporter a 175s recording (captured iPhone 15 Pro's home screen while iPhone 17 Pro was being driven by Maestro).Root cause
The shell script's
xcrun simctl io booted recordVideoandadb shell screenrecordboth use implicit device resolution that's non-deterministic under multi-device conditions. The TS handler did no pre-flight ambiguity check, so the bug was invisible until users compared the saved recording to what they thought they were capturing.Fix
Layered TS pre-flight + shell-script flag thread-through:
New
deviceIdarg ondevice_record(iOS UDID or Android serial).Pre-flight resolver in the TS handler with these rules:
deviceIdNO_DEVICEpathautoSelected: trueechoedautoSelected: falseDEVICE_AMBIGUOUS(typo surfaces)DEVICE_AMBIGUOUS+ full candidate list (the GH Session feedback (IX-2997, ~6h): 5 wins + 5 friction items #173 surface)autoSelected: falseDEVICE_AMBIGUOUS+ full listThe key invariant: explicit
deviceIdis always authoritative. A stale UDID from a saved workflow or a typo is treated as a failure mode, not a hint.record_proof.shaccepts--udid <UDID>(iOS) and--serial <SERIAL>(Android) flags, threads them intoxcrun simctl io <UDID>andadb -s <SERIAL>. Falls back to implicit selection when no flag passed (single-device case, backward-compatible).Android stop path now reads the persisted serial sidecar and uses
adb -s <serial>forpkill,pull, andrm— without this, multi-device stop would hit "ambiguous device" or operate on the wrong target. codex-pair caught this gap during review.Defensive: on every Android start, unconditionally write OR clear the serial sidecar. Leaving a stale sidecar from a prior recording would misroute the next stop to a now-disconnected device. Also codex-pair.
Response echoes
deviceIdandautoSelectedso callers can verify they got the device they meant.Test plan
parseAllBootedIosDevicesreturns ALL booted simulators (not just first — distinct from the parser indevice-screenshot-raw.ts)parseAllAdbDevicesreturns emulators AND physical devices (the screenshot parser was emulator-only; for ambiguity detection we need physical devices to count)resolveTargetDevicecovers all 7 rule branches above, including the key design call (1 candidate + mismatcheddeviceIdis AMBIGUOUS, not auto-selected)device_record startwithoutdeviceId→ expectDEVICE_AMBIGUOUS; pass the listed UDID → recording captures that exact deviceOut of scope
Pre-existing issues in
record_proof.shcodex-pair flagged during this review (not introduced by this PR; worth a separate hardening pass):/tmpPID file used forkillwithout strict validation/tmpvenv path predictable\$instead of\$\$inraw_file/device_path— pre-existing typo, broken unique-suffix expansionAlso out of scope: similar shape probably affects
device_screenshotfor >1 booted devices. The screenshot path takesplatformnotdeviceId, and the GH #136 fix from PR #174 already shifted to hard-fail-on-explicit-platform — a follow-up could adddeviceIdsymmetry to screenshot too.Refs
device_screenshot's explicit-platform path🤖 Generated with Claude Code