Skip to content

fix(device-record): refuse multi-sim ambiguity, thread deviceId through script (closes #173 sub-issue 1)#177

Merged
Lykhoyda merged 1 commit into
mainfrom
fix/gh-173-device-record-multi-sim
May 20, 2026
Merged

fix(device-record): refuse multi-sim ambiguity, thread deviceId through script (closes #173 sub-issue 1)#177
Lykhoyda merged 1 commit into
mainfrom
fix/gh-173-device-record-multi-sim

Conversation

@Lykhoyda
Copy link
Copy Markdown
Owner

Summary

Closes the first sub-issue in #173device_record action=start silently picking the wrong simulator when more than one is booted, costing the reporter a 175s recording (captured iPhone 15 Pro's home screen while iPhone 17 Pro was being driven by Maestro).

Root cause

The shell script's xcrun simctl io booted recordVideo and adb shell screenrecord both use implicit device resolution that's non-deterministic under multi-device conditions. The TS handler did no pre-flight ambiguity check, so the bug was invisible until users compared the saved recording to what they thought they were capturing.

Fix

Layered TS pre-flight + shell-script flag thread-through:

  1. New deviceId arg on device_record (iOS UDID or Android serial).

  2. Pre-flight resolver in the TS handler with these rules:

    Candidates deviceId Result
    0 any existing NO_DEVICE path
    1 not passed auto-select; autoSelected: true echoed
    1 matches use it; autoSelected: false
    1 does NOT match refuse with DEVICE_AMBIGUOUS (typo surfaces)
    ≥2 not passed refuse with DEVICE_AMBIGUOUS + full candidate list (the GH Session feedback (IX-2997, ~6h): 5 wins + 5 friction items #173 surface)
    ≥2 matches one use it; autoSelected: false
    ≥2 does NOT match refuse with DEVICE_AMBIGUOUS + full list

    The key invariant: explicit deviceId is always authoritative. A stale UDID from a saved workflow or a typo is treated as a failure mode, not a hint.

  3. record_proof.sh accepts --udid <UDID> (iOS) and --serial <SERIAL> (Android) flags, threads them into xcrun simctl io <UDID> and adb -s <SERIAL>. Falls back to implicit selection when no flag passed (single-device case, backward-compatible).

  4. Android stop path now reads the persisted serial sidecar and uses adb -s <serial> for pkill, pull, and rm — without this, multi-device stop would hit "ambiguous device" or operate on the wrong target. codex-pair caught this gap during review.

  5. Defensive: on every Android start, unconditionally write OR clear the serial sidecar. Leaving a stale sidecar from a prior recording would misroute the next stop to a now-disconnected device. Also codex-pair.

  6. Response echoes deviceId and autoSelected so callers can verify they got the device they meant.

Test plan

  • 1481/1481 cdp-bridge unit tests passing (+10 net new)
  • parseAllBootedIosDevices returns ALL booted simulators (not just first — distinct from the parser in device-screenshot-raw.ts)
  • parseAllAdbDevices returns emulators AND physical devices (the screenshot parser was emulator-only; for ambiguity detection we need physical devices to count)
  • resolveTargetDevice covers all 7 rule branches above, including the key design call (1 candidate + mismatched deviceId is AMBIGUOUS, not auto-selected)
  • CI green
  • Live verification: boot 2 simulators, call device_record start without deviceId → expect DEVICE_AMBIGUOUS; pass the listed UDID → recording captures that exact device

Out of scope

Pre-existing issues in record_proof.sh codex-pair flagged during this review (not introduced by this PR; worth a separate hardening pass):

  • HIGH: /tmp PID file used for kill without strict validation
  • HIGH: /tmp venv path predictable
  • MED: Literal \$ instead of \$\$ in raw_file/device_path — pre-existing typo, broken unique-suffix expansion

Also out of scope: similar shape probably affects device_screenshot for >1 booted devices. The screenshot path takes platform not deviceId, and the GH #136 fix from PR #174 already shifted to hard-fail-on-explicit-platform — a follow-up could add deviceId symmetry to screenshot too.

Refs

🤖 Generated with Claude Code

…gh script

Closes #173 (sub-issue 1).

Symptom: device_record action=start silently picked one of multiple booted
simulators non-deterministically, costing the user a 175s recording when
it captured iPhone 15 Pro's home screen while iPhone 17 Pro was being
driven by Maestro. The shell script's `xcrun simctl io booted recordVideo`
and `adb shell screenrecord` both use implicit device resolution that
breaks under multi-device conditions.

Fix shape:

1. New `deviceId` arg on device_record start (iOS UDID or Android serial).
2. Pre-flight resolver in the TS handler:
   - 0 candidates → existing NO_DEVICE path
   - 1 candidate, no deviceId → auto-select; mark autoSelected: true
   - explicit deviceId (any candidate count) → must match a candidate
     or refuse with code=DEVICE_AMBIGUOUS listing all candidates. An
     explicit selection is authoritative — a stale or typo'd UDID is
     a failure mode, not a hint.
   - >1 candidates, no deviceId → DEVICE_AMBIGUOUS with the candidate
     list so the caller knows the valid identifiers (the GH #173 surface)
3. Echo `deviceId` and `autoSelected` in the start response so callers
   can verify they got the device they meant.
4. record_proof.sh accepts `--udid <UDID>` (iOS) and `--serial <SERIAL>`
   (Android) flags, threads them into `xcrun simctl io <UDID>` and
   `adb -s <SERIAL>`. Falls back to `booted`/auto when no flag passed
   (single-device case, backward-compatible).
5. Android stop path now reads the persisted serial sidecar and uses
   `adb -s <serial>` for pkill/pull/rm. Without this, multi-device stop
   would hit "ambiguous device" or operate on the wrong target — codex-pair
   caught the gap during review.
6. Defensive: on every Android start, unconditionally write OR clear the
   serial sidecar. Leaving a stale sidecar from a prior recording would
   misroute the next stop to a now-disconnected device — also codex-pair.

Implementation details:
- parseAllBootedIosDevices: returns ALL booted simulators (not just first)
- parseAllAdbDevices: returns ALL connected devices including physical
  (parseAdbDevicesEmu in device-screenshot-raw.ts only matches emulator-*;
  for device_record's ambiguity check we need physical devices too)
- resolveTargetDevice: pure function exported for unit tests
- Backward-compatible: single-device callers see no change (autoSelected,
  no flag forwarded to the script)

Tests (1481/1481 passing, +10 net new):
- parseAllBootedIosDevices: returns all booted, skips Shutdown
- parseAllBootedIosDevices: empty/malformed JSON → []
- parseAllAdbDevices: returns emulators AND physical devices (key
  distinction from device-screenshot-raw's emu-only parser)
- parseAllAdbDevices: skips header and empty lines
- resolveTargetDevice: 1 candidate, no deviceId → auto-select
- resolveTargetDevice: 1 candidate + mismatched deviceId → AMBIGUOUS
  (the design fix codex-pair pushed back on — explicit-selection-is-
  authoritative)
- resolveTargetDevice: 1 candidate + matching deviceId → use it (not
  auto-selected)
- resolveTargetDevice: >1 candidates + matching deviceId → use it
- resolveTargetDevice: >1 candidates + no deviceId → AMBIGUOUS with
  full candidate list (THE GH #173 fix surface)
- resolveTargetDevice: >1 candidates + non-matching deviceId →
  AMBIGUOUS (typo surfaces fast)

Out of scope (not introduced by this PR; flagged by codex-pair as
pre-existing issues in record_proof.sh):
- /tmp PID file used for `kill` without strict validation (HIGH)
- /tmp venv path predictable (HIGH)
- Literal `$` instead of `$$` in raw_file/device_path names — typo'd
  unique suffix (MED)
These should be addressed in a separate hardening pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@Lykhoyda Lykhoyda merged commit b3324a7 into main May 20, 2026
7 checks passed
@Lykhoyda Lykhoyda deleted the fix/gh-173-device-record-multi-sim branch May 20, 2026 11:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant