Skip to content

fix(ble): cleanup races discovered while reviewing #5207#5221

Merged
jamesarich merged 2 commits intomainfrom
fix/ble-cleanup-followups
Apr 22, 2026
Merged

fix(ble): cleanup races discovered while reviewing #5207#5221
jamesarich merged 2 commits intomainfrom
fix/ble-cleanup-followups

Conversation

@jamesarich
Copy link
Copy Markdown
Collaborator

Two follow-up cleanup races found while reviewing the GATT-133 reconnect fix in #5207. Both are latent today (narrow timing windows) but become more likely under aggressive reconnect / multi-device scenarios.

Changes

  • KableBleConnection.connect() ownership install — Kable allocates a per-peripheral coroutine scope and Bluetooth-state observer eagerly when Peripheral(...) is constructed. Cancellation between peripheral construction, field assignment, and ActiveBleConnection.active = … could strand the new peripheral with no field referencing it for cleanup. Wrap the three-step ownership install in withContext(NonCancellable). _deviceFlow.emit() is intentionally left outside the block to avoid hanging teardown on a slow collector.

  • KableBleConnection.disconnect() identity checkdisconnect() cleared ActiveBleConnection.active unconditionally. A stale disconnect from an earlier connection attempt's exception handler running after a newer connection had installed itself as active would clobber the newer entry. Capture the owned peripheral at the start of disconnect() and only clear ActiveBleConnection if it still references that same instance.

  • BleRadioTransport exception handler cleanup scope — the CoroutineExceptionHandler on connectionScope launched its last-ditch bleConnection.disconnect() on the per-transport scope. When upper layers tear scope down in response to the same uncaught exception that triggered the handler, scope.launch can fail to start at all, defeating the cleanup and leaking BluetoothGatt (status 133 on the next reconnect). Use a dedicated cleanupScope built from scope's context minus its Job plus a fresh SupervisorJob so the cleanup launch is independent of the parent's cancellation state. Cancelled in close() once our own disconnect completes.

Verification

  • ./gradlew :core:network:spotlessApply :core:ble:spotlessApply :core:network:detekt :core:ble:detekt
  • ./gradlew :core:network:allTests :core:ble:allTests

Follow-up to #5207.

jamesarich and others added 2 commits April 22, 2026 15:58
…hip install

Two related races discovered while reviewing the GATT-133 reconnect crash fix
(#5207):

1. KableBleConnection.connect() created a new Peripheral via Kable, then
   performed cleanUpPeripheral() / peripheral = p / ActiveBleConnection.active
   = ... in three separate cancellable steps. Kable allocates a per-peripheral
   coroutine scope and Bluetooth-state observer eagerly at Peripheral
   construction time, so cancellation between any of these steps could strand
   the new peripheral with no field referencing it for cleanup. Wrap the
   ownership install in withContext(NonCancellable) so it completes as a
   single unit. _deviceFlow.emit() is intentionally left outside the block to
   avoid hanging teardown on a slow collector.

2. KableBleConnection.disconnect() unconditionally cleared
   ActiveBleConnection.active. If a stale disconnect from an earlier
   connection attempt's exception handler ran after a newer connection had
   already installed itself as active, it would clobber the newer entry.
   Capture the peripheral we own at the start of disconnect() and only clear
   ActiveBleConnection if it still references that same instance.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ndler

The CoroutineExceptionHandler attached to BleRadioTransport.connectionScope
launched its last-ditch bleConnection.disconnect() on `scope` (the
per-transport scope passed in via constructor). When upper layers tear down
`scope` in response to the same uncaught exception that triggered the
handler, scope.launch can fail to start at all — defeating the cleanup and
leaking BluetoothGatt (status 133 on the next reconnect).

Introduce a dedicated cleanupScope built from `scope`'s context minus its
Job plus a fresh SupervisorJob, so the cleanup launch is independent of the
parent's cancellation state. Cancel cleanupScope in close() once our own
disconnect has completed and the safety net is no longer required.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions github-actions Bot added the bugfix PR tag label Apr 22, 2026
@jamesarich jamesarich merged commit 20e078d into main Apr 22, 2026
12 checks passed
@jamesarich jamesarich deleted the fix/ble-cleanup-followups branch April 22, 2026 21:22
Copilot AI pushed a commit that referenced this pull request Apr 24, 2026
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: garthvh <1795163+garthvh@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bugfix PR tag

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant