fix(ble): cleanup races discovered while reviewing #5207#5221
Merged
jamesarich merged 2 commits intomainfrom Apr 22, 2026
Merged
fix(ble): cleanup races discovered while reviewing #5207#5221jamesarich merged 2 commits intomainfrom
jamesarich merged 2 commits intomainfrom
Conversation
…hip install Two related races discovered while reviewing the GATT-133 reconnect crash fix (#5207): 1. KableBleConnection.connect() created a new Peripheral via Kable, then performed cleanUpPeripheral() / peripheral = p / ActiveBleConnection.active = ... in three separate cancellable steps. Kable allocates a per-peripheral coroutine scope and Bluetooth-state observer eagerly at Peripheral construction time, so cancellation between any of these steps could strand the new peripheral with no field referencing it for cleanup. Wrap the ownership install in withContext(NonCancellable) so it completes as a single unit. _deviceFlow.emit() is intentionally left outside the block to avoid hanging teardown on a slow collector. 2. KableBleConnection.disconnect() unconditionally cleared ActiveBleConnection.active. If a stale disconnect from an earlier connection attempt's exception handler ran after a newer connection had already installed itself as active, it would clobber the newer entry. Capture the peripheral we own at the start of disconnect() and only clear ActiveBleConnection if it still references that same instance. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ndler The CoroutineExceptionHandler attached to BleRadioTransport.connectionScope launched its last-ditch bleConnection.disconnect() on `scope` (the per-transport scope passed in via constructor). When upper layers tear down `scope` in response to the same uncaught exception that triggered the handler, scope.launch can fail to start at all — defeating the cleanup and leaking BluetoothGatt (status 133 on the next reconnect). Introduce a dedicated cleanupScope built from `scope`'s context minus its Job plus a fresh SupervisorJob, so the cleanup launch is independent of the parent's cancellation state. Cancel cleanupScope in close() once our own disconnect has completed and the safety net is no longer required. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Two follow-up cleanup races found while reviewing the GATT-133 reconnect fix in #5207. Both are latent today (narrow timing windows) but become more likely under aggressive reconnect / multi-device scenarios.
Changes
KableBleConnection.connect()ownership install — Kable allocates a per-peripheral coroutine scope and Bluetooth-state observer eagerly whenPeripheral(...)is constructed. Cancellation between peripheral construction, field assignment, andActiveBleConnection.active = …could strand the new peripheral with no field referencing it for cleanup. Wrap the three-step ownership install inwithContext(NonCancellable)._deviceFlow.emit()is intentionally left outside the block to avoid hanging teardown on a slow collector.KableBleConnection.disconnect()identity check —disconnect()clearedActiveBleConnection.activeunconditionally. A stale disconnect from an earlier connection attempt's exception handler running after a newer connection had installed itself as active would clobber the newer entry. Capture the owned peripheral at the start ofdisconnect()and only clearActiveBleConnectionif it still references that same instance.BleRadioTransportexception handler cleanup scope — theCoroutineExceptionHandleronconnectionScopelaunched its last-ditchbleConnection.disconnect()on the per-transportscope. When upper layers tearscopedown in response to the same uncaught exception that triggered the handler,scope.launchcan fail to start at all, defeating the cleanup and leaking BluetoothGatt (status 133 on the next reconnect). Use a dedicatedcleanupScopebuilt fromscope's context minus itsJobplus a freshSupervisorJobso the cleanup launch is independent of the parent's cancellation state. Cancelled inclose()once our own disconnect completes.Verification
./gradlew :core:network:spotlessApply :core:ble:spotlessApply :core:network:detekt :core:ble:detekt✅./gradlew :core:network:allTests :core:ble:allTests✅Follow-up to #5207.