Skip to content

fix: address top Crashlytics crashes and non-fatals for build 29320984#5684

Merged
jamesarich merged 3 commits into
mainfrom
jamesarich/fix-maps-clustering-crashes
May 31, 2026
Merged

fix: address top Crashlytics crashes and non-fatals for build 29320984#5684
jamesarich merged 3 commits into
mainfrom
jamesarich/fix-maps-clustering-crashes

Conversation

@jamesarich

Copy link
Copy Markdown
Collaborator

Motivation

Build 29320984 (beta/alpha) shows two categories of Crashlytics issues:

  • Fatal crashes (1900+ events/week): Maps clustering ComposeView lifecycle failures
  • Non-fatal noise (26K+ events/week): Expected BLE disconnections recorded as exceptions

Both degrade signal-to-noise in Crashlytics, making real issues harder to spot.

Approach

Maps clustering crashes (2 issues, ~1900 events, ~1000 users)

The android-maps-compose library creates internal ComposeView instances to render markers as bitmaps. These views require a ViewTreeLifecycleOwner but can race with lifecycle transitions when the cluster renderer's async Handler fires after the activity stops.

Fixes:

  • NodeClusterMarkers.kt -- Added lifecycle state guard (isAtLeast(STARTED)) to prevent the Clustering composable from rendering when the lifecycle is inactive, avoiding the async race
  • InlineMap.kt -- Added the same ViewTreeLifecycleOwner/SavedStateRegistryOwner propagation workaround that the main map already had, plus defaultMinSize on marker content to prevent zero-size measurement

BLE non-fatal noise (1 issue, 26K events, 1494 users)

Kable (BLE library) logs expected operational events like failed connections and disconnect requests at error level. Our KermitLogEngine bridge routes these to Logger.e(), and the CrashlyticsLogWriter records any Error+throwable as a non-fatal via recordException().

Fix:

  • KermitLogEngine.kt -- Downgraded error() and assert() to Logger.w(). Kable "errors" are normal BLE lifecycle events (connection timeouts, already-disconnected), not app bugs. They still appear in the Crashlytics log buffer for debugging but no longer pollute the non-fatals dashboard.

Non-obvious decisions

  • The lifecycle guard in NodeClusterMarkers uses an early return rather than wrapping Clustering in a conditional block -- this ensures no cluster state is retained across lifecycle transitions that could trigger stale renders.
  • Downgrading ALL Kable errors to warn is intentional -- Kable's error level maps to "this BLE operation failed" which is an expected condition in radio environments, not "the app has a bug."

References: googlemaps/android-maps-compose#858, googlemaps/android-maps-compose#875

jamesarich and others added 2 commits May 31, 2026 08:23
Address two Crashlytics crashes affecting build 29320984:

1. ComposeUiClusterRenderer.renderViewToBitmapDescriptor (1810 events, 970 users)
   - IllegalStateException: ViewTreeLifecycleOwner not propagated
   - Added lifecycle state guard to skip rendering when lifecycle < STARTED,
     preventing the async MarkerModifier Handler from racing with lifecycle stops

2. renderComposableToBitmapDescriptor in InlineMap (92 events, 43 users)
   - IllegalStateException: ComposeView measured to zero width/height
   - Added ViewTreeLifecycleOwner propagation workaround (same pattern as
     NodeClusterMarkers) so the internal ComposeView can find the lifecycle
   - Added explicit defaultMinSize on marker content as safety net

Both crashes stem from android-maps-compose's internal ComposeView bitmap
rendering pipeline, which requires an active lifecycle owner that may not
be available during async marker rendering or on detached views.

Fixes: googlemaps/android-maps-compose#858, googlemaps/android-maps-compose#875

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…cording

Kable logs expected BLE operational events (connection failures, disconnect
requests) at error level. Our KermitLogEngine bridges these to Kermit, and
the CrashlyticsLogWriter records any Error-level log with a throwable as a
non-fatal exception via recordException().

This caused 26K+ spurious non-fatal events per week (NotConnectedException,
connection timeouts) that are normal BLE lifecycle events, not app bugs.

Fix: downgrade KermitLogEngine.error() and .assert() to Logger.w() so Kable
internal logs still appear in the Crashlytics log buffer for debugging but
don't trigger recordException().

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions github-actions Bot added the bugfix PR tag label May 31, 2026
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@codecov

codecov Bot commented May 31, 2026

Copy link
Copy Markdown

❌ 1 Tests Failed:

Tests completed Failed Passed Skipped
2528 1 2527 0
View the top 1 failed test(s) by shortest run time
org.meshtastic.core.model.NodeTest::isOnline_usesStrictThresholdBoundary
Stack Traces | 0.129s run time
java.lang.AssertionError: Expected value to be true.
	at org.junit.Assert.fail(Assert.java:89)
	at kotlin.test.junit.JUnitAsserter.fail(JUnitSupport.kt:56)
	at kotlin.test.Asserter.assertTrue(Assertions.kt:766)
	at kotlin.test.junit.JUnitAsserter.assertTrue(JUnitSupport.kt:30)
	at kotlin.test.Asserter.assertTrue(Assertions.kt:776)
	at kotlin.test.junit.JUnitAsserter.assertTrue(JUnitSupport.kt:30)
	at kotlin.test.AssertionsKt__AssertionsKt.assertTrue(Assertions.kt:44)
	at kotlin.test.AssertionsKt.assertTrue(Unknown Source)
	at kotlin.test.AssertionsKt__AssertionsKt.assertTrue$default(Assertions.kt:42)
	at kotlin.test.AssertionsKt.assertTrue$default(Unknown Source)
	at org.meshtastic.core.model.NodeTest.isOnline_usesStrictThresholdBoundary(NodeTest.kt:37)
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
	at java.base/java.lang.reflect.Method.invoke(Method.java:580)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
	at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
	at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
	at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
	at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
	at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
	at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
	at org.gradle.api.internal.tasks.testing.junit.JUnitTestExecutor.runRequest(JUnitTestExecutor.java:175)
	at org.gradle.api.internal.tasks.testing.junit.JUnitTestExecutor.accept(JUnitTestExecutor.java:84)
	at org.gradle.api.internal.tasks.testing.junit.JUnitTestExecutor.accept(JUnitTestExecutor.java:47)
	at org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestDefinitionProcessor.processTestDefinition(AbstractJUnitTestDefinitionProcessor.java:65)
	at org.gradle.api.internal.tasks.testing.SuiteTestDefinitionProcessor.processTestDefinition(SuiteTestDefinitionProcessor.java:53)
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
	at java.base/java.lang.reflect.Method.invoke(Method.java:580)
	at org.gradle.internal.dispatch.MethodInvocation.invokeOn(MethodInvocation.java:77)
	at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:28)
	at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:19)
	at org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:33)
	at org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:88)
	at jdk.proxy1/jdk.proxy1.$Proxy4.processTestDefinition(Unknown Source)
	at org.gradle.api.internal.tasks.testing.worker.TestWorker$2.run(TestWorker.java:178)
	at org.gradle.api.internal.tasks.testing.worker.TestWorker.executeAndMaintainThreadName(TestWorker.java:126)
	at org.gradle.api.internal.tasks.testing.worker.TestWorker.execute(TestWorker.java:103)
	at org.gradle.api.internal.tasks.testing.worker.TestWorker.execute(TestWorker.java:63)
	at org.gradle.process.internal.worker.child.ActionExecutionWorker.execute(ActionExecutionWorker.java:56)
	at org.gradle.process.internal.worker.child.SystemApplicationClassLoaderWorker.call(SystemApplicationClassLoaderWorker.java:122)
	at org.gradle.process.internal.worker.child.SystemApplicationClassLoaderWorker.call(SystemApplicationClassLoaderWorker.java:72)
	at worker.org.gradle.process.internal.worker.GradleWorkerMain.run(GradleWorkerMain.java:69)
	at worker.org.gradle.process.internal.worker.GradleWorkerMain.main(GradleWorkerMain.java:74)

To view more test analytics, go to the Test Analytics Dashboard
📋 Got 3 mins? Take this short survey to help us improve Test Analytics.

@jamesarich jamesarich added this pull request to the merge queue May 31, 2026
Merged via the queue into main with commit cca7c27 May 31, 2026
18 checks passed
@jamesarich jamesarich deleted the jamesarich/fix-maps-clustering-crashes branch May 31, 2026 14:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bugfix PR tag

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant