Skip to content

Commit 3188423

Browse files
antonisclaude
andauthored
fix(android): Stop Hermes profiler on React instance teardown (#6035)
* fix(android): Stop Hermes sampling profiler on React instance teardown (#5441) HermesSamplingProfiler.enable() starts a global timer thread that fires pthread_kill(jsThread, SIGPROF) every ~10ms to sample stacks. Before this change, nothing stopped it when the ReactInstance was torn down (Metro reload, orderly Activity destroy, CodePush swap), so the sampler could call pthread_kill on a pthread that was already joined by RN's teardown sequence, resulting in SIGABRT in libhermes.so at hermes::vm::sampling_profiler::Sampler::platformSuspendVMAndWalkStack. Hermes's own Sampler::disable() synchronously joins the timer thread before returning. Calling it from the module's invalidate() hook closes the window before the JS thread is destroyed. - Override invalidate() in both new-arch and old-arch wrappers to call a cleanup hook on RNSentryModuleImpl. - Defensive HermesSamplingProfiler.disable() before enable() in startProfiling() to flush any leaked registration from a prior run. - Track isProfiling as volatile and snatch the AndroidProfiler reference locally in invalidate() to avoid racing with a concurrent stopProfiling on the JS thread. - Delete the android profiler trace file in invalidate() to avoid leaking it in cacheDir when the profile is being discarded. Upstream race: facebook/hermes#1853. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Fix changelog * fix(android): Address review: atomic state + startProfiling unwind + stopProfiling TOCTOU Apply review feedback from Sentry Warden and Cursor Bugbot: - Replace volatile fields with AtomicBoolean/AtomicReference so PMD's AvoidUsingVolatile rule passes and to make the cross-thread gating explicit. startProfiling/stopProfiling run on the JS thread; invalidate runs on the NativeModules teardown thread. - startProfiling now sets isProfiling=true immediately after HermesSamplingProfiler.enable() succeeds, before androidProfiler.start(). The catch block unwinds via getAndSet(false) + disable() so we can't leave the Hermes sampler running while the flag reads false. - stopProfiling snapshots androidProfiler.get() into a local, closing the TOCTOU where a concurrent invalidate() could null the field between the null-check and endAndCollect(). - invalidate gates on isProfiling.getAndSet(false) and atomically snatches the profiler reference with androidProfiler.getAndSet(null). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(android): Isolate Hermes disable from Android profiler cleanup in invalidate Address two new review findings: - High severity (Cursor Bugbot): the previous invalidate() shared a single try/catch with endAndCollect(). If endAndCollect threw, the HermesSamplingProfiler.disable() call on the line below was skipped, leaving the sampler's timer thread alive — defeating the entire purpose of invalidate(). Reorder so the crash-critical Hermes disable runs first in its own try/catch; the Android platform profiler cleanup follows in an independent try/catch. - Medium severity (Cursor Bugbot / reviewer): stopProfiling could race with invalidate on the same AndroidProfiler instance (both calling endAndCollect concurrently). Gate stopProfiling on isProfiling.compareAndSet(true, false) and take ownership of the profiler reference via androidProfiler.getAndSet(null), mirroring invalidate's pattern. Also add two small idempotency tests that verify invalidate() is a no-op on a fresh module without crossing the Hermes JNI boundary. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Update changelog --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 4b4114f commit 3188423

5 files changed

Lines changed: 145 additions & 8 deletions

File tree

CHANGELOG.md

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -6,6 +6,12 @@
66
> make sure you follow our [migration guide](https://docs.sentry.io/platforms/react-native/migration/) first.
77
<!-- prettier-ignore-end -->
88
9+
## Unreleased
10+
11+
### Fixes
12+
13+
- Stop the Hermes sampling profiler on React instance teardown to prevent `pthread_kill` SIGABRT when the JS thread is torn down with profiling active ([#6035](https://github.com/getsentry/sentry-react-native/pull/6035))
14+
915
## 8.9.1
1016

1117
### Features
Lines changed: 33 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,33 @@
1+
package io.sentry.react
2+
3+
import org.junit.Test
4+
import org.junit.runner.RunWith
5+
import org.robolectric.RobolectricTestRunner
6+
7+
/**
8+
* Tests for the profiling lifecycle on the RNSentryModuleImpl.
9+
*
10+
* Note: the Hermes sampling profiler is a JNI-backed static API and cannot be loaded in a pure
11+
* JVM unit-test environment. These tests exercise only the paths that do not cross the JNI
12+
* boundary — specifically, the invalidate() guard that returns early when profiling is not
13+
* active. If the guard were to regress and call into Hermes, the tests would fail with
14+
* UnsatisfiedLinkError.
15+
*/
16+
@RunWith(RobolectricTestRunner::class)
17+
class RNSentryProfilingTest {
18+
@Test
19+
fun `invalidate is a no-op on a fresh module`() {
20+
val module = Utils.createRNSentryModuleWithMockedContext()
21+
// isProfiling starts false; invalidate() should return early without touching
22+
// HermesSamplingProfiler. If it didn't, the JNI call would throw UnsatisfiedLinkError.
23+
module.invalidate()
24+
}
25+
26+
@Test
27+
fun `invalidate is idempotent`() {
28+
val module = Utils.createRNSentryModuleWithMockedContext()
29+
module.invalidate()
30+
module.invalidate()
31+
module.invalidate()
32+
}
33+
}

packages/core/android/src/main/java/io/sentry/react/RNSentryModuleImpl.java

Lines changed: 94 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -77,6 +77,8 @@
7777
import java.util.Map;
7878
import java.util.Properties;
7979
import java.util.concurrent.CountDownLatch;
80+
import java.util.concurrent.atomic.AtomicBoolean;
81+
import java.util.concurrent.atomic.AtomicReference;
8082
import java.util.regex.Pattern;
8183
import org.jetbrains.annotations.NotNull;
8284
import org.jetbrains.annotations.Nullable;
@@ -114,7 +116,15 @@ public class RNSentryModuleImpl {
114116
*/
115117
private int profilingTracesHz = 101;
116118

117-
private AndroidProfiler androidProfiler = null;
119+
// Atomic because invalidate() (NativeModules/teardown thread) can null the reference
120+
// concurrently with stopProfiling() (JS thread). getAndSet gives us an atomic snapshot-and-clear.
121+
private final AtomicReference<AndroidProfiler> androidProfiler = new AtomicReference<>();
122+
123+
// Atomic so that invalidate() can claim cleanup responsibility via compareAndSet/getAndSet.
124+
// startProfiling/stopProfiling run on the JS thread (@ReactMethod isBlockingSynchronousMethod);
125+
// invalidate runs on the NativeModules/teardown thread. Without atomicity, invalidate could
126+
// observe a stale false and skip disabling a running Hermes sampler.
127+
private final AtomicBoolean isProfiling = new AtomicBoolean(false);
118128

119129
private boolean isProguardDebugMetaLoaded = false;
120130
private @Nullable String proguardUuid = null;
@@ -773,29 +783,45 @@ private void initializeAndroidProfiler() {
773783
}
774784
final String tracesFilesDirPath = getProfilingTracesDirPath();
775785

776-
androidProfiler =
786+
androidProfiler.set(
777787
new AndroidProfiler(
778788
tracesFilesDirPath,
779789
(int) SECONDS.toMicros(1) / profilingTracesHz,
780790
new SentryFrameMetricsCollector(reactApplicationContext, logger, buildInfo),
781791
() -> executorService,
782-
logger);
792+
logger));
783793
}
784794

785795
public WritableMap startProfiling(boolean platformProfilers) {
786796
final WritableMap result = new WritableNativeMap();
787-
if (androidProfiler == null && platformProfilers) {
797+
if (androidProfiler.get() == null && platformProfilers) {
788798
initializeAndroidProfiler();
789799
}
790800

791801
try {
802+
// Defensive reset: the Hermes sampling profiler is global with no state
803+
// introspection, so flush any leaked registration from a prior run before
804+
// enabling. See https://github.com/facebook/hermes/issues/1853.
805+
HermesSamplingProfiler.disable();
792806
HermesSamplingProfiler.enable();
793-
if (androidProfiler != null) {
794-
androidProfiler.start();
807+
// Mark profiling active immediately after enable() succeeds so a concurrent
808+
// invalidate() observes the running state even if androidProfiler.start() below throws.
809+
isProfiling.set(true);
810+
final AndroidProfiler profiler = androidProfiler.get();
811+
if (profiler != null) {
812+
profiler.start();
795813
}
796814

797815
result.putBoolean("started", true);
798816
} catch (Throwable e) { // NOPMD - We don't want to crash in any case
817+
// Unwind Hermes if we enabled it but failed before returning. getAndSet(false) makes
818+
// this idempotent with a concurrent invalidate()/stopProfiling().
819+
if (isProfiling.getAndSet(false)) {
820+
try {
821+
HermesSamplingProfiler.disable();
822+
} catch (Throwable ignored) { // NOPMD - Best-effort unwind.
823+
}
824+
}
799825
result.putBoolean("started", false);
800826
result.putString("error", e.toString());
801827
}
@@ -805,11 +831,20 @@ public WritableMap startProfiling(boolean platformProfilers) {
805831
public WritableMap stopProfiling() {
806832
final boolean isDebug = ScopesAdapter.getInstance().getOptions().isDebug();
807833
final WritableMap result = new WritableNativeMap();
834+
// Claim cleanup ownership atomically. If invalidate() got here first, it already
835+
// disabled Hermes and collected the Android profiler; there's nothing to return.
836+
if (!isProfiling.compareAndSet(true, false)) {
837+
result.putString("error", "Profiling not active");
838+
return result;
839+
}
840+
// Take sole ownership of the AndroidProfiler reference so a concurrent invalidate()
841+
// cannot also call endAndCollect() on the same instance.
842+
final AndroidProfiler profiler = androidProfiler.getAndSet(null);
808843
File output = null;
809844
try {
810845
AndroidProfiler.ProfileEndData end = null;
811-
if (androidProfiler != null) {
812-
end = androidProfiler.endAndCollect(false, null);
846+
if (profiler != null) {
847+
end = profiler.endAndCollect(false, null);
813848
}
814849
HermesSamplingProfiler.disable();
815850

@@ -852,6 +887,57 @@ public WritableMap stopProfiling() {
852887
return result;
853888
}
854889

890+
/**
891+
* Cleanup hook invoked from the module wrappers when the React instance is destroyed (e.g. Metro
892+
* reload, orderly Activity teardown). Calls HermesSamplingProfiler.disable(), which synchronously
893+
* joins the sampler's timer thread, so no pthread_kill can fire after the JS thread is torn down.
894+
*
895+
* <p>Correctness depends on RN invoking module invalidate() before joining the JS message queue
896+
* thread. That is the documented contract for both old-arch CatalystInstance shutdown and
897+
* new-arch TurboModule teardown.
898+
*
899+
* <p>See https://github.com/getsentry/sentry-react-native/issues/5441 and
900+
* https://github.com/facebook/hermes/issues/1853.
901+
*/
902+
public void invalidate() {
903+
// Atomic gate: only one caller (invalidate vs stopProfiling vs a re-entrant invalidate)
904+
// wins the right to clean up; the rest no-op.
905+
if (!isProfiling.getAndSet(false)) {
906+
return;
907+
}
908+
// getAndSet(null) atomically snatches the reference so a concurrent stopProfiling()
909+
// sees null and skips its own endAndCollect() on the same instance.
910+
final AndroidProfiler profiler = androidProfiler.getAndSet(null);
911+
912+
// Crash-critical: disable the Hermes sampler first. disable() synchronously joins the
913+
// timer thread, so once it returns no pthread_kill can fire after the JS thread is
914+
// joined. Isolated in its own try/catch so a failure in the Android profiler cleanup
915+
// below cannot skip this call.
916+
try {
917+
HermesSamplingProfiler.disable();
918+
logger.log(SentryLevel.INFO, "Stopped Hermes sampling profiler on React instance destroy.");
919+
} catch (Throwable e) { // NOPMD - Guards the JNI boundary; does not catch native SIGABRT.
920+
logger.log(SentryLevel.WARNING, "Failed to stop Hermes sampling profiler on teardown: " + e);
921+
}
922+
923+
// Android platform profiler is independent of Hermes; a failure here must not leak
924+
// back into the Hermes shutdown above. The profile is being discarded; delete the
925+
// trace file to avoid leaking it in cacheDir.
926+
if (profiler != null) {
927+
try {
928+
final AndroidProfiler.ProfileEndData end = profiler.endAndCollect(false, null);
929+
if (end != null && end.traceFile != null) {
930+
try {
931+
end.traceFile.delete();
932+
} catch (Throwable ignored) { // NOPMD - File cleanup is best-effort.
933+
}
934+
}
935+
} catch (Throwable e) { // NOPMD - Android profiler cleanup is best-effort on teardown.
936+
logger.log(SentryLevel.WARNING, "AndroidProfiler cleanup failed during invalidate: " + e);
937+
}
938+
}
939+
}
940+
855941
private @Nullable String getProguardUuid() {
856942
if (isProguardDebugMetaLoaded) {
857943
return proguardUuid;

packages/core/android/src/newarch/java/io/sentry/react/RNSentryModule.java

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -233,4 +233,10 @@ public void enableShakeDetection() {
233233
public void disableShakeDetection() {
234234
this.impl.disableShakeDetection();
235235
}
236+
237+
@Override
238+
public void invalidate() {
239+
this.impl.invalidate();
240+
super.invalidate();
241+
}
236242
}

packages/core/android/src/oldarch/java/io/sentry/react/RNSentryModule.java

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -233,4 +233,10 @@ public void enableShakeDetection() {
233233
public void disableShakeDetection() {
234234
this.impl.disableShakeDetection();
235235
}
236+
237+
@Override
238+
public void invalidate() {
239+
this.impl.invalidate();
240+
super.invalidate();
241+
}
236242
}

0 commit comments

Comments
 (0)