
Commit 4b0aef9

bobbyg603 and claude authored
feat: fatal main-thread hang detection (#56)
* feat: add fatal main-thread hang detection

Add opt-in detection and reporting of fatal main-thread hangs on iOS, tvOS, and macOS.

A new BugSplatHangTracker class runs a low-QoS watchdog thread that dispatches a "ping" block to the main queue every threshold/5 seconds (clamped to at least 100ms). The block, when serviced, resets an atomic counter; if the counter accumulates enough unanswered pings to cover `hangDetectionThreshold` seconds, the main thread is considered hung. When the main thread later services a ping, a recovery callback fires so the persisted report can be discarded - only fatal hangs survive to the next launch.

BugSplat persists the report as a synthetic PLCrashReport-style live report (marking the main thread as crashed via its captured Mach port) in the crashes directory with bugsplat-hang-* attributes and a per-launch UUID for correlation with any crash from the same launch. Reuses the existing next-launch scanner and upload pipeline.

Public API on BugSplat:

  * `enableHangDetection` (BOOL, default NO) - opt in before -start
  * `hangDetectionThreshold` (NSTimeInterval, default 2.0) - tune the unresponsive duration that counts as a hang; clamped to >= 0.1s

Detection is suppressed when a debugger is attached or the app is inactive (iOS/tvOS background). No-op inside app extensions. -start must be invoked on the main thread when hang detection is enabled so the main thread's Mach port can be captured.

Tests cover the watchdog state machine (detection, throttle, recovery, suspension guard) deterministically via injectable clock and dispatch blocks, and the BugSplat-side persistence (file layout, metadata, recovery cleanup).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(samples): demonstrate hang detection in example apps

Wire enableHangDetection into each example app and add a "Simulate Hang" button that blocks the main thread forever (sleep loop with observable side effects so the C++ forward-progress rule can't elide it). On the next launch, a fatal-hang report uploads automatically.

Apps updated: SwiftUI, UIKit-Swift, UIKit-ObjC, macOS-UIKit-ObjC, and the macOS C++ command-line tool. The CLI tool's loop alternates between mainObjCRunLoop pumping and getline; the recovery callback discards any hang report queued during idle prompt waits.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(hang-detector): address Copilot review feedback

  * Gate ping dispatch on an outstanding-ping flag so a long hang doesn't accumulate an unbounded backlog of ping blocks on the main queue (they would all flush at once when main resumed).
  * Use millisecond precision for hang report filenames so a same-second hang -> recover -> hang cycle can't overwrite a previous report.
  * Drop unused imports (pthread, mach, UIKit) left over from the CFRunLoopObserver design and add an explicit <math.h> for ceil().
  * Sample apps' "Simulate Hang" buttons sleep inside their loops instead of busy-spinning at 100% CPU - the main thread is still blocked, which is what the demo needs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
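The ping-and-counter scheme described in the commit message, including the outstanding-ping gate from the review fix, can be sketched as a small state machine. What follows is a minimal single-threaded illustration in plain C (which also compiles inside an Objective-C file), not the code in BugSplatHangTracker. The names `hang_watchdog`, `wd_init`, `wd_tick`, and `wd_ack` are hypothetical, and a real tracker would need atomics because the tick runs on the watchdog thread while the acknowledgement runs on the main thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sketch; names and layout are assumptions, not BugSplat's code.
 * Single-threaded on purpose: real code would use atomics for shared fields. */
typedef struct {
    uint32_t pings_needed;   /* unanswered ticks that cover the hang threshold */
    uint32_t unanswered;     /* ticks elapsed while a ping is still pending */
    bool ping_outstanding;   /* a ping block is queued on main, not yet serviced */
    bool hang_reported;      /* set at detection, cleared on recovery */
} hang_watchdog;

typedef enum { WD_NONE, WD_HANG_DETECTED } wd_event;

static hang_watchdog wd_init(uint32_t pings_needed) {
    assert(pings_needed > 0);
    hang_watchdog wd = { pings_needed, 0, false, false };
    return wd;
}

/* One watchdog tick. Sets *dispatch_ping when the caller should queue a new
 * ping on the main queue; at most one ping is ever outstanding. */
static wd_event wd_tick(hang_watchdog *wd, bool *dispatch_ping) {
    *dispatch_ping = !wd->ping_outstanding;
    if (!wd->ping_outstanding) {
        wd->ping_outstanding = true;
        return WD_NONE;
    }
    wd->unanswered++;
    if (wd->unanswered >= wd->pings_needed && !wd->hang_reported) {
        wd->hang_reported = true;   /* report once per hang */
        return WD_HANG_DETECTED;
    }
    return WD_NONE;
}

/* Main thread serviced the ping. Returns true when this ends a reported hang,
 * i.e. the persisted report should be discarded. */
static bool wd_ack(hang_watchdog *wd) {
    bool recovered = wd->hang_reported;
    wd->ping_outstanding = false;
    wd->unanswered = 0;
    wd->hang_reported = false;
    return recovered;
}
```

With `pings_needed` set to 5, the fifth tick after an unserviced ping was dispatched reports the hang, later ticks stay silent, and a `wd_ack` that returns true signals that the persisted report can be discarded.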
1 parent bdc863b commit 4b0aef9

18 files changed

Lines changed: 1583 additions & 133 deletions


BugSplat+Testing.h

Lines changed: 19 additions & 1 deletion
@@ -11,18 +11,20 @@

 #import "BugSplatTestSupport.h"
 #import "BugSplatUploadService.h"
+#import "BugSplatHangTracker.h"

 NS_ASSUME_NONNULL_BEGIN

 /**
  * Class extension to expose private methods for testing.
  * These methods are implemented in the main BugSplat class.
  */
-@interface BugSplat ()
+@interface BugSplat () <BugSplatHangTrackerDelegate>

 - (BOOL)shouldSendCrashSilently:(NSDictionary *)metadata;
 - (NSString *)resolvedApplicationName;
 - (NSString *)resolvedApplicationVersion;
+- (nullable NSString *)crashesDirectoryPath;

 @end

@@ -83,6 +85,22 @@ NS_ASSUME_NONNULL_BEGIN
  */
 - (void)setDebuggerAttachedOverride:(nullable NSNumber *)value;

+#pragma mark - Hang Detection Testing
+
+/**
+ * Create the internal plumbing (serial queue, launch id) that `-start` would
+ * normally set up for hang detection. Lets tests exercise the hang delegate
+ * methods without starting the real tracker or enabling the crash reporter.
+ */
+- (void)setupHangInfrastructureForTesting;
+
+/// Serial queue that hang delegate callbacks dispatch onto; `dispatch_sync` on
+/// this queue to wait for pending hang work to drain.
+- (nullable dispatch_queue_t)hangQueueForTesting;
+
+/// The basename of the hang report most recently persisted by the hang delegate.
+- (nullable NSString *)currentHangFilename;
+
 @end

 NS_ASSUME_NONNULL_END

BugSplat.h

Lines changed: 43 additions & 0 deletions
@@ -165,6 +165,49 @@ NS_ASSUME_NONNULL_BEGIN
  */
 @property (nonatomic, assign) BOOL autoSubmitCrashReport;

+/**
+ * Enable detection and reporting of fatal main-thread hangs.
+ *
+ * When set to YES before `-start` is invoked, BugSplat monitors the main runloop for
+ * prolonged unresponsive periods while the app is active in the foreground. If a hang
+ * is detected and the app is subsequently terminated without the main thread recovering
+ * (launch/resume watchdog kills, user force-quit), a hang report is uploaded on the next
+ * launch using the same pipeline as crash reports.
+ *
+ * If the main thread resumes after a hang is detected, the persisted report is discarded -
+ * non-fatal hangs are not reported in this version.
+ *
+ * Hang reports carry the exception name `App Hang (Fatal)` and include attributes prefixed
+ * with `bugsplat-hang-` (duration, detection time, app state, launch id) that can be used
+ * to correlate with crashes from the same launch.
+ *
+ * Detection is suppressed when a debugger is attached or the app is not active. As a
+ * consequence, hangs that begin while the app is in the background (including those
+ * terminated by background-task expiration) are not reported.
+ *
+ * This property is a no-op inside app extensions.
+ *
+ * When this property is YES, `-start` must be invoked on the main thread - the
+ * main thread's Mach port is captured there so the hang report identifies the
+ * correct thread. Debug builds assert; Release builds will silently capture the
+ * wrong thread.
+ *
+ * Default: NO
+ */
+@property (nonatomic, assign) BOOL enableHangDetection;
+
+/**
+ * Threshold in seconds for declaring the main thread hung when `enableHangDetection` is YES.
+ *
+ * Must be set before `-start` is invoked. Values below 0.1 are clamped to 0.1 by the
+ * underlying tracker. Typical production values are 1.0-5.0 seconds; choose a value above
+ * any work the app may legitimately do on the main thread (image decoding, JSON parsing,
+ * etc.) to avoid false positives.
+ *
+ * Default: 2.0
+ */
+@property (nonatomic, assign) NSTimeInterval hangDetectionThreshold;
+
 /**
  * Add an attribute and value to a dictionary of attributes that will potentially be included in a crash report.
  * If the attribute is an invalid XML entity name, or the attribute+value pair cannot be set,
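
The two documented clamps (a threshold of at least 0.1 s in the header comment, and a ping interval of threshold/5 but at least 100 ms in the commit message) together determine how many unanswered pings amount to a hang. A minimal sketch of that arithmetic in plain C, using integer milliseconds, might look like the following. `hang_schedule` and `hang_schedule_for` are hypothetical names, and the rounding and the upper bound are assumptions of this sketch rather than the tracker's actual behavior.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper; mirrors the documented clamps, not BugSplat's source. */
typedef struct {
    uint32_t threshold_ms;  /* hang threshold after clamping */
    uint32_t interval_ms;   /* time between pings */
    uint32_t pings_needed;  /* unanswered pings that cover the threshold */
} hang_schedule;

static hang_schedule hang_schedule_for(double threshold_seconds) {
    /* Clamp to >= 0.1 s; the negated comparison also catches NaN. */
    if (!(threshold_seconds >= 0.1)) threshold_seconds = 0.1;
    /* Sketch-only upper bound so the millisecond value fits in uint32_t. */
    if (threshold_seconds > 3600.0) threshold_seconds = 3600.0;

    hang_schedule s;
    s.threshold_ms = (uint32_t)(threshold_seconds * 1000.0 + 0.5);
    s.interval_ms = s.threshold_ms / 5;            /* threshold / 5 ... */
    if (s.interval_ms < 100) s.interval_ms = 100;  /* ... but at least 100 ms */
    /* Ceiling division: enough intervals to cover the whole threshold. */
    s.pings_needed = (s.threshold_ms + s.interval_ms - 1) / s.interval_ms;
    assert(s.pings_needed >= 1);
    return s;
}
```

Under these assumptions the default threshold of 2.0 yields a 400 ms interval and 5 pings, while a threshold of 0.01 is clamped to 100 ms, which is a single interval and a single ping.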
