Prevent single-sample ANR profile chunks from being silently dropped #5511

@linear-code

Description

Customers occasionally see ANR events with no attached profile despite having sufficient quota and ANR profiling enabled. This happens because Relay drops any profile chunk with fewer than 2 samples, and a chunk can contain exactly 1 sample when only the very first poll after the 1-second suspicion threshold manages to capture the main thread.

The drop is effectively silent: it is recorded in client reports and outcomes, but nothing in the issue UI indicates that a profile was discarded.

Root cause

Relay enforces a "≥2 samples" minimum on profile_chunk items — a reasonable guard for continuous profiling, where a 1-point series has no meaningful time axis. For ANR profiles, a single sample is highly actionable: it shows exactly where the main thread was stuck when the ANR began.

The single-sample case occurs at the IDLE→SUSPICIOUS edge: the first getStackTrace() call on the suspicious thread runs on the first 66 ms tick, and the next call does not run until the second tick. If the process is killed before that second tick, exactly 1 sample exists.
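
The timing can be checked with a little arithmetic. The sketch below is an illustration only and is not taken from sentry-java: the class and method names are hypothetical, and it assumes samples are taken at exactly the first tick and every polling interval after it, with no scheduling jitter.

```java
// Hypothetical helper for illustration only; not part of sentry-java or Relay.
final class AnrSampleCount {
    // Polling interval quoted in this issue.
    static final long POLLING_INTERVAL_MS = 66;
    // Minimum sample count that Relay enforces on profile_chunk items, per this issue.
    static final long RELAY_MIN_SAMPLES = 2;

    /**
     * Number of samples captured if the process dies msAfterFirstSample
     * milliseconds after the first capture. Assumes one sample at the first
     * tick and one more on every full interval that elapses afterwards.
     */
    static long samplesCaptured(long msAfterFirstSample, long intervalMs) {
        if (msAfterFirstSample < 0) {
            return 0; // killed before the first tick: nothing was captured
        }
        return 1 + msAfterFirstSample / intervalMs; // integer division counts full intervals
    }

    /** True if a chunk with this many samples would clear the minimum. */
    static boolean clearsRelayMinimum(long samples) {
        return samples >= RELAY_MIN_SAMPLES;
    }
}
```

Under these assumptions, a process killed 65 ms after the first capture yields 1 sample and the chunk is dropped, while one killed at 66 ms or later yields at least 2 and is kept.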

Proposed fixes

Option 1 — Synthesize a second sample (SDK, ships independently):
In StackTraceConverter, when AnrProfile.stacks.size() == 1, emit two SentrySample entries with the same stack_id and timestamps spaced by POLLING_INTERVAL_MS / 2 (33 ms). The flamegraph renders the single stack at 100%; Relay's threshold is cleared. Two lines of change, no server dependency.
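
A minimal sketch of the synthesis step, under stated assumptions: the `Sample` class below is a hypothetical stand-in for SentrySample, the input is reduced to a list of capture timestamps, and each captured stack is given its own stack id for simplicity. The real StackTraceConverter and AnrProfile types in sentry-java will look different.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of Option 1; not the actual sentry-java code.
final class SingleSampleSynthesis {
    static final long POLLING_INTERVAL_MS = 66;

    /** Stand-in for SentrySample: a stack reference plus a timestamp. */
    static final class Sample {
        final int stackId;
        final long timestampMs;

        Sample(int stackId, long timestampMs) {
            this.stackId = stackId;
            this.timestampMs = timestampMs;
        }
    }

    /**
     * Converts capture timestamps to samples. When exactly one stack was
     * captured, a second sample with the same stack id is appended half a
     * polling interval later so the chunk carries two samples.
     */
    static List<Sample> toSamples(List<Long> captureTimesMs) {
        List<Sample> samples = new ArrayList<>();
        for (int i = 0; i < captureTimesMs.size(); i++) {
            samples.add(new Sample(i, captureTimesMs.get(i)));
        }
        if (samples.size() == 1) {
            Sample only = samples.get(0);
            // Same stack, later timestamp: the flamegraph still shows one stack at 100%.
            samples.add(new Sample(only.stackId, only.timestampMs + POLLING_INTERVAL_MS / 2));
        }
        return samples;
    }
}
```

Chunks that already hold two or more samples pass through unchanged, so the synthesis only affects the edge case described above.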

Option 2 — Halve the polling interval on SUSPICIOUS (SDK):
Reduce POLLING_INTERVAL_MS from 66 ms to 33 ms at the IDLE→SUSPICIOUS transition. This doubles sample density during the suspicious window and makes the single-sample case statistically rare, at the cost of slightly more wakeup overhead during that window. The state machine already commits its wakeup budget once it rolls into SUSPICIOUS, so the marginal cost is bounded.
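
One way this could look, as a sketch only: the `State` enum, the `SUSPICIOUS_POLLING_INTERVAL_MS` constant, and the `intervalFor` method are hypothetical names, and the real state machine may have more states than the two named in this issue.

```java
// Hypothetical illustration of Option 2; not the actual sentry-java state machine.
final class AnrPollingPolicy {
    /** Only the two states named in this issue; the real machine may have more. */
    enum State { IDLE, SUSPICIOUS }

    static final long POLLING_INTERVAL_MS = 66;
    // Assumed name for the halved interval used while the thread is under suspicion.
    static final long SUSPICIOUS_POLLING_INTERVAL_MS = POLLING_INTERVAL_MS / 2;

    /** Delay before the next poll, chosen from the current state. */
    static long intervalFor(State state) {
        return state == State.SUSPICIOUS
                ? SUSPICIOUS_POLLING_INTERVAL_MS
                : POLLING_INTERVAL_MS;
    }
}
```

With a 33 ms interval in SUSPICIOUS, the window in which the process can die with only one sample shrinks from 66 ms to 33 ms. The single-sample case becomes less likely but is not eliminated, which is why this option complements Option 1 rather than replacing it.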

Option 3 — Relay policy exemption (server-side, follow-up):
Once the profile_kind: "anr" schema field from the billing issue's longer-term work is in place, Relay can lower the minimum-sample threshold to 1 for ANR chunks. This is the semantically correct fix.
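
The intended policy can be sketched as a small lookup. Relay is written in Rust, so the Java below only illustrates the rule for consistency with the SDK-side sketches. The class and method names are hypothetical, and the sketch assumes profile_kind arrives as an optional string whose ANR value is "anr".

```java
// Hypothetical illustration of the Option 3 policy; not Relay code.
final class MinSamplePolicy {
    static final int DEFAULT_MIN_SAMPLES = 2; // current "≥2 samples" guard
    static final int ANR_MIN_SAMPLES = 1;     // proposed exemption for ANR chunks

    /** Minimum sample count for a chunk; profileKind may be null when the field is absent. */
    static int minSamples(String profileKind) {
        return "anr".equals(profileKind) ? ANR_MIN_SAMPLES : DEFAULT_MIN_SAMPLES;
    }

    /** True if a chunk with this kind and sample count would be accepted. */
    static boolean accepts(String profileKind, int sampleCount) {
        return sampleCount >= minSamples(profileKind);
    }
}
```

Chunks without the field keep the existing threshold, so continuous profiling behavior is unchanged.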

Option 1 (synthesize) is the right tactical move for the near term: it ships from sentry-java alone, is reversible, and does not block on any server changes.

Difficulty

Low (Options 1 & 2 — SDK only); Medium (Option 3 — Relay)

Repos affected

sentry-java (tactical fix); getsentry/relay (follow-up policy fix)
