Skip to content

Commit 4faf11b

Browse files
authored
fix: improve Sentry issue grouping to eliminate duplicate issues (#1028)
## Problem The CLI's Sentry project had 300+ unresolved issues with ~30 duplicate groups caused by gaps in the fingerprinting system: 1. **Missing `cli_error.kind` tags** — `CliError` (base), `HostScopeError`, and `WizardError` had no grouping key, so every unique error message created a separate Sentry issue 2. **Non-CliError exceptions** (TypeError, Error, WizardCancelledError) only got `cli_error.class` with no kind tag 3. **`extractResourceKind()`** didn't strip bare slugs (`in my-org`) or entity-name prefixed slugs (`Organization my-company`) 4. **EBADF noise** (~2,867 events across 3 duplicate issues) from stdin reopen fd errors 5. **No API endpoint normalization** for sub-grouping ApiError by endpoint shape ## Solution ### Code changes (2 source files) **`src/lib/error-reporting.ts`:** - Extract `deriveErrorKind()` from `setGroupingTags()` to reduce cognitive complexity - Add `cli_error.kind` for `HostScopeError` (`host_scope`), `WizardError` (`wizard`), and base `CliError` (4-word message prefix) - Add `normalizeEndpoint()` — parameterizes variable API path segments, sets `cli_error.api_endpoint` tag - Expand `extractResourceKind()` — strip `in <slug>` (not just `in org/project`), strip hyphenated slugs after entity names - Enrich `enrichEventWithGroupingTags()` — set `cli_error.kind` from message prefix for non-CliError exceptions **`src/lib/telemetry.ts`:** - Add `isEbadfError()` — detects EBADF errors (bad file descriptor from stdin reopen) - Drop EBADF events in `beforeSend` (same class of OS noise as EPIPE) ### Sentry project housekeeping (already done) - **Merged 18 duplicate groups** (~50 issues merged into canonical issues) - **Resolved 4 issues** as `resolved in next release` (EBADF ×3, replaceAll TypeError ×1) ### Tests - 34 new tests (unit + property) covering all new grouping logic - All 7258 existing tests continue to pass ## Expected Impact | Metric | Before | After | |--------|--------|-------| | Error classes without `cli_error.kind` | 5 + all non-CliError | 0 | | EBADF events/week | ~2,867 | 0 (dropped) | | Duplicate issue groups | ~30 | 0 (merged + prevented) |
1 parent d7440ff commit 4faf11b

6 files changed

Lines changed: 511 additions & 46 deletions

File tree

.lore.md

Lines changed: 51 additions & 0 deletions
Large diffs are not rendered by default.

src/lib/error-reporting.ts

Lines changed: 133 additions & 37 deletions
Original file line numberDiff line numberDiff line change
@@ -21,14 +21,17 @@ import * as Sentry from "@sentry/node-core/light";
2121
import {
2222
ApiError,
2323
AuthError,
24+
CliError,
2425
ContextError,
2526
DeviceFlowError,
27+
HostScopeError,
2628
OutputError,
2729
ResolutionError,
2830
SeerError,
2931
TimeoutError,
3032
UpgradeError,
3133
ValidationError,
34+
WizardError,
3235
} from "./errors.js";
3336

3437
// ---------------------------------------------------------------------------
@@ -101,6 +104,43 @@ function recordSilencedError(error: unknown, reason: SilenceReason): void {
101104
// Grouping tags
102105
// ---------------------------------------------------------------------------
103106

107+
/** Endpoint normalization patterns — compiled once at module scope. */
108+
const ENDPOINT_PATTERNS: [RegExp, string][] = [
109+
[/\/organizations\/[^/]+/, "/organizations/{org}"],
110+
[/\/projects\/[^/]+\/[^/]+/, "/projects/{org}/{project}"],
111+
[/\/issues\/[^/]+/, "/issues/{id}"],
112+
[/\/events\/[^/]+/, "/events/{id}"],
113+
[/\/groups\/[^/]+/, "/groups/{id}"],
114+
[/\/releases\/[^/]+/, "/releases/{version}"],
115+
[/\/teams\/[^/]+\/[^/]+/, "/teams/{org}/{team}"],
116+
[/\/dashboards\/[^/]+/, "/dashboards/{id}"],
117+
[/\/customers\/[^/]+/, "/customers/{org}"],
118+
];
119+
120+
/**
121+
* Strip remaining bare numeric segments (e.g. /12345/) but preserve
122+
* the API version prefix /0/ which is always the second segment.
123+
*/
124+
const BARE_NUMERIC_SEGMENT_RE = /(?<=\/api\/0\/.*)\/\d+(?=\/|$)/g;
125+
126+
/**
127+
* Normalize an API endpoint path by parameterizing variable segments.
128+
*
129+
* Replaces org slugs, project slugs, issue IDs, event IDs, and other
130+
* entity identifiers with placeholders so that server-side fingerprint
131+
* rules can sub-group `ApiError` by endpoint shape rather than exact path.
132+
*
133+
* `"/api/0/projects/my-org/my-project/events/abc123/"` →
134+
* `"/api/0/projects/{org}/{project}/events/{id}/"`
135+
*/
136+
export function normalizeEndpoint(endpoint: string): string {
137+
let result = endpoint;
138+
for (const [pattern, replacement] of ENDPOINT_PATTERNS) {
139+
result = result.replace(pattern, replacement);
140+
}
141+
return result.replace(BARE_NUMERIC_SEGMENT_RE, "/{id}");
142+
}
143+
104144
/**
105145
* Strip quoted substrings, numeric/hex IDs, and org/project paths from a
106146
* resource string to produce a stable "kind" for grouping.
@@ -111,14 +151,24 @@ function recordSilencedError(error: unknown, reason: SilenceReason): void {
111151
* `"not found in neurio/installer-app"` → `"not found"`
112152
*/
113153
export function extractResourceKind(resource: string): string {
114-
return resource
115-
.replace(/'[^']*'/g, "")
116-
.replace(/"[^"]*"/g, "")
117-
.replace(/\b[0-9a-f]{16,32}\b/gi, "")
118-
.replace(/\bin\s+[\w-]+\/[\w-]+/g, "")
119-
.replace(/\b\d+\b/g, "")
120-
.replace(/\s+/g, " ")
121-
.trim();
154+
return (
155+
resource
156+
.replace(/'[^']*'/g, "")
157+
.replace(/"[^"]*"/g, "")
158+
.replace(/\b[0-9a-f]{16,32}\b/gi, "")
159+
.replace(/\bin\s+[\w-]+(?:\/[\w-]+)*/g, "")
160+
// Strip hyphenated slugs after known entity names (e.g., "Organization my-company").
161+
// Requires at least one hyphen to avoid stripping English words ("Project not found").
162+
// Safe for current callers: resource values with slugs use quotes (stripped above),
163+
// and headline values don't start with entity names.
164+
.replace(
165+
/\b(Organization|Dashboard|Dashboards|Project|Team)\s+[\w][\w-]*-[\w-]*/gi,
166+
"$1"
167+
)
168+
.replace(/\b\d+\b/g, "")
169+
.replace(/\s+/g, " ")
170+
.trim()
171+
);
122172
}
123173

124174
/**
@@ -140,6 +190,64 @@ export function extractMessagePrefix(message: string, maxWords = 3): string {
140190
.join(" ");
141191
}
142192

193+
/**
194+
* Derive a stable `cli_error.kind` grouping key from an error instance.
195+
*
196+
* Returns `undefined` when the error is not a recognized CLI error class
197+
* (the caller should still set `cli_error.class` for basic grouping).
198+
*/
199+
function deriveErrorKind(error: Error): string | undefined {
200+
if (error instanceof ContextError) {
201+
return error.resource;
202+
}
203+
if (error instanceof ResolutionError) {
204+
return (
205+
extractResourceKind(error.resource) +
206+
" " +
207+
extractResourceKind(error.headline)
208+
);
209+
}
210+
// Fall back to the first few words of the message when no field is set
211+
// (e.g. validateHexId throws with no `field`, so using field would
212+
// collapse every unfielded ValidationError into one group).
213+
if (error instanceof ValidationError) {
214+
return error.field ?? extractMessagePrefix(error.message);
215+
}
216+
if (error instanceof ApiError) {
217+
return String(error.status);
218+
}
219+
if (error instanceof SeerError) {
220+
return error.reason;
221+
}
222+
if (error instanceof AuthError) {
223+
return error.reason;
224+
}
225+
if (error instanceof UpgradeError) {
226+
return error.reason;
227+
}
228+
if (error instanceof DeviceFlowError) {
229+
return error.code;
230+
}
231+
if (error instanceof TimeoutError) {
232+
return "timeout";
233+
}
234+
if (error instanceof HostScopeError) {
235+
return "host_scope";
236+
}
237+
if (error instanceof WizardError) {
238+
return "wizard";
239+
}
240+
// Catch-all for bare CliError — must be checked AFTER all subclasses
241+
// because instanceof matches the entire prototype chain.
242+
// ConfigError and OutputError intentionally fall through here:
243+
// ConfigError has no structured field beyond message; OutputError is
244+
// silenced by classifySilenced() before reaching deriveErrorKind().
245+
if (error instanceof CliError) {
246+
return extractMessagePrefix(error.message, 4);
247+
}
248+
return;
249+
}
250+
143251
/**
144252
* Set `cli_error.*` tags on a Sentry scope for an error that will be
145253
* captured. These tags are matched by server-side fingerprint rules to
@@ -149,6 +257,7 @@ export function extractMessagePrefix(message: string, maxWords = 3): string {
149257
* - `cli_error.class` — error class name (e.g. `"ContextError"`)
150258
* - `cli_error.kind` — stable grouping key derived from structured fields
151259
* - `cli_error.api_status` — HTTP status (ApiError only)
260+
* - `cli_error.api_endpoint` — normalized API path (ApiError only)
152261
*/
153262
function setGroupingTags(scope: Sentry.Scope, error: unknown): void {
154263
if (!(error instanceof Error)) {
@@ -157,36 +266,16 @@ function setGroupingTags(scope: Sentry.Scope, error: unknown): void {
157266

158267
scope.setTag("cli_error.class", error.name);
159268

160-
if (error instanceof ContextError) {
161-
scope.setTag("cli_error.kind", error.resource);
162-
} else if (error instanceof ResolutionError) {
163-
scope.setTag(
164-
"cli_error.kind",
165-
extractResourceKind(error.resource) +
166-
" " +
167-
extractResourceKind(error.headline)
168-
);
169-
} else if (error instanceof ValidationError) {
170-
// Fall back to the first few words of the message when no field is set
171-
// (e.g. validateHexId throws with no `field`, so using field would
172-
// collapse every unfielded ValidationError into one group).
173-
scope.setTag(
174-
"cli_error.kind",
175-
error.field ?? extractMessagePrefix(error.message)
176-
);
177-
} else if (error instanceof ApiError) {
269+
const kind = deriveErrorKind(error);
270+
if (kind !== undefined) {
271+
scope.setTag("cli_error.kind", kind);
272+
}
273+
274+
if (error instanceof ApiError) {
178275
scope.setTag("cli_error.api_status", String(error.status));
179-
scope.setTag("cli_error.kind", String(error.status));
180-
} else if (error instanceof SeerError) {
181-
scope.setTag("cli_error.kind", error.reason);
182-
} else if (error instanceof AuthError) {
183-
scope.setTag("cli_error.kind", error.reason);
184-
} else if (error instanceof UpgradeError) {
185-
scope.setTag("cli_error.kind", error.reason);
186-
} else if (error instanceof DeviceFlowError) {
187-
scope.setTag("cli_error.kind", error.code);
188-
} else if (error instanceof TimeoutError) {
189-
scope.setTag("cli_error.kind", "timeout");
276+
if (error.endpoint) {
277+
scope.setTag("cli_error.api_endpoint", normalizeEndpoint(error.endpoint));
278+
}
190279
}
191280
}
192281

@@ -273,5 +362,12 @@ export function enrichEventWithGroupingTags(
273362
event.tags = event.tags ?? {};
274363
event.tags["cli_error.class"] = exc.type;
275364

365+
// Set kind from exception message prefix so server-side rules can group
366+
// non-CliError exceptions (TypeError, Error, WizardCancelledError, etc.)
367+
// that bypass reportCliError (uncaught exceptions, unhandled rejections).
368+
if (exc.value) {
369+
event.tags["cli_error.kind"] = extractMessagePrefix(exc.value, 4);
370+
}
371+
276372
return event;
277373
}

src/lib/telemetry.ts

Lines changed: 36 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -313,6 +313,35 @@ export function isEpipeError(event: Sentry.ErrorEvent): boolean {
313313
return false;
314314
}
315315

316+
/**
317+
* Detect EBADF (bad file descriptor) errors in Sentry events.
318+
*
319+
* These occur when the init wizard's stdin reopen (`stdin-reopen.ts`) queues
320+
* reads on a destroyed file descriptor. Same class of OS-level noise as EPIPE
321+
* — not actionable, just different fd numbers producing duplicate issues.
322+
*
323+
* @internal Exported for testing
324+
*/
325+
export function isEbadfError(event: Sentry.ErrorEvent): boolean {
326+
const exceptions = event.exception?.values;
327+
if (exceptions) {
328+
for (const ex of exceptions) {
329+
if (ex.value?.includes("EBADF")) {
330+
return true;
331+
}
332+
}
333+
}
334+
335+
const systemError = event.contexts?.node_system_error as
336+
| { code?: string }
337+
| undefined;
338+
if (systemError?.code === "EBADF") {
339+
return true;
340+
}
341+
342+
return false;
343+
}
344+
316345
/**
317346
* Check if an error is a user-caused (401–499) API error.
318347
*
@@ -612,6 +641,13 @@ export function initSentry(
612641
return null;
613642
}
614643

644+
// EBADF errors come from the init wizard's stdin reopen queuing reads
645+
// on a destroyed fd. Same class of OS-level noise as EPIPE — different
646+
// fd numbers just produce duplicate issues. Not actionable — drop them.
647+
if (isEbadfError(event)) {
648+
return null;
649+
}
650+
615651
// Normalize relative frame paths to absolute. Bun's compiled binaries
616652
// with sourcemap: "linked" produce relative paths like "dist-bin/bin.js"
617653
// in Error.stack. Sentry's symbolicator only matches absolute paths

test/lib/error-reporting.property.test.ts

Lines changed: 44 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -25,6 +25,9 @@ const slugArb = array(
2525
.map((chars) => chars.join(""))
2626
.filter((s) => !(s.startsWith("-") || s.endsWith("-")) && s.length > 0);
2727

28+
/** Slug that contains at least one hyphen (matches the entity-name strip regex). */
29+
const hyphenatedSlugArb = slugArb.filter((s) => s.includes("-"));
30+
2831
/** 32-character lowercase hex id (trace/event/log id). */
2932
const hexIdArb = stringMatching(/^[0-9a-f]{32}$/).filter(
3033
(s) => s.length === 32
@@ -130,4 +133,45 @@ describe("extractResourceKind — property tests", () => {
130133
{ numRuns: DEFAULT_NUM_RUNS }
131134
);
132135
});
136+
137+
test("bare slug after 'in' (no slash) is stripped for any slug", () => {
138+
fcAssert(
139+
property(slugArb, slugArb, (a, b) => {
140+
expect(extractResourceKind(`not found in ${a}`)).toBe(
141+
extractResourceKind(`not found in ${b}`)
142+
);
143+
expect(extractResourceKind(`not found in ${a}`)).toBe("not found");
144+
}),
145+
{ numRuns: DEFAULT_NUM_RUNS }
146+
);
147+
});
148+
149+
test("hyphenated slug after entity name is stripped for any slug", () => {
150+
fcAssert(
151+
property(hyphenatedSlugArb, hyphenatedSlugArb, (a, b) => {
152+
expect(extractResourceKind(`Organization ${a}`)).toBe(
153+
extractResourceKind(`Organization ${b}`)
154+
);
155+
expect(extractResourceKind(`Organization ${a}`)).toBe("Organization");
156+
}),
157+
{ numRuns: DEFAULT_NUM_RUNS }
158+
);
159+
});
160+
161+
test("Dashboard with numeric ID and slug produces same kind", () => {
162+
fcAssert(
163+
property(
164+
numericIdArb,
165+
slugArb,
166+
numericIdArb,
167+
slugArb,
168+
(n1, s1, n2, s2) => {
169+
expect(extractResourceKind(`Dashboard ${n1} in ${s1}`)).toBe(
170+
extractResourceKind(`Dashboard ${n2} in ${s2}`)
171+
);
172+
}
173+
),
174+
{ numRuns: DEFAULT_NUM_RUNS }
175+
);
176+
});
133177
});

0 commit comments

Comments
 (0)