Skip to content

fix(ai): preserve Error message, stack, and cause chain on $ai_error#3584

Open
Ashut0sh-mishra wants to merge 2 commits into
PostHog:mainfrom
Ashut0sh-mishra:fix/ai-error-serialization
Open

fix(ai): preserve Error message, stack, and cause chain on $ai_error#3584
Ashut0sh-mishra wants to merge 2 commits into
PostHog:mainfrom
Ashut0sh-mishra:fix/ai-error-serialization

Conversation

@Ashut0sh-mishra
Copy link
Copy Markdown

Problem

JSON.stringify on a JavaScript Error instance returns {} because
name, message, stack, and cause are non-enumerable. The
captureAiGeneration path stored JSON.stringify(options.error) directly
on ``, so events captured via withTracing ended up with a
stripped-down blob — no message, no stack, no usable cause chain — making
`GatewayResponseError` and friends nearly impossible to debug from the
dashboard.

Closes #3556

Fix

  • New serializeError helper in packages/ai/src/utils.ts that
    normalizes Error instances to plain objects (name / message /
    stack / cause plus own enumerable properties such as statusCode,
    response, validationError), recurses through the cause chain and
    through nested plain objects, and guards against circular cause graphs
    with a WeakSet.
  • captureAiGeneration now passes the error through serializeError
    before JSON.stringify, so `` keeps the same string shape
    (backwards compatible with anything that already reads it as JSON text)
    but carries the actual error data.

Tests

Added to packages/ai/tests/captureAiGeneration.test.ts:

  • A three-level cause chain Error is captured and the test asserts
    message, stack, statusCode, and the recursive cause levels
    are all present on the parsed `` JSON.
  • A circular cause chain is captured and the test asserts the inner cycle
    is replaced with '[Circular]' rather than overflowing the stack.

The existing error path parametric tests are untouched and still
assert `` is a string, so this is backwards compatible.

Reproducer (from the issue)

The original reproducer aborts mid-stream against @ai-sdk/gateway;
after this change the dashboard's `` payload contains the
GatewayResponseError `message`, `statusCode`, and the full
`validationError` -> `ZodError` cause chain instead of a blob like
`{ statusCode: 500, cause: {}, name: 'GatewayResponseError' }`.

JSON.stringify on an Error instance returns {} because
ame,
message, stack, and cause are non-enumerable properties. As a result,
$ai_generation events captured via withTracing were storing a stripped
down blob on $ai_error (and on the deeper cause levels), making it
nearly impossible to debug what actually went wrong when an SDK like
@ai-sdk/gateway threw a chained error.

Add a serializeError helper that explicitly normalizes Error instances to
plain objects (name / message / stack / cause + own enumerable properties
like statusCode, 
esponse, �alidationError), recursing into the cause
chain and into nested plain objects. A WeakSet of seen references guards
against circular cause graphs.

captureAiGeneration now passes the error through serializeError before
JSON.stringify, so $ai_error keeps the same string shape (backwards
compatible with consumers that expect a string) but carries the real data.

Closes PostHog#3556

Co-authored-by: nik464 <nik464@users.noreply.github.com>
@vercel
Copy link
Copy Markdown

vercel Bot commented May 12, 2026

@Ashut0sh-mishra is attempting to deploy a commit to the PostHog Team on Vercel.

A member of the Team first needs to authorize it.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 12, 2026

Prompt To Fix All With AI
Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 1
packages/ai/src/utils.ts:605-638
**`seen` marks shared (non-circular) references as `[Circular]`**

The `WeakSet` is shared across the entire object-graph traversal, not just the current call-stack path. This means any object that appears at two distinct locations in the error structure — a "diamond" reference (e.g. the same `response` object attached both to `err.response` and `err.cause.response`) — will serialize correctly the first time and then produce `'[Circular]'` the second time, even though there is no actual cycle. A path-tracking approach (passing the current ancestor set only, or cloning the set at each branch point) would give accurate output, though it comes at a higher allocation cost. For the primary cause-chain use case the current approach is fine, but it's worth documenting this limitation in the JSDoc since the function is exported.

Reviews (1): Last reviewed commit: "fix(ai): preserve Error message, stack, ..." | Re-trigger Greptile

Comment thread packages/ai/src/utils.ts
Addresses greptile review on PostHog#3584: the WeakSet is shared across the whole
traversal, so a diamond (non-circular) reference will be reported as
`[Circular]` after its first visit. Documented this as intentional
(keeps allocation low; correct for the primary linear cause-chain use case)
so future callers aren't surprised.

Co-authored-by: nik464 <nik464@users.noreply.github.com>
@Ashut0sh-mishra
Copy link
Copy Markdown
Author

Thanks @greptile-apps! Good point on the diamond-reference case. Documented the limitation in the JSDoc as intentional (keeps allocation O(visited); the linear cause-chain is the primary use case). Pushed.

1 similar comment
@Ashut0sh-mishra
Copy link
Copy Markdown
Author

Thanks @greptile-apps! Good point on the diamond-reference case. Documented the limitation in the JSDoc as intentional (keeps allocation O(visited); the linear cause-chain is the primary use case). Pushed.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 12, 2026

The JSDoc update is clear and thorough — the note about seen being shared across the full traversal (vs. scoped to the current ancestor path) and the explicit callout of the diamond-reference trade-off is exactly the right level of documentation for an intentional behavioral constraint. The implementation looks solid. LGTM 👍

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 12, 2026

The JSDoc reads clearly — the trade-off is well-stated and the "primary use case (linear cause chains)" framing sets the right expectation. The implementation looks solid: WeakSet allocation is indeed O(visited) and the behavior is deterministic and predictable for the actual production case (SDK error chains). LGTM on this approach. 👍

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 12, 2026

Prompt To Fix All With AI
Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 1
packages/ai/src/utils.ts:633-637
The `key in obj` check walks up the prototype chain of `obj`, so any enumerable own property on the Error whose key coincidentally matches a property on `Object.prototype` (e.g. `toString`, `valueOf`, `constructor`) would be silently skipped. `Object.hasOwn` limits the check to own properties of `obj`, which is what's actually intended here.

```suggestion
    for (const key of Object.keys(err)) {
      if (!Object.hasOwn(obj, key)) {
        obj[key] = serializeError((err as Record<string, unknown>)[key], seen)
      }
    }
```

Reviews (2): Last reviewed commit: "docs(ai): document shared-seen-set limit..." | Re-trigger Greptile

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

@posthog/ai: $ai_error loses error fields (message, stack) and cause chain because json.stringify is used directly on error instances

1 participant