feat(ai): add $ai_completion_id, $ai_system_fingerprint, $ai_request_id to OpenAI events #3306
johnsykim wants to merge 2 commits into
Conversation
…id to OpenAI events
Captures OpenAI response metadata (response.id, system_fingerprint, _request_id)
in $ai_generation events, enabling direct correlation between PostHog events
and OpenAI's Logs dashboard (platform.openai.com/logs/{completion_id}).
@johnsykim is attempting to deploy a commit to the PostHog Team on Vercel. A member of the Team first needs to authorize it.
Important Files Changed
This is a comment left during a code review.
Path: packages/ai/tests/openai.test.ts
Line: 638
Comment:
**Missing `$ai_request_id` coverage for Responses API paths**
The PR description's test plan says "Verify `$ai_completion_id` appears for Responses API (create + parse)" but only the `parse` path is tested here. The `responses.create` non-streaming path (tested at ~line 1619 for web search) has no assertion for `$ai_completion_id`.
Additionally, the implementation adds `requestId: (result as any)._request_id` for both the `responses.create` and `responses.parse` paths (`index.ts` lines 557–561 and 630–634, `azure.ts` lines 489–493 and 558–560), but there is no test verifying that `$ai_request_id` is captured when `_request_id` is present on a Responses API result. The `mockOpenAiParsedResponse` does not set `_request_id`, so the positive case is never exercised.
Consider adding `_request_id` to `mockOpenAiParsedResponse` and asserting `$ai_request_id` here (and in the `responses.create` web-search test, or in a dedicated `responses.create` test):
```suggestion
expect(properties['$ai_completion_id']).toBe('test-parsed-response-id')
expect(properties['$ai_request_id']).toBe('req_test-parsed-request-id')
```
(After also adding `_request_id: 'req_test-parsed-request-id'` to `mockOpenAiParsedResponse`.)
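To make the suggested fixture change concrete, here is a simplified sketch of how the mock and the assertions would fit together. The types below are illustrative stand-ins, not the real `ChatCompletion`/parsed-response types used in the test suite:

```typescript
// Simplified stand-in for the parsed-response fixture shape (illustrative only)
type ParsedResponseFixture = {
  id: string
  // _request_id mirrors the x-request-id header the OpenAI SDK attaches
  _request_id?: string
}

const mockOpenAiParsedResponse: ParsedResponseFixture = {
  id: 'test-parsed-response-id',
  _request_id: 'req_test-parsed-request-id',
}

// The wrapper maps response fields onto event properties roughly like this
const properties = {
  $ai_completion_id: mockOpenAiParsedResponse.id,
  $ai_request_id: mockOpenAiParsedResponse._request_id,
}
```

With `_request_id` set on the fixture, both the positive `$ai_completion_id` and `$ai_request_id` assertions become exercisable.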
How can I resolve this? If you propose a fix, please make it concise.

Reviews (3): Last reviewed commit: "fix: address Greptile review feedback"
- Remove dead `| null` from streaming `systemFingerprintFromResponse` type
- Add inline comments explaining `_request_id` is the `x-request-id` header
@greptile review
This PR hasn't seen activity in a week! Should it be merged, closed, or further worked on? If you want to keep it open, post a comment or remove the |
@haacked @danielbachhuber Sorry for random tagging (please tell me if there is a better PostHog POC to reach out to). What's the best way to get the PostHog team's attention on this PR? Is there a certain review and release process I have to follow?
@johnsykim Hi! I'm no longer with PostHog but, based on the history of the files, @richardsolomou, @carlos-marchal-ph, or @Radu-Raicea might be able to give you a review.
richardsolomou left a comment
Thanks for putting this together — the motivation (correlating PostHog events to OpenAI's logs dashboard) is a real pain point and the implementation is clean. Chatted with the team about the schema shape, and we'd like to adjust the approach before merging.
Design ask
- Move OpenAI-specific fields under a `$ai_provider_metadata` blob — `$ai_*` has been provider-agnostic so far, and `system_fingerprint` / `request_id` are OpenAI-only concepts. Rather than living at the top level, they should go under a new `$ai_provider_metadata` property so each provider wrapper can surface its own metadata without polluting the shared namespace. Something like:

  ```
  $ai_provider_metadata: {
    system_fingerprint: '...',
    request_id: '...',
  }
  ```

  This also sets the pattern for Anthropic / Gemini wrappers to follow later.
- Keep `$ai_completion_id` at the top level — response IDs generalize cleanly (Anthropic and Gemini both have equivalents), so this one belongs in the shared schema.
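A sketch of what captured event properties might look like under this split. The interface below is hypothetical (based on the shape proposed in this review, not actual `@posthog/ai` code):

```typescript
// Hypothetical event-property shape: shared fields stay top-level,
// provider-specific metadata nests under $ai_provider_metadata.
interface AiGenerationProperties {
  $ai_completion_id: string // shared: Anthropic/Gemini have equivalents
  $ai_provider_metadata?: {
    system_fingerprint?: string // OpenAI-only
    request_id?: string // OpenAI-only, from the x-request-id header
  }
}

const example: AiGenerationProperties = {
  $ai_completion_id: 'chatcmpl-abc123',
  $ai_provider_metadata: {
    system_fingerprint: 'fp_44709d6fcb',
    request_id: 'req_abc123',
  },
}
```

An Anthropic or Gemini wrapper would reuse the same top-level `$ai_completion_id` while supplying its own keys under `$ai_provider_metadata`.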
On the app side: we think rendering a link to the OpenAI log from the event inspector is valuable and we'll pick that up separately once this is merged!
Blocking
- Build fails with TS2353 on `packages/ai/tests/openai.test.ts:292` — `_request_id` isn't a public property of `ChatCompletion`, so the literal is rejected under strict TS. `pnpm build` fails on this branch and passes on `main`. CI for external PRs only runs Wiz/Graphite/Vercel, which is why this slipped past. Easiest fix:

  ```ts
  let mockOpenAiChatResponse: ChatCompletion & { _request_id?: string } = {} as ChatCompletion
  ```
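A self-contained illustration of this error and fix, using a stand-in interface in place of the SDK's `ChatCompletion` type:

```typescript
// Stand-in for the OpenAI SDK's ChatCompletion type (illustrative only)
interface ChatCompletionLike {
  id: string
}

// With a plain ChatCompletionLike annotation, the `_request_id` key in an
// object literal would fail TypeScript's excess-property check (TS2353)
// under strict settings. Widening the annotation with an intersection type
// makes the extra, optional field legal:
let mockOpenAiChatResponse: ChatCompletionLike & { _request_id?: string } = {
  id: 'chatcmpl-test',
  _request_id: 'req_test',
}
```

The intersection keeps the mock assignable everywhere a `ChatCompletionLike` is expected while letting tests set `_request_id` explicitly.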
Suggestions
- Duplicated `(result as any)._request_id` pattern — appears 6 times across `packages/ai/src/openai/index.ts` and `packages/ai/src/openai/azure.ts`. A small helper (`extractRequestId(result)`) would consolidate the cast and the explanatory comment in one place.
- Streaming paths don't capture `requestId` — only `completionId` / `systemFingerprint` are pulled from chunks. The OpenAI SDK exposes `_request_id` on the aggregated stream response (via `.withResponse()`), so there's room to capture it for streams too. Fine as a follow-up.
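The suggested helper might look roughly like this. The name and shape are the review's suggestion, not code from the PR:

```typescript
// Hypothetical helper consolidating the repeated (result as any)._request_id
// casts into one place, alongside the explanatory comment.
function extractRequestId(result: unknown): string | undefined {
  // _request_id is attached by the OpenAI SDK from the x-request-id response
  // header; it is not part of the public response types, hence the cast.
  return (result as { _request_id?: string } | null | undefined)?._request_id
}
```

Each call site then becomes `requestId: extractRequestId(result)`, and the cast lives in exactly one spot.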
Let me know if anything's unclear — happy to answer questions as you work through the restructure.
Thanks a lot @richardsolomou. I'll raise a revision sometime soon.
Hi @johnsykim, taking this over from @richardsolomou! I agree with Richard's review. I'm going to mark this as a draft in the meantime; un-draft it yourself when it's ready for us again, and we'll get pinged to take a second look :)
Summary
New properties on `$ai_generation` events for OpenAI and Azure OpenAI wrappers:
- `$ai_completion_id` — `response.id` (e.g. `chatcmpl-xxx`, `resp_xxx`)
- `$ai_system_fingerprint` — `response.system_fingerprint`
- `$ai_request_id` — `response._request_id` (from the OpenAI SDK's `x-request-id` header)
- Enables navigating to OpenAI's Logs dashboard (`platform.openai.com/logs/{completion_id}`)

Test plan
- Verify `$ai_completion_id`, `$ai_system_fingerprint`, `$ai_request_id` appear in captured event properties for non-streaming Chat Completions
- Verify `$ai_completion_id`, `$ai_system_fingerprint` appear for streaming Chat Completions
- Verify `$ai_completion_id` appears for Responses API (create + parse)

Background & motivation
Upstream: Add `$ai_completion_id` to `@posthog/ai` OpenAI Wrapper

The `@posthog/ai` package wraps the OpenAI SDK and auto-emits `$ai_generation` events to PostHog. However, it does not capture the OpenAI response ID (`response.id`), which is the primary key used in OpenAI's Logs dashboard at `platform.openai.com/logs/{completion_id}`. This makes it impossible to navigate from a PostHog `$ai_generation` event directly to the corresponding OpenAI log entry.

Why This Matters
- Without `response.id`, there is no way to join PostHog events with OpenAI's own logging system.
- OpenAI support asks for the `x-request-id` — having this in PostHog makes it easy to file tickets with the right correlation ID.

Current Workaround
We use a multi-hop join via entity IDs:
1. Look up the PostHog event's entity ID (e.g. `rolePlayId`)
2. Query the `ProviderCosts` database table by the same entity ID
3. Read the `completionId` from the `context` JSON field
4. Open `platform.openai.com/logs/{completionId}`

This only works for the ~6 call sites that write to `ProviderCosts` with a `completionId`. It does not scale to all call sites.

References
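The multi-hop join described above can be sketched roughly as follows. The row shape and function are illustrative stand-ins for the real table and query, using only the field names mentioned in the description:

```typescript
// Illustrative row shape for the described ProviderCosts table
interface ProviderCostsRow {
  entityId: string // e.g. a rolePlayId
  context: string // JSON blob that includes completionId
}

// Stand-in for the real database query: resolve an entity ID to an
// OpenAI Logs URL via the ProviderCosts table.
function openAiLogUrl(rows: ProviderCostsRow[], entityId: string): string | undefined {
  const row = rows.find((r) => r.entityId === entityId)
  if (!row) return undefined
  const parsed = JSON.parse(row.context) as { completionId?: string }
  return parsed.completionId
    ? `https://platform.openai.com/logs/${parsed.completionId}`
    : undefined
}
```

Capturing `$ai_completion_id` directly on the event collapses these hops into a single property read.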
- Chat Completions logs: https://platform.openai.com/logs?api=chat-completions
- Response ID formats: `chatcmpl-{base62}` (Chat Completions API), `resp_{base62}` (Responses API)
- `_request_id`: automatically attached from the `x-request-id` response header by the `openai` npm package