docs: clarify OpenAI Python parse vs response_format guidance#2884
jannikmaierhoefer wants to merge 1 commit into main from
Conversation
openai-python>=1.92.0 graduated parse/stream out of beta. The Langfuse SDK already instruments both client.chat.completions.parse (stable) and client.beta.chat.completions.parse (legacy), so the previous "use response_format with chat.completions.create instead of the Beta API" guidance was outdated.

Update the OpenAI Python integration page and the structured output cookbook to recommend the stable parse helper and scope the beta caveat to older SDKs.

Reported by David Traina (Ramp) via support.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Langfuse-Session: https://cloud.langfuse.com/project/cloramnkj0002jz088vzn1ja4/sessions/b55454d8-1e5e-477d-a6ba-f81f65fe959c
@claude review
Docs-only clarification of the OpenAI Python parse vs response_format guidance — the inline nits are minor wording suggestions and the underlying changes are correct.
Overview
This PR updates the OpenAI Python integration docs (content/integrations/model-providers/openai-py.mdx) and the matching structured-output cookbook (.md and .ipynb) to recommend client.chat.completions.parse(...) for openai>=1.92.0, scope the beta caveat to older SDK versions, and keep the response_format + type_to_response_format_param example as a fallback. It also adds a new parse example with name/metadata Langfuse attributes.
Security risks
None. This change touches only Markdown/MDX/notebook content with no runtime, auth, or user-input handling implications.
Level of scrutiny
Low — documentation-only, no code paths or configuration are affected. The factual claim being introduced (parse graduated out of beta in openai-python v1.92.0 and the Langfuse SDK instruments both the stable and beta parse paths) is consistent with the linked release notes and the rest of the integration docs.
Other factors
The two inline nits posted are wording-level: (1) the new bullet groups langfuse_session_id alongside direct kwargs even though it is a metadata key, and (2) the #### Structured Output subsection now lives under an ### OpenAI Beta APIs parent whose intro still says beta APIs require manual @observe() wrapping. Neither is incorrect documentation per se — the canonical 'Custom trace properties' table and a correct metadata={...} example are right above and below the new prose — and a Vercel preview is already building for visual verification. These are the kind of small editorial tweaks a maintainer can take or leave; they don't gate approval.
> - **`openai>=1.92.0` (recommended):** use `client.chat.completions.parse(...)`. OpenAI graduated `parse` and `stream` out of beta in [v1.92.0](https://github.com/openai/openai-python/releases/tag/v1.92.0), and Langfuse wraps the stable `openai.resources.chat.completions.Completions.parse` (and the async variant). You can pass a Pydantic model directly via `response_format` and still set Langfuse attributes such as `name`, `metadata`, `langfuse_session_id`, etc.
> - **`openai<1.92.0` (legacy):** the parse helper is only available under `client.beta.chat.completions.parse(...)`. Langfuse also wraps the beta path on these older versions, so attributes like `name` and `metadata` work there too.
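The version cutoff in the bullets above can be expressed as a tiny helper. This is an illustrative sketch only: `parse_entry_point` is a hypothetical function, not part of the Langfuse or OpenAI SDKs.

```python
# Hypothetical helper mirroring the docs' guidance: pick the parse entry
# point based on the installed openai-python version (parse/stream
# graduated out of beta in v1.92.0).
def parse_entry_point(openai_version: str) -> str:
    major, minor = (int(p) for p in openai_version.split(".")[:2])
    if (major, minor) >= (1, 92):
        return "client.chat.completions.parse"   # stable, recommended
    return "client.beta.chat.completions.parse"  # legacy beta path

print(parse_entry_point("1.92.0"))  # → client.chat.completions.parse
```

Both entry points are instrumented by Langfuse, so the helper only decides which spelling a given SDK version accepts.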
🟡 The new note groups name, metadata, and langfuse_session_id together as 'Langfuse attributes' (line 386 and again in the cookbook .md/.ipynb), implying all three are direct kwargs of chat.completions.parse(...). However, the same MDX file's 'Custom trace properties' table (lines 206-211) lists only name/metadata/trace_id/parent_observation_id as direct kwargs, and the 'Setting trace attributes' section (line 230) shows that langfuse_session_id, langfuse_user_id, and langfuse_tags must live inside the metadata dict. A reader could try parse(..., langfuse_session_id='x') and have it silently dropped. Suggest rewording to something like 'attributes such as name and metadata (with langfuse_session_id, langfuse_user_id, langfuse_tags etc. nested inside metadata)' — note the existing example at line 416 already demonstrates the correct pattern with metadata={"langfuse_tags": [...]}.
The bug. The new prose at content/integrations/model-providers/openai-py.mdx lines 386-387 (and the same wording duplicated in content/guides/cookbook/integration_openai_structured_output.md at lines 58 and 193, plus the matching .ipynb cells) reads:
> You can pass a Pydantic model directly via `response_format` and still set Langfuse attributes such as `name`, `metadata`, `langfuse_session_id`, etc.
This sentence flattens two different things into one comma-separated list: name and metadata are real keyword arguments accepted by the Langfuse-wrapped OpenAI call, but langfuse_session_id is not — it is a key inside the metadata dict.
Why this contradicts the rest of the file. The same MDX has a 'Custom trace properties' table at lines 204-211 that lists exactly four direct kwargs: name, metadata, trace_id, parent_observation_id. langfuse_session_id is deliberately absent from that table. The 'Setting trace attributes (session_id, user_id, tags)' section that follows (lines 213-236) makes the correct usage explicit:
```python
metadata={
    "langfuse_session_id": "session_123",
    "langfuse_user_id": "user_456",
    "langfuse_tags": ["calculator"],
    ...
}
```

So the new prose contradicts the canonical documentation just 150 lines above it.
The new prose also contradicts its own example. The very next code block (line 416 in the MDX) does the right thing:
```python
completion = openai.chat.completions.parse(
    ...,
    name="extract-calendar-event",
    metadata={"langfuse_tags": ["structured-output"]},
)
```

Here `langfuse_tags` is correctly nested inside `metadata`, not passed as a direct kwarg. The example demonstrates the correct pattern; the prose above it does not.
Step-by-step proof of how a reader gets misled.

1. Reader lands on the new 'Structured Output' section because they want session/user attribution on a `parse` call.
2. They read the bullet: "You can pass a Pydantic model directly via `response_format` and still set Langfuse attributes such as `name`, `metadata`, `langfuse_session_id`, etc."
3. By analogy with `name="..."` (a direct kwarg shown in the example), they write:

   ```python
   completion = openai.chat.completions.parse(
       model="gpt-4o-2024-08-06",
       messages=[...],
       response_format=CalendarEvent,
       name="extract-calendar-event",
       langfuse_session_id="session_abc",  # ← reader infers this from the prose
   )
   ```

4. The wrapper recognizes `name` and consumes it. `langfuse_session_id` is not a recognized wrapper kwarg, so it is either passed through to the underlying OpenAI client (which will reject it as an unknown parameter, or silently ignore it depending on version), or simply dropped; in any case, it never makes it onto the trace as a session id.
5. The reader's traces have no session id and they don't know why; the docs that would tell them (lines 213-236) are now contradicted by the new bullet they trusted.
Cross-file impact. The cookbook .md at lines 58 and 193 has the same sentence, but in that file the disambiguating 'Custom trace properties' / 'Setting trace attributes' sections do not exist nearby — so a cookbook reader has even less chance of catching the issue. The .ipynb carries the same wording.
Fix. Reword along the lines of:

> ...Both paths support Langfuse attributes such as `name` and `metadata`. To attach a session id, user id, or tags, set them as keys inside `metadata` (`langfuse_session_id`, `langfuse_user_id`, `langfuse_tags`).

This keeps the reassurance that the stable parse path is fully instrumented, while no longer presenting metadata keys as if they were direct kwargs. The change is text-only and applies to: the new note in the MDX (around line 386), the same note in the cookbook .md (lines 58 and 193), and the corresponding cells in the .ipynb.
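The kwarg split the fix describes (only `name`, `metadata`, `trace_id`, `parent_observation_id` as direct kwargs; session/user/tags nested inside `metadata`) can be made concrete with a small sketch. `normalize_langfuse_kwargs` is a hypothetical illustration, not a Langfuse API.

```python
# Direct kwargs per the 'Custom trace properties' table in the docs.
DIRECT_KWARGS = {"name", "metadata", "trace_id", "parent_observation_id"}
# Keys that must live INSIDE the metadata dict, not as top-level kwargs.
METADATA_KEYS = {"langfuse_session_id", "langfuse_user_id", "langfuse_tags"}

def normalize_langfuse_kwargs(kwargs: dict) -> dict:
    """Hypothetical helper: move stray metadata-only keys into metadata."""
    fixed = dict(kwargs)
    metadata = dict(fixed.get("metadata", {}))
    for key in list(fixed):
        if key in METADATA_KEYS:
            metadata[key] = fixed.pop(key)  # relocate instead of dropping
    if metadata:
        fixed["metadata"] = metadata
    return fixed
```

A call written the misleading way, `normalize_langfuse_kwargs({"name": "x", "langfuse_session_id": "s"})`, comes back as `{"name": "x", "metadata": {"langfuse_session_id": "s"}}`, which is the shape the wrapper actually understands.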
> For **structured output parsing**, you have two fully instrumented options depending on your `openai` Python SDK version:
>
> If you rely on parsing Pydantic definitions for your `response_format`, you may leverage the `type_to_response_format_param` utility function from the OpenAI Python SDK to convert the Pydantic definition to a `response_format` dictionary. This is the same function the OpenAI Beta API uses to convert Pydantic definitions to `response_format` dictionaries.
>
> - **`openai>=1.92.0` (recommended):** use `client.chat.completions.parse(...)`. OpenAI graduated `parse` and `stream` out of beta in [v1.92.0](https://github.com/openai/openai-python/releases/tag/v1.92.0), and Langfuse wraps the stable `openai.resources.chat.completions.Completions.parse` (and the async variant). You can pass a Pydantic model directly via `response_format` and still set Langfuse attributes such as `name`, `metadata`, `langfuse_session_id`, etc.
> - **`openai<1.92.0` (legacy):** the parse helper is only available under `client.beta.chat.completions.parse(...)`. Langfuse also wraps the beta path on these older versions, so attributes like `name` and `metadata` work there too.
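For the fallback path, the shape of the dictionary that `type_to_response_format_param` produces can be sketched by hand, following OpenAI's documented JSON-schema `response_format` shape. `response_format_for` is a hypothetical helper used only for illustration.

```python
# Hand-built response_format dict in the json_schema shape documented by
# OpenAI's structured-outputs guide; this mirrors what
# type_to_response_format_param derives from a Pydantic model.
def response_format_for(name: str, schema: dict) -> dict:
    return {
        "type": "json_schema",
        "json_schema": {"name": name, "schema": schema, "strict": True},
    }

calendar_event_schema = {
    "type": "object",
    "properties": {"name": {"type": "string"}, "date": {"type": "string"}},
    "required": ["name", "date"],
    "additionalProperties": False,
}

rf = response_format_for("CalendarEvent", calendar_event_schema)
```

A dict like `rf` can then be passed as `response_format=` to `client.chat.completions.create(...)` on SDKs that lack the stable `parse` helper.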
🟡 The parent OpenAI Beta APIs section intro (just above line 381) still says "we fully support only the stable APIs in the OpenAI SDK. If you are using a beta API, you can still use the Langfuse SDK by wrapping the OpenAI SDK manually with the @observe() decorator", but the new Structured Output subsection placed under it now explicitly states that Langfuse instruments the legacy beta parse path too. Readers following the parent intro would unnecessarily wrap with @observe(). Consider updating the intro paragraph or moving Structured Output out from under the OpenAI Beta APIs header (since parse is no longer beta on openai>=1.92.0).
What the bug is. This PR rewrites the Structured Output subsection of content/integrations/model-providers/openai-py.mdx to explain that Langfuse instruments both the stable client.chat.completions.parse(...) (on openai>=1.92.0) and the legacy client.beta.chat.completions.parse(...) (on older SDKs). However, that subsection still lives under the parent ### OpenAI Beta APIs header, whose intro paragraph (untouched by this PR) reads:
> Since OpenAI beta APIs are changing frequently across versions, we fully support only the stable APIs in the OpenAI SDK. If you are using a beta API, you can still use the Langfuse SDK by wrapping the OpenAI SDK manually with the `@observe()` decorator.
The new subsection directly contradicts that intro by stating:
> **`openai<1.92.0` (legacy):** the parse helper is only available under `client.beta.chat.completions.parse(...)`. Langfuse also wraps the beta path on these older versions, so attributes like `name` and `metadata` work there too.
Why it's a problem. A reader who scans the page top-down hits the section header ("OpenAI Beta APIs"), reads the intro that says beta APIs need manual @observe() wrapping, and may stop reading there or carry that mental model forward. Even readers who continue then see the opposite claim a few lines later. Both messages can't be true at once.
Section structure is also stale. Since parse graduated out of beta in openai-python v1.92.0 (the very point this PR is making), the Structured Output content no longer belongs under an "OpenAI Beta APIs" header at all. The recommended path (client.chat.completions.parse(...)) is stable, not beta.
Step-by-step proof.
1. Open the rendered page `/integrations/model-providers/openai-py` after this PR.
2. Scroll to the `### OpenAI Beta APIs` heading and read the intro paragraph: it tells the user that only stable APIs are fully supported and that beta APIs require manual `@observe()` wrapping.
3. Continue into the `#### Structured Output` subsection just below: bullet two explicitly says Langfuse "also wraps the beta path on these older versions, so attributes like `name` and `metadata` work there too."
4. The two statements are mutually exclusive: either the legacy beta parse path is automatically instrumented (so no `@observe()` is needed) or it isn't. The PR makes the new claim true but leaves the contradicting intro in place.
How to fix. Either (a) update the intro paragraph under ### OpenAI Beta APIs to clarify that some beta endpoints are now instrumented (and call out Structured Output as an example), or (b) promote #### Structured Output to its own top-level section (e.g., under "Advanced usage") since parse is no longer a beta API on supported SDK versions. Option (b) probably ages better given that parse is the recommended stable path.
Summary
- Recommend `client.chat.completions.parse(...)` for `openai>=1.92.0` and scope the beta caveat to older SDK versions.
- The Langfuse SDK wraps both the stable path (`openai.resources.chat.completions.Completions.parse`) and the legacy beta path, so Langfuse attributes (`name`, `metadata`, `langfuse_session_id`, …) work on either.
- Keep the `response_format` + `type_to_response_format_param` example as a fallback for users who cannot upgrade `openai`.

Why
Reported by David Traina (Ramp) in Pylon #1339. OpenAI moved `parse`/`stream` out of beta in openai-python v1.92.0 ~10 months ago, but our docs still warned against the beta API and pushed users to `response_format` + `chat.completions.create`. The SDK has already supported the stable path for a while; only the docs were stale.

Test plan
- Run `pnpm dev` and verify the Structured Output section on `/integrations/model-providers/openai-py` renders correctly.
- Verify the `/guides/cookbook/integration_openai_structured_output` page renders the updated note and `parse` example.

🤖 Generated with Claude Code
Disclaimer: Experimental PR review
Greptile Summary
This PR corrects stale documentation that incorrectly told users to avoid `client.chat.completions.parse` in favour of `response_format` + `create`. It updates both the integration page and the cookbook to recommend the stable `parse` API (available since `openai-python` v1.92.0) and preserves a `type_to_response_format_param` fallback for users who cannot upgrade.

Confidence Score: 4/5
Safe to merge — documentation-only changes with accurate technical content and only minor style observations.
All three files are docs/notebook updates with no runtime code. The guidance is factually correct. The only findings are P2: a private-API import risk in the legacy fallback (pre-existing pattern, not introduced here) and mildly ambiguous phrasing in one note.
No files require special attention; the private import in `openai-py.mdx` is worth a comment but is not blocking.

Important Files Changed
- `content/integrations/model-providers/openai-py.mdx`: recommends the stable `parse` API (`openai>=1.92.0`) and retains `type_to_response_format_param` as a legacy fallback; the fallback imports from a private internal module (`openai.lib._parsing._completions`).
- `content/guides/cookbook/integration_openai_structured_output.md`: notes both `parse` paths are instrumented; `Alternative` section switched from `client.beta.chat.completions.parse` to stable `client.chat.completions.parse` with a Langfuse `name` attribute; phrasing in the note is slightly ambiguous.
- `content/guides/cookbook/integration_openai_structured_output.ipynb`: same update to the stable `parse` path, `name` attribute added, old output cells preserved.

Flowchart
```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[User wants Structured Output\nwith Langfuse tracing] --> B{openai SDK version?}
    B -- ">=1.92.0\n(recommended)" --> C["client.chat.completions.parse(...)\nresponse_format=PydanticModel\nname='...' metadata={...}"]
    B -- "<1.92.0\n(legacy)" --> D{Pydantic model needed?}
    D -- "Yes" --> E["client.beta.chat.completions.parse(...)\nresponse_format=PydanticModel\n(re-routed to stable on >=1.92.0)"]
    D -- "No / can't upgrade" --> F["type_to_response_format_param(Model)\n→ client.chat.completions.create(...)\nresponse_format=schema_dict"]
    C --> G[Langfuse traces both name\nand metadata attributes ✓]
    E --> G
    F --> G
```
Reviews (1): Last reviewed commit: "docs: clarify OpenAI Python parse vs res..."