docs: document multi-modal datasets #3119
@claude review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7f1207db0b
Prior review comments are all resolved by recent commits and no new bugs were found — LGTM.
Extended reasoning...
Overview
Docs-only PR documenting multi-modal dataset items: a new section in datasets.mdx, a new multi-modal experiments subsection in experiments-via-sdk.mdx, new DatasetItemMediaReference reference rows in data-model.mdx, a dated changelog entry, and cross-links from the multi-modality and self-hosted blob storage pages. The embedded components-mdx/datasets-create-dataset-item.mdx snippet was also migrated from langfuse.api.datasetItems.create to langfuse.dataset.createItem.
Security risks
None — content-only .mdx changes with no runtime code, secrets, or auth-adjacent surfaces touched. The self-hosted blob storage page edits are wording-only (renaming a section header, broadening one sentence to mention datasets); no env-var defaults or example credentials changed.
Level of scrutiny
Low. Docs PRs that ship through the Next.js docs site have no runtime blast radius beyond the rendered page, and an active human author (Tobias Wochinger) is iterating on it. My prior pass already exercised the code samples against the SDK type definitions; the follow-up commits picked up each correction.
Other factors
All three of my previous inline comments are resolved: the JS/TS task signature is now async (item) => in experiments-via-sdk.mdx, the pre-existing same-bug instances in datasets.mdx and the versioned-experiments changelog were fixed in the same pass, both stale JS/TS langfuse.api.datasetItems.create call sites in datasets.mdx were migrated, and the expected_output enum value was clarified inline (the API value really is snake_case; the new wording "expected_output (for expectedOutput)" disambiguates it from the JS field name). Greptile's open suggestion about marking media as Required: No instead of Yes (nullable) is a wording preference, and Tobias explicitly pushed back on the assert isinstance suggestion for reader pedagogy — both are within author discretion and not blockers. The bug-hunting system found no new issues on the current commit.
Summary
LangfuseMediaReference in Python and JS/TS.
Linear
Major Decisions
Review Focus
Greptile Summary
This PR adds documentation for multi-modal dataset items, covering the UI upload flow, Python/JS/TS SDK creation examples, SDK version callouts, and an end-to-end SDK experiment guide using LangfuseMediaReference. It also introduces a new changelog entry and cross-links the blobstorage, multi-modality, and data-model reference pages.
- datasets.mdx documents item creation via UI and SDK, with version constraints (Python ≥ 4.10.0, @langfuse/client ≥ 5.5.0) and a UI-experiment limitation callout.
- The experiments-via-sdk.mdx subsection shows the full experiment flow: fetch the dataset with resolve_media_references=True, unpack LangfuseMediaReference, and call the model provider with bytes/base64/data-URI.
- data-model.mdx gains a mediaReferences field on DatasetItem and a new DatasetItemMediaReference object table; the media sub-field is marked Required: Yes but can be null, which is worth revisiting for clarity.
Confidence Score: 4/5
Safe to merge; all changes are documentation-only with no runtime code paths.
The changes are well-structured and internally consistent. The data-model table marks the media sub-field as Required while describing it as nullable, which could mislead SDK consumers. Pre-existing JS/TS examples using the old langfuse.api.datasetItems.create pattern were not updated to match the new langfuse.dataset.createItem shape introduced elsewhere on the same page. Both are minor doc-clarity issues with no functional impact.
content/docs/evaluation/experiments/data-model.mdx (nullable-but-Required field) and content/docs/evaluation/experiments/datasets.mdx (API shape inconsistency in JS/TS examples).
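To make the nullable-but-Required point concrete, here is a minimal sketch of the distinction between a field that must always be present and a field whose value may be null. The `MediaInfo` and `MediaReference` classes and the `describe` helper are hypothetical stand-ins written for this comment, not the SDK's actual types; only the `media` field name and the `url` / `urlExpiry` attributes come from the PR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MediaInfo:
    """Hypothetical stand-in for a resolved media payload."""
    url: str
    url_expiry: str  # assumed to be a string here; the real type is not shown in the PR


@dataclass
class MediaReference:
    """Hypothetical model of a reference entry with a required, nullable field."""
    # No default value: callers must always pass `media` (required),
    # but None is an accepted value (nullable).
    media: Optional[MediaInfo]


def describe(ref: MediaReference) -> str:
    """Consumers have to handle the null case even though the key is always present."""
    if ref.media is None:
        return "media is null"
    return f"media resolved: {ref.media.url}"
```

Under this reading, "Required: Yes (nullable)" says the key is always there while the value may be empty, whereas "Required: No" would suggest the key itself can be missing. Which of the two the table should say depends on the actual API response shape.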
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A([User / CI]) -->|1 wrap file in LangfuseMedia| B[SDK: create_dataset_item / createItem]
    B -->|2 upload bytes via presigned URL| C[(S3 / Blob Storage)]
    B -->|3 store media reference token| D[(Langfuse DB: DatasetItem)]
    D -->|4 UI reads token| E[Langfuse UI: renders preview]
    A2([Experiment Runner]) -->|5 get_dataset resolve_media_references=True| D
    D -->|6 generate signed download URL| C
    C -->|7 return DatasetItemMediaReference with signed url + urlExpiry| A2
    A2 -->|8 fetch_bytes / fetch_base64 / fetch_data_uri| C
    A2 -->|9 pass media to model provider| F[LLM / Vision Model]
    F -->|10 output| A2
    A2 -->|11 scores + traces| D
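Steps 8 and 9 of the flowchart hand the media to the model provider as raw bytes, base64, or a data URI. The sketch below shows how those three representations relate, using only the Python standard library. It does not call the Langfuse SDK, and `to_base64` and `to_data_uri` are hypothetical helper names chosen for illustration; the SDK's own `fetch_*` methods may behave differently.

```python
import base64


def to_base64(data: bytes) -> str:
    """Encode raw bytes as an ASCII base64 string."""
    return base64.b64encode(data).decode("ascii")


def to_data_uri(data: bytes, content_type: str) -> str:
    """Wrap base64-encoded bytes in a data URI with the given content type."""
    return f"data:{content_type};base64,{to_base64(data)}"


if __name__ == "__main__":
    # Placeholder payload: just the 8-byte PNG signature, not a real image.
    image_bytes = b"\x89PNG\r\n\x1a\n"
    print(to_base64(image_bytes))
    print(to_data_uri(image_bytes, "image/png"))
```

Which form to pass depends on the provider: some accept a base64 string plus a separate media type, others expect a complete data URI in an image-URL field.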
Reviews (1): Last reviewed commit: "docs(evaluation): document multi-modal d..."