fix(llmobs): handle None metadata in experiment dataset records #17729
Open
asaxena2019 wants to merge 1 commit into main
Dataset records whose `metadata` field is `None` (a common shape for records
serialized from JSON with an explicit `null`) caused
`TypeError: 'NoneType' object is not a mapping` in the LLMObs experiment
pipeline. `dict.get("metadata", {})` returns the stored `None` rather than
the default `{}` when the key is present, so the subsequent
`{**record_metadata, "experiment_config": self._config}` spread crashed.
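The failure can be reproduced in isolation. The following is a minimal sketch, assuming a plain dict stands in for a dataset record and `config` stands in for `self._config`; both are hypothetical stand-ins, not the actual `_experiment.py` code:

```python
# Hypothetical stand-ins for a dataset record and the experiment config.
record = {"input_data": {"q": "hi"}, "metadata": None}
config = {"model": "example"}

# The key is present, so the default is ignored and the stored None comes back.
record_metadata = record.get("metadata", {})
assert record_metadata is None


def spread(metadata):
    # Same shape as the crashing expression: unpack metadata into a new dict.
    return {**metadata, "experiment_config": config}


try:
    spread(record_metadata)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# `or {}` replaces None (and any other falsy value) with an empty dict.
safe_metadata = record.get("metadata") or {}
combined = spread(safe_metadata)
```

Here `error_message` captures the `TypeError` raised by the spread, while `combined` is built without error once the read uses `or {}`.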
Fix every `record.get("metadata", {})` read in `_experiment.py` to use
`record.get("metadata") or {}`, which coerces both missing and
explicitly-None values to an empty dict. Four sites were affected:
- Dataset.as_dataframe() flattening (previously guarded by `isinstance`, now
consistent with the others)
- _prepare_summary_evaluator_data (the main crash site reported by multiple
ddeval projects)
- per-record task argument plumbing
- per-record evaluator context building
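The shared read pattern can be sketched as a small helper. `read_metadata` is a hypothetical name used only for illustration; the change itself inlines the expression at each site rather than adding a helper:

```python
from typing import Any, Dict


def read_metadata(record: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper mirroring the `record.get("metadata") or {}` pattern."""
    return record.get("metadata") or {}


# Illustrative records covering the three input shapes.
records = [
    {"input_data": "a"},                            # key missing
    {"input_data": "b", "metadata": None},          # explicit None
    {"input_data": "c", "metadata": {"tag": "x"}},  # populated
]
results = [read_metadata(r) for r in records]
```

One trade-off worth noting: `or {}` replaces every falsy value (for example `0` or `""`), and a truthy non-dict value such as the string `"null"` still passes through unchanged, so this pattern is slightly different from an `isinstance(metadata, dict)` guard.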
Adds a regression test (`test_experiment_run_summary_evaluators_handles_none_metadata`)
that constructs an in-memory Dataset with `metadata=None`, runs the full
task → evaluator → summary-evaluator flow, and asserts no crash. The test
fails on main at `_experiment.py:2004` without this fix and passes with it.
Impact: unblocks ddeval projects across dd-source (multi-claim-example,
single-claim-example, synthetics-critical-endpoint-selector, and others)
that currently fail with `'NoneType' object is not a mapping` when pulling
datasets via LLMObs' `pull_dataset` since metadata is JSON-decoded to Python
None at that layer.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
juanjux approved these changes on Apr 28, 2026
Summary
- Handle dataset records whose `metadata` field is explicitly `None` in LLMObs experiments (previously crashed with `TypeError: 'NoneType' object is not a mapping`)
- `dict.get("metadata", {})` returns the stored `None` when the key is present, so the subsequent `{**record_metadata, ...}` spread crashes
- Use `record.get("metadata") or {}` at every read site in `ddtrace/llmobs/_experiment.py` (four sites)
- Add a regression test that constructs an in-memory Dataset with `metadata=None` and runs the full task → evaluator → summary-evaluator flow

Context
Several ddeval projects across dd-source currently fail during `ddeval run` with `TypeError: 'NoneType' object is not a mapping`.

The failure reproduces against a bare `ddeval run` (no custom tooling) on `multi-claim-example`, `single-claim-example`, `synthetics-critical-endpoint-selector`, and others. The full stack trace lands at `ddtrace/llmobs/_experiment.py:2004`.

This cropped up after ddeval `0.0.109272488` (released 2026-04-23), which JSON-decodes pulled dataset records in place. Before that fix, the error was `'str' object is not a mapping` (metadata was the string `"null"`); after it, the decoded value is Python `None`, which `dict.get("metadata", {})` returns unchanged, and `**None` then crashes.

The upstream ddeval fix was necessary but exposed a latent None-safety bug on this side of the boundary.
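The before/after error messages can be illustrated with the standard `json` module. This is a sketch only: the payload below is a hypothetical example, not the actual wire format returned by `pull_dataset`, and `spread_error` is a helper invented for the demonstration.

```python
import json


def spread_error(metadata):
    """Return the TypeError message raised by spreading `metadata`, or None."""
    try:
        {**metadata}
        return None
    except TypeError as exc:
        return str(exc)


# Hypothetical payload with an explicit JSON null for metadata.
raw = '{"input_data": "x", "metadata": null}'
decoded_record = json.loads(raw)

# Before the ddeval change: metadata stayed as the undecoded string "null".
before = spread_error("null")

# After the ddeval change: JSON null decodes to None, and .get() returns it as-is.
after = spread_error(decoded_record.get("metadata", {}))

# With this fix: None is coerced to an empty dict before the spread.
fixed = spread_error(decoded_record.get("metadata") or {})
```

In this sketch `before` and `after` hold the two error messages quoted above, and `fixed` is `None` because the spread succeeds.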
Change details
Every `record.get("metadata", {})` in `_experiment.py` becomes `record.get("metadata") or {}`:
- `Dataset.as_dataframe()`: previously guarded by `isinstance(metadata, dict)`, now consistent
- `_prepare_summary_evaluator_data`: the crash site
- per-record task argument plumbing
- per-record evaluator context building (`combined_metadata` spread)

Test plan
- Adds `test_experiment_run_summary_evaluators_handles_none_metadata` in `tests/llmobs/test_experiments.py`
- Constructs an in-memory `Dataset` with `metadata=None` (no backend fixture)
- Runs `_run_task` → `_run_evaluators` → `_run_summary_evaluators` with `raise_errors=True`
- Fails on main at `_experiment.py:2004` with the exact production error, and passes with the fix applied
- `metadata=None`, missing metadata, and populated metadata all coerce to the same shape downstream

Release note
Added at `releasenotes/notes/fix-llmobs-experiment-none-metadata-*.yaml`.