@@ -594,36 +594,36 @@ schema.
594594` resolve_pyproject_path ` walks up from a model's module file to find ` pyproject.toml ` .
595595` load_examples_from_toml ` reads the ` [examples.ModelName] ` TOML section.
596596
597- Validation requires three preprocessing steps that handle TOML's limitations and
598- flat-schema conventions.
599-
600- TOML has no null literal, so examples use the string ` "null" ` as a stand-in. ` _denull `
601- replaces these recursively, walking nested dicts and lists.
597+ Validation requires two preprocessing steps that handle flat-schema conventions.
602598
603599Literal fields (like ` theme="buildings" ` ) are omitted from examples since they carry
604600constant values. ` _inject_literal_fields ` adds them back before validation by scanning
605601` model_fields ` for single-value ` Literal ` annotations via ` single_literal_value ` .
606602
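The literal-injection step can be sketched as follows. This is a minimal illustration, assuming Pydantic v2; `inject_literal_fields` and the `Building` model are hypothetical stand-ins, and the real `_inject_literal_fields` and `single_literal_value` may differ in signature and edge-case handling.

```python
from typing import Any, Literal, get_args, get_origin

from pydantic import BaseModel


def single_literal_value(annotation: Any) -> Any:
    """Return the sole value of a one-value Literal annotation, else None."""
    if get_origin(annotation) is Literal:
        args = get_args(annotation)
        if len(args) == 1:
            return args[0]
    return None


def inject_literal_fields(model: type[BaseModel], example: dict[str, Any]) -> dict[str, Any]:
    """Copy the example, adding constant Literal fields that were omitted."""
    result = dict(example)
    for name, info in model.model_fields.items():
        value = single_literal_value(info.annotation)
        if value is not None and name not in result:
            result[name] = value
    return result


class Building(BaseModel):  # hypothetical model, for illustration only
    theme: Literal["buildings"]
    height: float


# The example omits "theme"; injection restores it so validation succeeds.
filled = inject_literal_fields(Building, {"height": 12.5})
building = Building.model_validate(filled)
```

Working on a copy keeps the loaded TOML data untouched, which makes it easier to report the original example in a warning if validation fails later.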
607- Discriminated union examples from flat parquet schemas include null fields from
603+ Discriminated union examples from flat Parquet schemas include null fields from
608604non-selected variant arms. ` _strip_null_unknown_fields ` removes null-valued fields not
609605in the common base's field set, so the selected arm's validator accepts the data without
610606choking on fields that belong to sibling variants.
611607
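A rough sketch of the null-stripping rule described above, under the assumption that the common base's field names are available as a set. The function name, its signature, and the sample row are illustrative, not the actual `_strip_null_unknown_fields` implementation.

```python
from typing import Any


def strip_null_unknown_fields(
    example: dict[str, Any], base_fields: set[str]
) -> dict[str, Any]:
    """Drop keys whose value is None unless the common base declares them."""
    return {
        key: value
        for key, value in example.items()
        if value is not None or key in base_fields
    }


# Hypothetical flat row: "lanes" belongs to a sibling arm and is null here,
# while "name" is null but declared on the common base, so it is kept.
row = {"id": "s1", "subtype": "rail", "gauge": 1435, "lanes": None, "name": None}
cleaned = strip_null_unknown_fields(row, base_fields={"id", "subtype", "name"})
```

Non-null fields are never removed, so a value that genuinely belongs to the wrong arm still reaches the validator and is reported there.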
612- ` collect_dict_paths ` walks the ` FieldSpec ` tree to identify dict-typed fields (like
613- ` tags: dict[str, str] ` ), returning their dot-paths as a ` frozenset ` . Schema-notation
614- paths use empty brackets (` items[].tags ` ) while runtime paths carry indices
615- (` items[0].tags ` ); ` _normalize_path ` strips indices before membership checks.
608+ `validate_example` returns a Pydantic model instance. `flatten_model_instance` walks the
609+ instance recursively, using `isinstance(value, BaseModel)` to distinguish model fields
610+ (recursed into with dot notation) from dict fields (kept as leaf values). Lists of models
611+ use bracket notation (`sources[0].dataset`); nested lists use double-index notation
612+ (`hierarchies[0][1].name`). The model instance itself encodes the type structure,
613+ so no external schema information is needed.
614+
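The recursive walk might look roughly like this. It is a sketch assuming Pydantic v2; `flatten`, `_contains_model`, and the `Place` model are hypothetical names, and treating lists without models as leaf values is an assumption the text does not confirm.

```python
from typing import Any

from pydantic import BaseModel


def _contains_model(value: Any) -> bool:
    """True if value is a model or a (nested) list holding at least one model."""
    if isinstance(value, BaseModel):
        return True
    if isinstance(value, list):
        return any(_contains_model(item) for item in value)
    return False


def flatten(value: Any, prefix: str = "") -> list[tuple[str, Any]]:
    """Flatten a model instance into (path, leaf value) rows."""
    if isinstance(value, BaseModel):
        rows: list[tuple[str, Any]] = []
        for name in type(value).model_fields:
            child = f"{prefix}.{name}" if prefix else name
            rows.extend(flatten(getattr(value, name), child))
        return rows
    if isinstance(value, list) and _contains_model(value):
        rows = []
        for index, item in enumerate(value):
            rows.extend(flatten(item, f"{prefix}[{index}]"))
        return rows
    # Dicts, scalars, and model-free lists are leaves (assumption for lists).
    return [(prefix, value)]


class Source(BaseModel):  # hypothetical models, for illustration only
    dataset: str


class Level(BaseModel):
    name: str


class Place(BaseModel):
    tags: dict[str, str]
    sources: list[Source]
    hierarchies: list[list[Level]]


place = Place(
    tags={"color": "red"},
    sources=[Source(dataset="osm")],
    hierarchies=[[Level(name="country"), Level(name="region")]],
)
rows = flatten(place)
```

Because `tags` is a plain `dict` rather than a `BaseModel`, the `isinstance` check alone keeps it whole, with no separate list of dict-typed paths.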
615+ For discriminated unions, the concrete variant instance lacks fields from other arms.
616+ ` augment_missing_fields ` compares base field names against the union's merged field list
617+ and appends ` (name, None) ` for absent fields, matching the flat Parquet schema where all
618+ variant columns exist.
616619
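One way to picture the augmentation step, assuming flattened rows are `(path, value)` tuples and the union's merged field list is a list of top-level names. The helper `base_name` and the sample data are hypothetical; the real `augment_missing_fields` may take different arguments.

```python
import re
from typing import Any


def base_name(path: str) -> str:
    """Top-level field name of a flattened path, e.g. 'a[0].b' -> 'a'."""
    return re.split(r"[.\[]", path, maxsplit=1)[0]


def augment_missing_fields(
    rows: list[tuple[str, Any]], union_fields: list[str]
) -> list[tuple[str, Any]]:
    """Append (name, None) for union fields absent from the flattened rows."""
    present = {base_name(path) for path, _ in rows}
    missing = [(name, None) for name in union_fields if name not in present]
    return rows + missing


# Hypothetical rail-arm rows; "road_surface" exists only on a sibling arm.
flat_rows = [("id", "s1"), ("subtype", "rail"), ("rail_flags[0].values", ["is_bridge"])]
augmented = augment_missing_fields(
    flat_rows, ["id", "subtype", "road_surface", "rail_flags"]
)
```

Comparing on base names means a field that was expanded into several sub-rows still counts as present.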
617- ` flatten_example ` converts nested dicts to dot-notation. Nested dicts become
618- ` parent.child ` , lists of dicts become ` parent[0].child ` . Dicts at paths in ` dict_paths `
619- are kept as leaf values -- a ` tags ` field typed as ` dict[str, str] ` renders as a whole
620- map rather than being split into ` tags.color ` , ` tags.size ` . ` order_example_rows ` sorts by
621- field position in the documentation's field order using a stable sort, so sub-fields
622- maintain their original relative order.
620+ ` order_example_rows ` sorts by field position in the documentation's field order using a
621+ stable sort, so sub-fields maintain their original relative order.
623622
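The ordering step relies on Python's `sorted` being stable. A minimal sketch, assuming rows are `(path, value)` tuples keyed by their top-level field name; placing unknown fields last is an assumption made here, not something the text specifies.

```python
import re
from typing import Any


def base_name(path: str) -> str:
    """Top-level field name of a flattened path, e.g. 'a[0].b' -> 'a'."""
    return re.split(r"[.\[]", path, maxsplit=1)[0]


def order_example_rows(
    rows: list[tuple[str, Any]], field_order: list[str]
) -> list[tuple[str, Any]]:
    """Sort rows by documented field position; ties keep their input order."""
    position = {name: index for index, name in enumerate(field_order)}
    return sorted(
        rows,
        key=lambda row: position.get(base_name(row[0]), len(position)),
    )


unordered = [
    ("sources[0].dataset", "osm"),
    ("sources[0].record_id", "1"),
    ("extra", 1),
    ("id", "a"),
]
ordered = order_example_rows(unordered, ["id", "sources"])
```

Both `sources[0]` rows share the same sort key, so the stable sort leaves `dataset` ahead of `record_id`, exactly as they were flattened.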
624623` load_examples ` orchestrates the full flow: find the pyproject.toml, load the TOML
625- section, validate each example, flatten, and order. Invalid examples log a warning and
626- skip rather than failing the pipeline.
624+ section, validate each example, flatten via `flatten_model_instance`, augment missing
625+ fields, and order. Invalid examples are logged as warnings and skipped rather than
626+ failing the pipeline.
627627
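The warn-and-skip behaviour for invalid examples could be sketched like this, assuming Pydantic v2 and the standard `logging` module. `validate_examples` and the `Segment` model below are hypothetical; the actual `load_examples` also handles path resolution, flattening, augmentation, and ordering.

```python
import logging
from typing import Any

from pydantic import BaseModel, ValidationError

logger = logging.getLogger(__name__)


def validate_examples(
    model: type[BaseModel], examples: list[dict[str, Any]]
) -> list[BaseModel]:
    """Validate each example, logging and skipping any that fail."""
    valid: list[BaseModel] = []
    for index, example in enumerate(examples):
        try:
            valid.append(model.model_validate(example))
        except ValidationError as exc:
            logger.warning(
                "Skipping example %d for %s: %s", index, model.__name__, exc
            )
    return valid


class Segment(BaseModel):  # hypothetical model, for illustration only
    id: str
    length: float


# The second example is missing "length" and is skipped with a warning.
instances = validate_examples(
    Segment, [{"id": "a", "length": 1.0}, {"id": "b"}]
)
```

Catching only `ValidationError` keeps unrelated failures, such as a missing file, visible instead of silently dropping examples.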
628628## 16. Orchestration and CLI
629629
@@ -739,9 +739,9 @@ sources appear on the source NewType's page instead.
739739
740740The example loader finds ` pyproject.toml ` in the transportation theme package, reads
741741` [examples.Segment] ` , validates each example against the union alias (injecting literal
742- fields, stripping null fields from non-selected arms), computes ` dict_paths ` from
743- ` spec.fields ` to identify dict-typed fields, flattens to dot-notation (keeping dict-typed
744- fields as leaf values), and orders by field position.
742+ fields, stripping null fields from non-selected arms), flattens the model instance to
743+ dot-notation via ` flatten_model_instance ` , augments missing cross-arm fields, and orders
744+ by field position.
745745
746746The Jinja2 template assembles the field table, optional constraints section, examples,
747747and "Used By" partial into markdown.