Commit 09a7f5c

Merge pull request #864 from Pipelex/release/v0.26.2
Release v0.26.2
2 parents 991e2dc + 974b3b3 commit 09a7f5c

7 files changed

Lines changed: 98 additions & 29 deletions

.badges/tests.json

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 {
   "schemaVersion": 1,
   "label": "tests",
-  "message": "4900",
+  "message": "4901",
   "color": "blue",
   "cacheSeconds": 300
 }

CHANGELOG.md

Lines changed: 10 additions & 6 deletions
@@ -1,5 +1,11 @@
 # Changelog

+## [v0.26.2] - 2026-05-06
+
+### Fixed
+
+- **`choices` fields no longer fail validation with `'EnumName.MEMBER_NAME'` errors.** A concept declared with `choices = [...]` produces a `Literal[...]` field on the dynamic Pydantic class. That schema is round-tripped through `SchemaToModelFactory.make_from_json_schema` (used to rebuild dynamic models on Temporal workers and to feed structured-output schemas to LLM providers). Previously the round-trip silently re-emitted the field as a plain Python `Enum` class — e.g. `Literal["Strong Match", "Good Match", "Partial Match", "Poor Match"]` became `class Recommendation(Enum): Poor_Match = "Poor Match"; ...`. LLMs filling that schema then returned the enum's Python repr (`"Recommendation.Poor_Match"`) instead of the literal string (`"Poor Match"`), which failed Pydantic validation against the original choice set with errors like `Invalid choice errors: 'recommendation': got 'Recommendation.Poor_Match', expected one of 'Strong Match', 'Good Match', 'Partial Match' or 'Poor Match'`. `_generate_source_from_schema` now passes `enum_field_as_literal=LiteralType.All` to `datamodel-code-generator`, so `enum: [strings]` schema nodes round-trip as `Literal[...]` instead of being regenerated as `Enum` classes. `_exec_source_to_types` now also exposes `Literal` in the rebuild namespace so `model_rebuild` resolves the deferred annotations.
+
 ## [v0.26.1] - 2026-05-05

 ### Changed
@@ -694,10 +700,9 @@
 1. Pipelex Gateway telemetry for service monitoring (never collects prompts/completions/business data)
 2. Custom telemetry to user-configured backends
 3. Config updated accordingly (`telemetry.toml`):
+- Renamed `[posthog]` to `[custom_posthog]` to distinguish user's PostHog from Pipelex Gateway telemetry
+- Added new `[custom_portkey]` section with `force_debug_enabled` and `force_tracing_enabled` settings

-
-- Renamed `[posthog]` to `[custom_posthog]` to distinguish user's PostHog from Pipelex Gateway telemetry
-- Added new `[custom_portkey]` section with `force_debug_enabled` and `force_tracing_enabled` settings
 - **Main Configuration Overrides Updated** (`.pipelex/pipelex.toml`):
 - `pipelex_override.toml` (final override) renamed from `pipelex_super.toml` to `pipelex_override.toml` and moved from repo root to `.pipelex/` directory
 - `telemetry_override.toml` (personal telemetry settings)
@@ -1013,10 +1018,9 @@
 1. Pipelex Gateway telemetry for service monitoring (never collects prompts/completions/business data)
 2. Custom telemetry to user-configured backends
 3. Config updated accordingly (`telemetry.toml`):
+- Renamed `[posthog]` to `[custom_posthog]` to distinguish user's PostHog from Pipelex Gateway telemetry
+- Added new `[custom_portkey]` section with `force_debug_enabled` and `force_tracing_enabled` settings

-
-- Renamed `[posthog]` to `[custom_posthog]` to distinguish user's PostHog from Pipelex Gateway telemetry
-- Added new `[custom_portkey]` section with `force_debug_enabled` and `force_tracing_enabled` settings
 - **Main Configuration Overrides Updated** (`.pipelex/pipelex.toml`):
 - `pipelex_override.toml` (final override) renamed from `pipelex_super.toml` to `pipelex_override.toml` and moved from repo root to `.pipelex/` directory
 - `telemetry_override.toml` (personal telemetry settings)
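
The failure mode described in the v0.26.2 changelog entry can be reproduced with the standard library alone. The sketch below is an illustration, not Pipelex code: `RecommendationChoice`, `Recommendation`, and `is_valid_choice` are hypothetical names, and the membership check is a simplified stand-in for Pydantic's `Literal` validation. It shows that the string form of a plain `Enum` member is `ClassName.MEMBER`, which is not one of the declared choices, while the member's `.value` is.

```python
from enum import Enum
from typing import Literal, get_args

# Hypothetical stand-in for the original `choices = [...]` declaration.
RecommendationChoice = Literal["Strong Match", "Good Match", "Partial Match", "Poor Match"]


# Hypothetical stand-in for the Enum class the old round-trip generated.
class Recommendation(Enum):
    Strong_Match = "Strong Match"
    Good_Match = "Good Match"
    Partial_Match = "Partial Match"
    Poor_Match = "Poor Match"


def is_valid_choice(value: str) -> bool:
    """Simplified stand-in for validating a string against the Literal choice set."""
    return value in get_args(RecommendationChoice)


member_name_form = str(Recommendation.Poor_Match)  # "Recommendation.Poor_Match"
value_form = Recommendation.Poor_Match.value  # "Poor Match"

print(is_valid_choice(member_name_form))  # False: not one of the declared choices
print(is_valid_choice(value_form))  # True
```

Keeping the field as `Literal[...]` removes the member-name form from the picture entirely, since there is no class whose members could be stringified.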

pipelex/cogt/content_generation/schema_to_model_factory.py

Lines changed: 16 additions & 6 deletions
@@ -36,9 +36,9 @@
 from collections import OrderedDict
 from enum import Enum
 from pathlib import Path
-from typing import Any, ClassVar, cast
+from typing import Any, ClassVar, Literal, cast

-from pydantic import BaseModel
+from pydantic import BaseModel, RootModel

 from pipelex.cogt.content_generation.exceptions import UnsafeSchemaError

@@ -160,7 +160,7 @@ def _reject_unsafe_schema_extensions(cls, schema: dict[str, Any]) -> None:
     @classmethod
     def _generate_source_from_schema(cls, schema: dict[str, Any]) -> str:
         """Generate Python source code from a JSON schema using datamodel-code-generator."""
-        from datamodel_code_generator import InputFileType, generate  # noqa: PLC0415
+        from datamodel_code_generator import InputFileType, LiteralType, generate  # noqa: PLC0415
         from datamodel_code_generator.enums import DataModelType  # noqa: PLC0415

         cls._reject_unsafe_schema_extensions(schema)
@@ -178,11 +178,19 @@ def _generate_source_from_schema(cls, schema: dict[str, Any]) -> str:
             # be replaced by ruff, but ruff isn't a runtime dep of pipelex or of
             # datamodel-code-generator's core install. An empty list silences the
             # warning without forcing a new runtime dependency on consumers.
+            # `enum_field_as_literal=LiteralType.All` keeps `enum: [strings]` schema
+            # nodes as Python `Literal[...]` annotations instead of regenerating a
+            # named `Enum` class. Without it, a `Literal[...]` field round-trips into
+            # a plain `Enum` (e.g. `class Recommendation(Enum): Poor_Match = "Poor Match"`),
+            # and an LLM filling that schema returns the Python repr
+            # `"Recommendation.Poor_Match"` instead of the value `"Poor Match"`,
+            # which then fails Pydantic validation against the original choice set.
             generate(
                 input_=schema_str,
                 input_file_type=InputFileType.JsonSchema,
                 output=output_path,
                 output_model_type=DataModelType.PydanticV2BaseModel,
+                enum_field_as_literal=LiteralType.All,
                 formatters=[],
             )
             return output_path.read_text(encoding="utf-8")
@@ -252,16 +260,18 @@ def _exec_source_to_types(cls, source_code: str) -> dict[str, type[Any]]:
             and not name.startswith("_")
             and (issubclass(obj, BaseModel) or issubclass(obj, Enum))
             and obj is not BaseModel
+            and obj is not RootModel
             and obj is not Enum
         }

         # datamodel-code-generator uses `from __future__ import annotations` which turns
         # type annotations into strings. Rebuild every BaseModel so forward refs (including
-        # references to generated Enum classes for choices fields) resolve against the
-        # full type namespace.
+        # references to generated Enum classes for choices fields, and Literal annotations
+        # produced by `enum_field_as_literal=All`) resolve against the full type namespace.
+        rebuild_namespace: dict[str, Any] = {**all_user_types, "Literal": Literal}
         for candidate in all_user_types.values():
             if issubclass(candidate, BaseModel):
-                candidate.model_rebuild(_types_namespace=all_user_types)
+                candidate.model_rebuild(_types_namespace=rebuild_namespace)

         return all_user_types
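
Why `Literal` has to be present in the rebuild namespace can be seen in a standard-library analogue. This is a sketch under stated assumptions, not the Pipelex implementation: it uses `typing.get_type_hints` as a stand-in for Pydantic's `model_rebuild`, and `Verdict` and `resolve` are hypothetical names. A string annotation can only be resolved if every name it mentions exists in the namespace it is evaluated against.

```python
from typing import Any, Literal, get_args, get_type_hints


class Verdict:
    # A string annotation, as produced by `from __future__ import annotations`.
    recommendation: "Literal['Strong Match', 'Poor Match']"


def resolve(namespace: dict[str, Any]) -> dict[str, Any]:
    """Resolve Verdict's string annotations against an explicit namespace only."""
    return get_type_hints(Verdict, globalns={}, localns=namespace)


try:
    resolve({})  # `Literal` is missing from the namespace
    missing_name_error = None
except NameError as exc:
    missing_name_error = exc

# Adding `Literal` to the namespace lets the annotation resolve.
hints = resolve({"Literal": Literal})
```

The commit applies the same idea by building `rebuild_namespace` from the user types plus `Literal`, rather than widening the filter that selects `BaseModel` and `Enum` subclasses.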

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 [project]
 name = "pipelex"
-version = "0.26.1"
+version = "0.26.2"
 description = "Execute composable AI methods declared in the MTHDS open standard"
 authors = [{ name = "Evotis S.A.S.", email = "oss@pipelex.com" }]
 maintainers = [{ name = "Pipelex staff", email = "oss@pipelex.com" }]

tests/integration/pipelex/temporal/data_converter/test_data_conv_enum_roundtrip.py

Lines changed: 10 additions & 13 deletions
@@ -1,4 +1,3 @@
-from enum import Enum
 from typing import cast

 import pytest
@@ -26,10 +25,10 @@ def test_dynamic_class_with_enum_field_round_trips(
         self,
         payload_converter: BaseModelPayloadConverter,
     ):
-        """A class with an Enum field generated via SchemaToModelFactory must survive a
-        full payload round-trip. Exercises the receiver-side exec path that registers
-        Enum subclasses in the per-call scoped ClassRegistry — without it the
-        deserializer cannot resolve the dynamic enum class and the round-trip fails.
+        """A class with an enum-shaped field generated via SchemaToModelFactory must
+        survive a full payload round-trip. Exercises the receiver-side exec path that
+        rebuilds the dynamic class from `__kajson_class_source__` — without it the
+        deserializer cannot resolve the dynamic class and the round-trip fails.
         """
         schema = Pet.model_json_schema()
         dynamic_pet_cls = SchemaToModelFactory.make_from_json_schema(schema, "Pet")
@@ -46,12 +45,10 @@ def test_dynamic_class_with_enum_field_round_trips(

         restored_class: type[BaseModel] = type(restored)
         assert restored_class.__name__ == "Pet"
-        assert restored.name == "Rex"  # type: ignore[attr-defined]
-        # datamodel-code-generator emits `class PetSpecies(Enum)` (not StrEnum), so the
-        # dynamic enum is a distinct class from the static PetSpecies above. Assert on
-        # class name + value instead of identity equality.
-        species_value: Enum = restored.species  # type: ignore[attr-defined]
-        assert isinstance(species_value, Enum)
-        assert type(species_value).__name__ == "PetSpecies"
-        assert species_value.value == "dog"
+        # datamodel-code-generator now emits enum-shaped `$defs` as
+        # `RootModel[Literal[...]]` (since `enum_field_as_literal=LiteralType.All` is
+        # set in `_generate_source_from_schema`), so `restored.species` is a
+        # `RootModel` wrapping the value, not a Python `Enum` instance. Assert on the
+        # serialized data — that is what actually crosses the Temporal payload boundary.
+        assert restored.model_dump() == {"name": "Rex", "species": "dog"}
         assert getattr(restored_class, "__kajson_class_source__", None)

tests/unit/pipelex/cogt/content_generation/test_schema_to_model.py

Lines changed: 59 additions & 1 deletion
@@ -2,7 +2,7 @@

 import threading
 import uuid
-from typing import Any
+from typing import Any, Literal, get_args, get_origin

 import pytest
 from pydantic import BaseModel, Field
@@ -17,6 +17,17 @@ class SimpleModel(BaseModel):
     age: int = Field(description="The age")


+class ModelWithLiteralChoices(BaseModel):
+    """A model whose ``recommendation`` field is a ``Literal`` over string choices.
+
+    Mirrors the shape a `.mthds` ``choices = [...]`` declaration produces: the
+    field's JSON schema contains an inline ``enum: [strings]`` array, and the
+    Python annotation is ``Literal[...]`` — NOT a named Python enum class.
+    """
+
+    recommendation: Literal["Strong Match", "Good Match", "Partial Match", "Poor Match"]
+
+
 class Address(BaseModel):
     street: str
     city: str
@@ -32,6 +43,53 @@ def _benign_object_schema() -> dict[str, Any]:


 class TestSchemaToModel:
+    def test_literal_choices_field_round_trips_as_literal_not_enum(self) -> None:
+        """Round-tripping a ``Literal[str-set]`` field through ``make_from_json_schema``
+        must keep it as a ``Literal[...]`` annotation in the reconstructed class.
+
+        Bug repro: today the round-trip silently re-emits the field as a generated
+        ``Enum`` class (e.g. ``class Recommendation(Enum)`` with members like
+        ``Poor_Match = "Poor Match"``). When this reconstructed class is handed to
+        an LLM as the structured-output target, the LLM tends to fill it with the
+        Python enum repr (``"Recommendation.Poor_Match"``) instead of the literal
+        string (``"Poor Match"``), which then fails Pydantic validation against the
+        original choice set.
+
+        We assert two things:
+        1. the generated Python source code does NOT introduce an ``Enum`` class
+           named after the field (``class Recommendation(Enum)``);
+        2. the reconstructed model's ``recommendation`` field annotation is a
+           ``Literal[...]`` whose args are exactly the original string choices.
+        """
+        # Use a unique title so the class-level schema cache never short-circuits
+        # this test with a stale (already-correct or already-buggy) result from
+        # another test run.
+        schema = ModelWithLiteralChoices.model_json_schema()
+        unique_title = f"LiteralChoicesRepro_{uuid.uuid4().hex}"
+        schema["title"] = unique_title
+
+        result_class = SchemaToModelFactory.make_from_json_schema(schema, unique_title)
+        source = getattr(result_class, "__kajson_class_source__", "")
+
+        assert "class Recommendation(Enum)" not in source, (
+            "Bug: Literal[...] choices were re-emitted as a generated Enum class. "
+            "An LLM targeting this Enum returns 'Recommendation.Poor_Match' (Python "
+            "enum repr) instead of the literal 'Poor Match', which fails validation "
+            "against the original choice set.\n\nGenerated source:\n" + source
+        )
+
+        recommendation_field = result_class.model_fields["recommendation"]
+        annotation = recommendation_field.annotation
+        assert get_origin(annotation) is Literal, (
+            f"Expected the round-tripped 'recommendation' field annotation to be Literal[...], got {annotation!r}. Source:\n{source}"
+        )
+        assert set(get_args(annotation)) == {
+            "Strong Match",
+            "Good Match",
+            "Partial Match",
+            "Poor Match",
+        }, f"Literal args drifted during round-trip: {get_args(annotation)!r}"
+
     def test_simple_model_reconstruction(self) -> None:
         """A simple model can be reconstructed from its JSON schema."""
         schema = SimpleModel.model_json_schema()
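
The annotation check used by the new test relies on `typing.get_origin` and `typing.get_args`. The following is a minimal standalone sketch of that check, assuming nothing about Pipelex itself: `literal_choices` and the one-member `Recommendation` enum are hypothetical names introduced for illustration.

```python
from enum import Enum
from typing import Any, Literal, Optional, get_args, get_origin


# Hypothetical Enum-shaped annotation, like the one the old round-trip produced.
class Recommendation(Enum):
    Poor_Match = "Poor Match"


def literal_choices(annotation: Any) -> Optional[set]:
    """Return the choice set if `annotation` is a Literal[...], otherwise None."""
    if get_origin(annotation) is Literal:
        return set(get_args(annotation))
    return None


from_literal = literal_choices(Literal["Strong Match", "Poor Match"])
from_enum = literal_choices(Recommendation)  # an Enum class has no Literal origin
```

Comparing the arguments as a set, as the test does, makes the assertion independent of the order in which the generator emits the choices.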

uv.lock

Lines changed: 1 addition & 1 deletion
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.
