Merged
41 changes: 37 additions & 4 deletions CHANGELOG.md
@@ -20,10 +20,10 @@ ADDED
   `call_entity` accept an optional `return_type`, and `wait_for_external_event`
   accepts an optional `data_type`. When provided, the result/event payload is
   reconstructed as that type (dataclasses — including nested dataclass,
-  `Optional`, and `list` fields — and `from_json()`-capable types) and the
-  returned task is typed accordingly (e.g. `call_activity(..., return_type=Foo)`
-  yields `CompletableTask[Foo]`). When omitted, the raw deserialized JSON is
-  returned as before.
+  `Optional`, `list`, `dict`/`Mapping`, and `tuple` fields — and
+  `from_json()`-capable types) and the returned task is typed accordingly (e.g.
+  `call_activity(..., return_type=Foo)` yields `CompletableTask[Foo]`). When
+  omitted, the raw deserialized JSON is returned as before.
 - Inbound payloads are reconstructed from function type annotations. When an
   orchestrator, activity, or entity operation annotates its input parameter (or
   an activity its return value) with a dataclass or `from_json()`-capable type,
@@ -37,6 +37,15 @@ ADDED
retained.
- Objects exposing a `to_json()` method are now JSON-serializable when passed as
activity/orchestrator inputs or outputs.
- `enum.Enum` values now serialize (to their underlying `.value`) and, when a
target type is supplied, deserialize back to the enum member. This covers
string-valued and other non-`int` enums as activity/orchestrator/entity inputs
and outputs, including as dataclass fields and inside `list` / `dict` /
`tuple` containers. (`IntEnum` / `IntFlag` already serialized as integers.)
- A `from_json()` classmethod may now optionally accept the active
`DataConverter` as a second parameter (`from_json(cls, value, converter)`),
letting it reconstruct nested typed values via `converter.coerce(...)` /
`converter.deserialize(...)`. The single-argument form remains supported.
- Added `EntityMetadata.get_typed_state(intended_type=...)`, which deserializes
the entity's persisted state and reconstructs dataclasses and
`from_json()`-capable types. The existing `get_state()` is unchanged: with no
@@ -63,11 +72,35 @@ CHANGED

FIXED

- A dataclass or `SimpleNamespace` that defines a `to_json()` hook now uses it
when serialized. Previously the built-in dataclass / `SimpleNamespace`
handling ran first, so the hook was ignored — and a dataclass with a field
that was not JSON-serializable on its own would fail to serialize even when it
provided a `to_json()` hook to handle that field. The serialize side now
prefers `to_json()`, mirroring the deserialize side, which already prefers
`from_json()`.
- Nested `to_json()` hooks are now honored when an object is serialized inside a
dataclass. Custom objects (including nested dataclasses with their own
`to_json()`) are now encoded recursively instead of being flattened to their
raw fields, so values that reshape themselves via `to_json()` round-trip
correctly.
- Type-directed deserialization now recurses into `dict`/`Mapping` values and
`tuple` elements, in addition to the existing `list`, `Optional`/`Union`, and
dataclass-field recursion. A `dict[str, Foo]` or `tuple[Foo, ...]` hint now
reconstructs the contained `Foo` values.
- Falsy entity states (`0`, `""`, `[]`, `{}`) are no longer dropped when an
entity batch is persisted. Previously a falsy current state was treated as
"no state" and written as `None`, effectively deleting it; only an actual
`None` state now clears the persisted entity state.
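
The falsy-state entry above can be illustrated with a minimal sketch. The two function names are hypothetical, and `json.dumps` stands in for the data converter; this is not the library's persistence code, only the truthiness pitfall the entry describes.

```python
import json
from typing import Any, Optional


def encode_state_truthy(state: Any) -> Optional[str]:
    # Buggy pattern: any falsy value (0, "", [], {}) is treated as "no state".
    return json.dumps(state) if state else None


def encode_state_fixed(state: Any) -> Optional[str]:
    # Fixed pattern: only an actual None clears the persisted state.
    return json.dumps(state) if state is not None else None


for value in (0, "", [], {}, None):
    print(repr(value), encode_state_truthy(value), encode_state_fixed(value))
```

With the truthiness check, a counter entity whose state is `0` would be written back as `None`; the identity check against `None` keeps it.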

DEPRECATED

- `durabletask.internal.shared.to_json` and `durabletask.internal.shared.from_json`
are deprecated and now emit a `DeprecationWarning`. Use a
`durabletask.serialization.DataConverter` (for example the default
`JsonDataConverter`) instead. The functions continue to work for backwards
compatibility.
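
As a rough illustration of the deprecation entry above, the sketch below shows how a module-level helper can keep working while emitting a `DeprecationWarning`. The `to_json` defined here is a hypothetical stand-in built on the standard `json` and `warnings` modules, not the code in `durabletask.internal.shared`.

```python
import json
import warnings
from typing import Any


def to_json(obj: Any) -> str:
    """Hypothetical stand-in for a deprecated module-level helper."""
    warnings.warn(
        "to_json is deprecated; use a DataConverter instead.",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not this helper
    )
    return json.dumps(obj)


# Record warnings so the deprecation can be observed programmatically.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    encoded = to_json({"a": 1})

print(encoded, [w.category.__name__ for w in caught])
```

Running a test suite with `python -W error::DeprecationWarning` is one way to find remaining callers of a deprecated function.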

BREAKING CHANGES (type-level only — no runtime impact for typical users)

These changes do not alter runtime behavior, but because the package ships
68 changes: 58 additions & 10 deletions durabletask/internal/entity_state_shim.py
@@ -12,9 +12,31 @@


 class StateShim:
-    def __init__(self, start_state: Any, data_converter: "DataConverter | None" = None):
+    """In-memory view of an entity's state during a batch.
+
+    The state arriving from the wire is held as its raw serialized JSON string
+    and is **not** deserialized in the constructor: deserialization is deferred
+    until :meth:`get_state` is called, so the caller's requested type reaches the
+    data converter together with the original payload (a custom converter can
+    then deserialize the string directly into the target type). Once the state
+    has been read into a Python value or replaced via :meth:`set_state`, it is
+    held as that live object instead.
+
+    Tracking whether the current value is still the raw serialized string also
+    lets :meth:`encode_state` pass an unmodified payload straight back to the
+    wire instead of re-serializing it, which would double-encode the JSON.
+    """
+
+    def __init__(self, start_state: Any, data_converter: "DataConverter | None" = None,
+                 *, is_serialized: bool = False):
+        # ``is_serialized`` marks ``start_state`` as a raw serialized payload
+        # (the value off the wire) whose deserialization should be deferred. A
+        # ``None`` state is never treated as serialized.
+        serialized = is_serialized and start_state is not None
         self._current_state: Any = start_state
+        self._current_is_serialized: bool = serialized
         self._checkpoint_state: Any = start_state
+        self._checkpoint_is_serialized: bool = serialized
         self._operation_actions: list[pb.OperationAction] = []
         self._actions_checkpoint_state: int = 0
         if data_converter is None:
@@ -35,31 +57,53 @@ def get_state(self, intended_type: None = None, default: Any = None) -> Any:
         ...

     def get_state(self, intended_type: type[TState] | None = None, default: TState | None = None) -> TState | Any | None:
-        if self._current_state is None and default is not None:
+        if self._current_state is None:
             return default

-        if intended_type is None:
-            return self._current_state
-
-        coerced = self._data_converter.coerce(self._current_state, intended_type)
+        if self._current_is_serialized:
+            # Deferred deserialization: the converter receives the raw payload
+            # together with the requested type.
+            if intended_type is None:
+                return self._data_converter.deserialize(self._current_state)
+            result = self._data_converter.deserialize(self._current_state, intended_type)
+        else:
+            if intended_type is None:
+                return self._current_state
+            result = self._data_converter.coerce(self._current_state, intended_type)

         # An explicit ``intended_type`` is a request to receive that type. The
         # default converter is best-effort and would silently return the raw
         # value on a failed coercion; restore the stricter contract here by
         # raising when a non-None state could not be coerced to a concrete type.
         # ``intended_type`` may be a typing generic (e.g. ``list[int]``) at
         # runtime, which is not a ``type`` instance, so the guard is required.
-        if (self._current_state is not None
-                and isinstance(intended_type, type)  # pyright: ignore[reportUnnecessaryIsInstance]
-                and not isinstance(coerced, intended_type)):
+        if (isinstance(intended_type, type)  # pyright: ignore[reportUnnecessaryIsInstance]
+                and not isinstance(result, intended_type)):
             raise TypeError(
                 f"Could not convert state of type '{type(self._current_state).__name__}' to '{intended_type.__name__}'"
             )

-        return coerced
+        return result

     def set_state(self, state: Any) -> None:
+        # A value set in-process is a live Python object, not a serialized payload.
         self._current_state = state
+        self._current_is_serialized = False
+
+    def encode_state(self) -> str | None:
+        """Serialize the current state for persistence back to the wire.
+
+        Returns ``None`` only when the state is actually ``None`` (which clears
+        the persisted entity state). When the current value is still the raw
+        serialized payload (the state was never modified), it is returned
+        unchanged to avoid double-encoding; otherwise the live value is
+        serialized.
+        """
+        if self._current_state is None:
+            return None
+        if self._current_is_serialized:
+            return self._current_state
+        return self._data_converter.serialize(self._current_state)

     def add_operation_action(self, action: pb.OperationAction) -> None:
         self._operation_actions.append(action)
@@ -69,14 +113,18 @@ def get_operation_actions(self) -> list[pb.OperationAction]:

     def commit(self) -> None:
         self._checkpoint_state = self._current_state
+        self._checkpoint_is_serialized = self._current_is_serialized
         self._actions_checkpoint_state = len(self._operation_actions)

     def rollback(self) -> None:
         self._current_state = self._checkpoint_state
+        self._current_is_serialized = self._checkpoint_is_serialized
         self._operation_actions = self._operation_actions[:self._actions_checkpoint_state]

     def reset(self) -> None:
         self._current_state = None
+        self._current_is_serialized = False
         self._checkpoint_state = None
+        self._checkpoint_is_serialized = False
         self._operation_actions = []
         self._actions_checkpoint_state = 0
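
The deferred-deserialization and pass-through behavior in this file can be sketched in isolation. `MiniShim` below is a hypothetical, much-reduced stand-in that uses the standard `json` module in place of the `DataConverter` and omits typed coercion, checkpoints, and operation actions; it only aims to show why tracking the "still serialized" flag avoids double-encoding.

```python
import json
from typing import Any, Optional


class MiniShim:
    """Simplified stand-in for StateShim; json replaces the DataConverter."""

    def __init__(self, start_state: Any, *, is_serialized: bool = False):
        self._state = start_state
        # A None state is never treated as serialized.
        self._is_serialized = is_serialized and start_state is not None

    def get_state(self) -> Any:
        if self._state is None:
            return None
        # Deserialize lazily; the stored raw payload is left untouched.
        return json.loads(self._state) if self._is_serialized else self._state

    def set_state(self, state: Any) -> None:
        # A value set in-process is a live object, not a serialized payload.
        self._state = state
        self._is_serialized = False

    def encode_state(self) -> Optional[str]:
        if self._state is None:
            return None
        if self._is_serialized:
            return self._state  # pass the unmodified payload straight through
        return json.dumps(self._state)


wire = json.dumps({"count": 0})
shim = MiniShim(wire, is_serialized=True)
untouched = shim.encode_state()    # same string that came off the wire
shim.set_state({"count": 1})
modified = shim.encode_state()     # freshly serialized live value
double_encoded = json.dumps(wire)  # what re-serializing the raw string would give
```

Without the flag, an unmodified state would be serialized a second time and persisted as a JSON string containing JSON, which is the `double_encoded` value in the sketch.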
183 changes: 0 additions & 183 deletions durabletask/internal/json_codec.py

This file was deleted.
