Commit 79561e1

deploy: af3b933
1 parent 12380b8 commit 79561e1

170 files changed

Lines changed: 421 additions & 353 deletions


2.0/llms-full.txt

Lines changed: 45 additions & 11 deletions
@@ -10436,14 +10436,41 @@ except WorkflowNotFound:

## Payload Codecs

-All payloads are codec-tagged. The Python SDK currently encodes and decodes the `json` codec only — values are serialized as `{codec: "json", blob: "..."}` envelopes on the wire. The SDK never guesses the codec: if it receives a payload with an unknown codec (including the server's `avro` default), it raises with a clear message.
+All payloads are codec-tagged. The Python SDK encodes and decodes the `json` codec out of the box, and the `avro` codec when the optional Avro extra is installed.

-The PHP server's default codec for new v2 workflows is `avro`. To run a Python worker against a shared server you must keep workflows the worker consumes on a JSON-tagged codec. Two ways to do that:
+### Installing Avro support

-- **Pin the server-side default to `json`** — set `'serializer' => 'json'` in `config/workflows.php`. Every new workflow that lands on this server will be tagged `payload_codec = "json"` and any Python worker can drive it.
-- **Send an explicit envelope from every client** — clients that start workflows can post the explicit `{codec: "json", blob: "..."}` envelope on `POST /api/workflows`. The Python `Client.start_workflow()` already does this, so workflows started by a Python client are always Python-readable regardless of the server-side default.
+Avro support is optional to keep the default install minimal (only `httpx` is required for JSON-mode interop). Install the extra to opt in:

-Once Python adds an Avro decoder, this restriction goes away and Python workers will decode Avro-tagged tasks transparently.
+```sh
+pip install 'durable-workflow[avro]'
+```
+
+This pulls in the `avro` package (the official Apache Avro Python bindings) and enables the SDK to encode and decode payloads tagged `payload_codec = "avro"`.
+
+### What works today (generic-wrapper surface)
+
+The SDK uses a generic-wrapper schema for Avro payloads — the Python value is JSON-encoded, then the resulting string is Avro-binary-framed under a `{json: string, version: int}` record. This gives schema-evolution framing and compact transport without requiring the Python developer to hand-write an Avro schema for every workflow.
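The generic-wrapper framing described above can be illustrated without the `avro` dependency: an Avro record body is just its fields concatenated in declaration order, and both `string` and `int` use zigzag base-128 varints. The sketch below is illustrative only — it assumes the field order is `json` then `version` and ignores any container or schema framing the real SDK may add; none of these function names are the SDK's.

```python
import json

def _varint(n):
    """Avro primitive encoding: zigzag-map the signed int, then base-128 varint."""
    z = (n << 1) ^ (n >> 63)  # zigzag: 0->0, -1->1, 1->2, -2->3, ...
    out = bytearray()
    while z > 0x7F:
        out.append((z & 0x7F) | 0x80)
        z >>= 7
    out.append(z)
    return bytes(out)

def _read_varint(buf, pos):
    """Inverse of _varint; returns (value, next_position)."""
    shift = acc = 0
    while True:
        b = buf[pos]
        pos += 1
        acc |= (b & 0x7F) << shift
        if not b & 0x80:
            return (acc >> 1) ^ -(acc & 1), pos  # un-zigzag
        shift += 7

def encode_generic(value, version=1):
    """Frame a Python value as the {json: string, version: int} record body.

    An Avro string is a varint byte length followed by UTF-8 bytes; record
    fields are simply concatenated in schema order with no tags."""
    blob = json.dumps(value).encode()
    return _varint(len(blob)) + blob + _varint(version)

def decode_generic(frame):
    """Read the JSON string field, then the version field."""
    n, pos = _read_varint(frame, 0)
    value = json.loads(frame[pos:pos + n].decode())
    version, _ = _read_varint(frame, pos + n)
    return value, version
```

Because the wrapper record carries opaque JSON, schema evolution happens at the wrapper level (the `version` field), not per-workflow.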
+Supported surfaces when `durable-workflow[avro]` is installed:
+
+- **Activity worker** — Avro-tagged activity arguments decode transparently. The worker echoes the task's codec when completing: if a task arrives Avro-coded, the worker encodes the result as Avro; if it arrives JSON-coded, the worker returns JSON.
+- **Workflow worker history replay** — Avro-tagged start input and activity result events are decoded during replay, so a Python workflow can participate in an Avro-coded run.
+
+Surfaces that are still JSON-only on the Python side (tracked in [#331](https://github.com/zorporation/durable-workflow/issues/331)):
+
+- **Client-level codec selection for `start_workflow`, `signal_workflow`, `query_workflow`, `update_workflow`** — these always emit JSON-tagged payloads today. The Python client cannot currently start an Avro-coded run or send an Avro-coded signal/query/update; those runs are only reachable from a codec-aware PHP client or via the HTTP API with an explicit `{codec: "avro", ...}` envelope.
+
+### Running a Python activity worker against an Avro-coded run
+
+With `durable-workflow[avro]` installed, no extra configuration is needed: the SDK reads `payload_codec` on every claim and picks the right decoder. If the server default is `avro` and the activity task arrives Avro-coded, the Python worker decodes it, runs the activity, and encodes the result back as `avro`.
+
+### Sticking to JSON for simplicity
+
+Teams that want the simplest possible setup can keep every run on the `json` codec:
+
+- **Pin the server-side default to `json`** — set `'serializer' => 'json'` in `config/workflows.php`. Every new workflow started on this server will be tagged `payload_codec = "json"`. This is the simplest path and removes the Avro install from the Python side entirely.
+- **Send an explicit JSON envelope from every client** — clients that start workflows can post the explicit `{codec: "json", blob: "..."}` envelope on `POST /api/workflows`. The Python `Client.start_workflow()` defaults to this when called without `input_envelope`.

When both sides agree on `json`, Python workflows and activities interoperate cleanly with PHP workers on the same task queue, as long as the data types are JSON-serializable in both languages.

@@ -10622,7 +10649,12 @@ Activity heartbeat responses include `can_continue` and `cancel_requested` field

## Payload Codecs

-Every payload byte string that crosses the worker-protocol boundary is tagged with a **`payload_codec`** naming the format of the accompanying blob. v2 ships with two language-neutral codecs — **`avro`** (the default) and **`json`** — so any SDK (PHP, Python, Go, TypeScript, Rust) can encode and decode payloads without sharing a runtime or an app key. The set of universal codecs the running server supports is advertised on `GET /api/cluster/info` under `capabilities.payload_codecs`.
+Every payload byte string that crosses the worker-protocol boundary is tagged with a **`payload_codec`** naming the format of the accompanying blob. v2 ships with two language-neutral codecs — **`avro`** (the default) and **`json`** — so any SDK (PHP, Python, Go, TypeScript, Rust) can encode and decode payloads without sharing a runtime or an app key. The running server advertises its codec support on `GET /api/cluster/info`:
+
+- **`capabilities.payload_codecs`** — the universal codec names every SDK is expected to be able to decode. Polyglot clients should key their codec negotiation off this list.
+- **`capabilities.payload_codecs_engine_specific.<engine>`** — codec names that require a specific engine runtime to decode (e.g. PHP's legacy `SerializableClosure` codecs under `.php`). This key is only present when the server exposes engine-specific codecs, and it is deliberately namespaced so non-PHP SDKs do not advertise codecs they cannot decode.
+
+The same split carries through the embedded control-plane request contract: `operations.start.fields.payload_codec.canonical_values` advertises only universal codec names, and `operations.start.fields.payload_codec.engine_specific_values.<engine>` mirrors the cluster-info split for explicit-envelope starts.
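A polyglot client keying negotiation off these capability keys might look like the following sketch. The key names come from the text; the function name and the treatment of server list order as preference order are assumptions, not documented behavior.

```python
def negotiate_codec(cluster_info, supported, engine=None):
    """Pick a codec the client can encode and the server advertises.

    Universal names come from capabilities.payload_codecs; engine-specific
    names are consulted only when the client actually runs that engine.
    """
    caps = cluster_info.get("capabilities", {})
    advertised = list(caps.get("payload_codecs", []))
    if engine is not None:
        advertised += caps.get("payload_codecs_engine_specific", {}).get(engine, [])
    for codec in advertised:  # server order taken as preference order (assumption)
        if codec in supported:
            return codec
    raise RuntimeError(
        f"no common codec: server={advertised}, client={sorted(supported)}"
    )
```

A JSON-only Python client against a default server would land on `json`; one with the Avro extra would land on `avro`.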

### The `avro` codec

@@ -10651,7 +10683,7 @@ The worker reads `payload_codec` to choose a decoder. A non-matching codec is a

`POST /api/workflows` accepts `input` in two shapes:

-1. **Plain JSON array** — the server encodes through the configured default codec and tags the run with the resulting `payload_codec` (e.g. `"avro"` on a default install).
+1. **Plain JSON array** — the server JSON-encodes the values and tags the run `payload_codec = "json"`. Avro requires a writer schema that a plain array cannot carry, so HTTP starts that omit `input` or send a bare array always land on the `json` codec even when the server default is `avro`. Clients that want an Avro-coded run must send the explicit envelope form.

```json
{ "workflow_type": "MyWorkflow", "input": ["hello", 42] }
```
@@ -10662,14 +10694,16 @@ The worker reads `payload_codec` to choose a decoder. A non-matching codec is a
```json
{
  "workflow_type": "MyWorkflow",
-  "input": { "codec": "json", "blob": "[\"hello\", 42]" }
+  "input": { "codec": "avro", "blob": "<base64-avro-bytes>" }
}
```

-The server stores the blob verbatim and tags the run with the declared codec. Either `"avro"` or `"json"` is accepted as long as the running server advertises it under `capabilities.payload_codecs`.
+The server stores the blob verbatim and tags the run with the declared codec. Any codec the server advertises under `capabilities.payload_codecs` (or under `capabilities.payload_codecs_engine_specific.<engine>` if the client knows how to decode it) is accepted.

The chosen codec is stored on the `WorkflowRun` and **propagates for the life of the run**: activity arguments, results, signal/update arguments, and child-workflow inputs all use the same codec.

+Embedded/package starts (workflows kicked off from PHP via `WorkflowStub::make(...)->start(...)` rather than the HTTP API) follow the configured `workflows.serializer` default and can land on `avro` or `json` depending on configuration. The plain-array-is-JSON shortcut applies only to the HTTP start API.
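As the surrounding text notes, the worker reads `payload_codec` to choose a decoder and never guesses the format. A registry-style sketch of that dispatch — the names and registry shape are hypothetical, not the SDK's internals:

```python
import json

class UnknownCodecError(Exception):
    """A task arrived tagged with a codec this worker has no decoder for."""

# json ships with the base install; an "avro" entry would be registered only
# when the optional extra is importable. Illustrative, not the SDK's layout.
DECODERS = {"json": json.loads}

def decode_payload(payload_codec, blob):
    """Dispatch on the payload_codec tag; a non-matching codec is an error."""
    try:
        decoder = DECODERS[payload_codec]
    except KeyError:
        raise UnknownCodecError(
            f"no decoder for payload_codec={payload_codec!r}; "
            f"known: {sorted(DECODERS)}"
        ) from None
    return decoder(blob)
```

Completion would then echo the same tag back (result codec = task codec), matching the activity-worker behavior described earlier.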
### JSON Type Normalization

JSON has a single numeric type; language runtimes do not. When a payload round-trips between SDKs under the `json` codec, some type distinctions are **normalized away**. Workflows that depend on the exact runtime type of a value across a language boundary must encode the type explicitly (for example, as a string).
@@ -10685,14 +10719,14 @@ Known normalizations:

If your workflow needs to preserve the integer-vs-float distinction across a PHP↔Python hop (for example, a schema validator that rejects `3` but accepts `3.0`), encode the value as a string (`"3.0"`) and parse it on the receiving side. This is an intrinsic property of JSON, not a bug in the codec.
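The workaround above can be shown concretely. Python's own `json` module preserves the float, so the string encoding exists to survive the hop through runtimes that normalize numeric types; the `rate` field name is hypothetical.

```python
import json

def encode_rate(rate):
    """Carry the float as a string so the hop cannot collapse 3.0 into 3."""
    return json.dumps({"rate": repr(rate)})

def decode_rate(blob):
    """Receiving side parses the string back into a float explicitly."""
    return float(json.loads(blob)["rate"])
```

Both sides now agree on the runtime type by construction, independent of how the intermediate JSON is re-serialized.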

-### Legacy codecs (v1 migration only)
+### Legacy codecs (PHP-only)

Older v1 deployments wrote history under two PHP-only codecs, which the package continues to read so finish-on-v1 migrations can drain:

- `workflow-serializer-y` — PHP `SerializableClosure` with byte-escape encoding. Requires a shared `config('app.key')` between server and worker.
- `workflow-serializer-base64` — PHP `SerializableClosure` with base64 encoding.

-These codecs are **not recommended for new workflows**: a Python or Go worker cannot decode them. v2 installations that still have `workflows.serializer` set to a legacy value will be flagged by `php artisan workflow:v2:doctor`.
+These codecs are **not recommended for new workflows**: a Python or Go worker cannot decode them. Any v2 installation with `workflows.serializer` still pinned to a legacy codec continues to use that codec for new runs — the runtime honors the setting, does not force a decode-only mode, and only surfaces the legacy pinning as a warning from `php artisan workflow:v2:doctor`. Polyglot fleets should either flip the setting to `'avro'` (the default for new installs) or `'json'` before registering a non-PHP worker on the task queue.

Legacy fully-qualified PHP class names (e.g. `Workflow\Serializers\Y`) are accepted as aliases so rows persisted before the codec rename keep decoding.
