Commit a8b8b52

deploy: f13c30a
1 parent 170c543 commit a8b8b52

287 files changed

Lines changed: 1642 additions & 1105 deletions

2.0/llms-full.txt

Lines changed: 217 additions & 8 deletions
@@ -8442,7 +8442,9 @@ self-serve contract is intentionally narrow:
84428442
`DW_V2_MATCHING_ROLE_QUEUE_WAKE=0` to `shape: "dedicated"` and
84438443
`wake_owner: "dedicated_repair_pass"`. The same block should continue to
84448444
advertise `partition_primitives` of `connection`, `queue`, `compatibility`,
8445-
and `namespace`, plus the `lease_ownership` `backpressure_model`.
8445+
and `namespace`, plus the `lease_ownership` `backpressure_model`. See
8446+
[Server Role Topology](/docs/2.0/polyglot/server-role-topology) for the
8447+
field-by-field meaning of the topology manifest.
84468448
- Scale external SDK workers independently from API nodes. Workers can run on
84478449
separate hosts or processes, but they should talk to the load-balanced API
84488450
endpoint rather than to one sticky node.
@@ -8467,7 +8469,9 @@ product topology. Treat that as one contract with different role assignments,
84678469
not as a second server product. If you pilot a more explicit role split later,
84688470
keep reading `topology.current_shape`, `topology.current_roles`, and
84698471
`topology.matching_role` from `/api/cluster/info` instead of inferring duties
8470-
from hostnames or container names.
8472+
from hostnames or container names. The
8473+
[Server Role Topology](/docs/2.0/polyglot/server-role-topology) page explains
8474+
those role assignments and the migration path in one place.
84718475

84728476
Every API node should use the same auth tokens or signature keys, app version,
84738477
workflow package version, payload-codec configuration, database connection, and
@@ -10353,6 +10357,12 @@ unsupported.
1035310357

1035410358
### Role topology and deployment shape
1035510359

10360+
The field-by-field reference for this manifest lives on
10361+
[Server Role Topology](/docs/2.0/polyglot/server-role-topology). Keep this
10362+
section for the inline `cluster/info` example and use the dedicated page when
10363+
you need the supported shapes, authority boundaries, failure domains, scaling
10364+
boundaries, or migration-path contract in one place.
10365+
1035610366
`GET /api/cluster/info` also publishes a `topology` manifest. It is the
1035710367
machine-readable role map for the node that answered the request, so operators
1035810368
and automation can read one contract instead of inferring node duties from
@@ -11409,12 +11419,22 @@ curl -sS "$DURABLE_WORKFLOW_SERVER_URL/api/cluster/info" \
1140911419
`/api/cluster/info` intentionally does not require the control-plane version
1141011420
header because it is the endpoint that advertises the supported versions.
1141111421

11412-
The same response also publishes the live topology and rollout-safety state for
11413-
that node:
11422+
### Cluster Topology Manifest
11423+
11424+
`/api/cluster/info` also returns the node's `topology` manifest under the
11425+
schema `durable-workflow.v2.role-topology`. That manifest is the supported way
11426+
to discover whether the node is currently acting as `standalone_server`,
11427+
`embedded`, or `split_control_execution`, which roles it owns, and what the
11428+
server expects from `matching_role`, `authority_boundaries`,
11429+
`failure_domains`, `scaling_boundaries`, and `migration_path`. The same
11430+
response also publishes live rollout-safety state for that node.
11431+
11432+
Read the manifest as follows:
1141411433

11415-
- `topology.current_process_class`, `topology.current_roles`, and
11416-
`topology.execution_mode` tell you which role shape the node is actually
11417-
serving.
11434+
- `topology.current_process_class`, `topology.current_shape`,
11435+
`topology.current_roles`, and `topology.execution_mode` tell you which role
11436+
shape the node is actually serving. `current_shape` and `current_roles`
11437+
describe the responding node, not the full fleet.
1141811438
- `topology.matching_role.shape`, `topology.matching_role.wake_owner`, and
1141911439
`topology.matching_role.task_dispatch_mode` tell you whether broad ready-task
1142011440
discovery is happening in-worker or through a dedicated matching-role sweep.
@@ -11424,15 +11444,21 @@ that node:
1142411444
reason about.
1142511445
- `coordination_health` summarizes fleet-wide rollout and compatibility risk in
1142611446
one machine-readable block.
11447+
- `execution_mode` distinguishes `local_queue_worker` embedded execution from
11448+
`remote_worker_protocol` worker-protocol execution.
11449+
- `split_control_execution` is a supported product topology, not a second
11450+
server product or a different API.
1142711451

1142811452
Example:
1142911453

1143011454
<!-- docs-example id="server.cluster-info.topology.curl" -->
1143111455
```bash
1143211456
curl -sS "$DURABLE_WORKFLOW_SERVER_URL/api/cluster/info" \
1143311457
-H "Authorization: Bearer $DURABLE_WORKFLOW_AUTH_TOKEN" \
11434-
-H "X-Namespace: default" | jq '{
11458+
-H "X-Namespace: default" \
11459+
| jq '{
1143511460
current_process_class: .topology.current_process_class,
11461+
current_shape: .topology.current_shape,
1143611462
current_roles: .topology.current_roles,
1143711463
execution_mode: .topology.execution_mode,
1143811464
matching_role: .topology.matching_role,
@@ -11443,6 +11469,9 @@ curl -sS "$DURABLE_WORKFLOW_SERVER_URL/api/cluster/info" \
1144311469
}'
1144411470
```
1144511471

11472+
For the conceptual contract behind those fields, including the role vocabulary
11473+
and migration path, see
11474+
[Server Role Topology](/docs/2.0/polyglot/server-role-topology).
1144611475
## Workflow Control Plane
1144711476

1144811477
Workflow routes are operator/control-plane routes. They require an operator or
@@ -14302,6 +14331,186 @@ client = Client(
1430214331

1430314332
Set the `namespace` argument to whichever tenant namespace the shared server has provisioned for your team, and use the credentials issued for that namespace. The server operator manages namespace creation — see the [server setup guide](/docs/2.0/polyglot/server) for details.
1430414333

14334+
<!-- Source: docs/polyglot/server-role-topology.md -->
14335+
14336+
# Server Role Topology
14337+
14338+
`GET /api/cluster/info` publishes the machine-readable role map for the node
14339+
that answered the request. Use that `topology` manifest when you need to know
14340+
which responsibilities the node owns, which durable write surfaces belong to
14341+
each role, and how the same product contract scales from one process to a split
14342+
control-plane and execution-plane deployment.
14343+
14344+
The manifest uses the schema `durable-workflow.v2.role-topology`. It is a
14345+
product contract, not an internal implementation detail. Operators, SDKs, CLI
14346+
automation, and rollout tooling should read it instead of inferring duties from
14347+
container names, hostnames, or process labels.
14348+
14349+
## Why This Manifest Exists
14350+
14351+
Durable Workflow keeps one durable engine across embedded mode, the standalone
14352+
server distribution, and more explicit split-role deployments. The topology
14353+
manifest makes that boundary explicit:
14354+
14355+
- it names the legal topology shapes as product terms
14356+
- it tells you which roles the current node owns
14357+
- it publishes the durable write boundary for each role
14358+
- it exposes the first failure signal and main scaling driver per role
14359+
- it preserves one migration path instead of implying a second engine
14360+
14361+
This is how you distinguish a supported topology change from a product fork.
14362+
14363+
## Role Vocabulary
14364+
14365+
The manifest vocabulary is fixed for v2:
14366+
14367+
| Role | What it owns |
14368+
| --- | --- |
14369+
| `api_ingress` | HTTP termination, request authentication, namespace resolution, and handing requests to the right plane. |
14370+
| `control_plane` | Durable workflow commands such as start, signal, update, cancel, terminate, reset, repair, and archive. |
14371+
| `matching` | Ready-task discovery, claim arbitration, dispatch publication, and wake ownership. |
14372+
| `history_projection` | Durable history recording, run summaries, and operator-facing projection surfaces. |
14373+
| `scheduler` | Schedule evaluation and turning schedule fire state into workflow starts. |
14374+
| `execution_plane` | Workflow-task replay, activity execution, task heartbeats, and task completion/failure outcomes. |
14375+
14376+
The role list is intentionally logical rather than process-shaped. One process
14377+
may host multiple roles, and one role may later move to its own process class
14378+
without changing the contract.
14379+
14380+
## Supported Deployment Shapes
14381+
14382+
The server publishes the supported shapes in `topology.supported_shapes`:
14383+
14384+
| Shape | Process classes | Typical use |
14385+
| --- | --- | --- |
14386+
| `embedded` | One `application_process` owns `control_plane`, `matching`, `history_projection`, `scheduler`, and `execution_plane`. | Laravel-app embedding with local queue workers. |
14387+
| `standalone_server` | `server_http_node`, `scheduler_node`, and `worker_node`. | Published server image, self-hosted Compose, and the narrow small-cluster contract. |
14388+
| `split_control_execution` | `ingress_node`, `control_plane_node`, `scheduler_node`, `matching_node`, and `execution_node`. | More explicit role isolation without introducing a second engine. |
14389+
14390+
Two details matter in practice:
14391+
14392+
- The published self-hosted server artifacts start in the
14393+
`standalone_server` shape, even though the manifest also advertises
14394+
`split_control_execution` as a supported product topology.
14395+
- `current_shape` and `current_roles` describe the node you queried right now,
14396+
not the whole fleet. A standalone server API node reports
14397+
`api_ingress`, `control_plane`, `matching`, and `history_projection`; the
14398+
scheduler and workers are separate process classes in the same shape.
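
As a minimal sketch of the per-node reading described above, the following compares the roles a node reports with the roles this page lists for a standalone-server API node. It assumes `current_roles` is a JSON array of role-name strings; the `role_gaps` helper and the sample payload are hypothetical illustrations, not part of the server or any SDK.

```python
# Roles this page says a standalone-server API node reports.
STANDALONE_API_NODE_ROLES = {
    "api_ingress",
    "control_plane",
    "matching",
    "history_projection",
}


def role_gaps(topology, expected_roles):
    """Return (missing, unexpected) role sets for the responding node only."""
    # Assumption: `current_roles` is a list of role-name strings.
    reported = set(topology.get("current_roles", []))
    return expected_roles - reported, reported - expected_roles


# Hypothetical manifest fragment using the field names from this page.
sample_topology = {
    "current_shape": "standalone_server",
    "current_roles": [
        "api_ingress",
        "control_plane",
        "matching",
        "history_projection",
    ],
}

missing, unexpected = role_gaps(sample_topology, STANDALONE_API_NODE_ROLES)
print(sorted(missing), sorted(unexpected))
```

Because the manifest describes only the node that answered, a fleet-wide view would need one such check per node rather than a single request through the load balancer.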
14399+
14400+
For rollout planning, keep the topology page paired with
14401+
[Self-Hosting Deployments](/docs/2.0/deployment) and
14402+
[Rolling Upgrades](/docs/2.0/rolling-upgrades). Those pages tell you which
14403+
shape is self-serve today; the manifest tells you which roles the node
14404+
actually owns.
14405+
14406+
## Authority Boundaries
14407+
14408+
`topology.authority_boundaries` tells you which durable write surfaces belong
14409+
to each role. Treat it as the first guardrail before splitting a deployment:
14410+
14411+
| Role | Durable write boundary |
14412+
| --- | --- |
14413+
| `control_plane` | `workflow_instances`, `workflow_runs.status`, `workflow_tasks.lifecycle` |
14414+
| `execution_plane` | `workflow_tasks.outcomes`, `activity_attempts`, `worker_compatibility_heartbeats` |
14415+
| `matching` | `workflow_tasks.leases`, `activity_tasks.leases` |
14416+
| `history_projection` | `history_events`, `workflow_run_summaries`, `workflow_history_exports` |
14417+
| `scheduler` | `workflow_schedules.fire_state`, `workflow_starts.scheduled` |
14418+
| `api_ingress` | `worker_registrations` |
14419+
14420+
If a process needs a durable write surface outside the roles it claims, that is
14421+
topology drift and should be treated as a contract problem before you scale or
14422+
split the fleet further.
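
The drift rule above can be sketched as a set comparison. The mapping below is copied from the table on this page; the real manifest's JSON layout for `authority_boundaries` may differ, and `find_topology_drift` and the `observed_writes` input are hypothetical names used only for illustration.

```python
# Durable write boundary per role, copied from the table on this page.
# Assumption: the manifest's actual JSON layout may differ from this mapping.
AUTHORITY_BOUNDARIES = {
    "control_plane": {"workflow_instances", "workflow_runs.status", "workflow_tasks.lifecycle"},
    "execution_plane": {"workflow_tasks.outcomes", "activity_attempts", "worker_compatibility_heartbeats"},
    "matching": {"workflow_tasks.leases", "activity_tasks.leases"},
    "history_projection": {"history_events", "workflow_run_summaries", "workflow_history_exports"},
    "scheduler": {"workflow_schedules.fire_state", "workflow_starts.scheduled"},
    "api_ingress": {"worker_registrations"},
}


def find_topology_drift(claimed_roles, observed_writes, boundaries=AUTHORITY_BOUNDARIES):
    """Return write surfaces a process touches outside the roles it claims."""
    allowed = set()
    for role in claimed_roles:
        allowed |= boundaries.get(role, set())
    return sorted(set(observed_writes) - allowed)


# Hypothetical process that claims two roles but also writes task leases.
drift = find_topology_drift(
    claimed_roles=["api_ingress", "control_plane"],
    observed_writes=["worker_registrations", "workflow_tasks.leases"],
)
print(drift)
```

An empty result means every observed write falls inside a claimed role; anything else is a candidate for the contract review described above.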
14423+
14424+
## Failure And Scaling Boundaries
14425+
14426+
The same manifest also publishes the first expected failure signal and the main
14427+
scaling driver for each role:
14428+
14429+
| Failure domain | What operators should expect |
14430+
| --- | --- |
14431+
| `control_plane_down` | Operator commands fail fast; already-claimed work continues only until lease expiry. |
14432+
| `execution_plane_down` | Ready tasks accumulate without loss; queue depth and schedule-to-start lag grow. |
14433+
| `matching_down` | Claim rate falls while ready depth rises; current implementations fall back to direct ready-task discovery. |
14434+
| `history_projection_down` | Durable writes continue, but projection reads may go stale and projection-lag health rises. |
14435+
| `scheduler_down` | Scheduled workflows stop firing and the missed-schedule state becomes visible. |
14436+
| `api_ingress_down` | External HTTP traffic stops at the edge even if embedded in-process calls still exist elsewhere. |
14437+
14438+
| Role | Main scaling driver |
14439+
| --- | --- |
14440+
| `api_ingress` | `incoming_http_request_rate` |
14441+
| `control_plane` | `operator_commands_and_run_lifecycle_transitions` |
14442+
| `matching` | `ready_task_rate_and_poller_count` |
14443+
| `history_projection` | `durable_event_rate` |
14444+
| `scheduler` | `active_schedule_count` |
14445+
| `execution_plane` | `workflow_and_activity_task_rate` |
14446+
14447+
This is the contract behind the public operator guidance. When the deployment
14448+
guide says the scheduler is singleton or when the rolling-upgrade guide says
14449+
workers and API nodes roll independently, it is relying on these boundaries.
14450+
14451+
## Migration Path
14452+
14453+
The manifest publishes one ordered `migration_path` so a deployment can evolve
14454+
without inventing a new engine:
14455+
14456+
1. `audit_role_boundaries` so tooling can detect cross-role writes before
14457+
runtime shape changes.
14458+
2. `expose_role_bindings` so hosts can swap adapters or run a role out of
14459+
process without patching the package.
14460+
3. `introduce_dedicated_matching_shape` so matching can move out of the worker
14461+
loop without changing the claim contract.
14462+
4. `split_history_projection` so history/projection work can move without
14463+
introducing a second writer.
14464+
5. `split_scheduler` so schedule firing can sit behind explicit ownership while
14465+
single-replica deployments stay legal.
14466+
6. `optional_execution_partitioning` so workers can partition by namespace,
14467+
connection, queue, and compatibility.
14468+
14469+
Read that sequence literally. `split_control_execution` is a topology that
14470+
keeps the same durable kernel and discovery surface; it is not a new control
14471+
plane, a second protocol, or a hosted-only feature fork.
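
One way tooling might honor that ordering is sketched below. Treating the path as a strict prefix (no skipped steps) is an assumption drawn from the word "ordered" on this page, not documented server behavior, and `next_migration_step` is a hypothetical helper.

```python
# Step names in the order this page lists them.
MIGRATION_PATH = [
    "audit_role_boundaries",
    "expose_role_bindings",
    "introduce_dedicated_matching_shape",
    "split_history_projection",
    "split_scheduler",
    "optional_execution_partitioning",
]


def next_migration_step(completed):
    """Return the next step name, or None when every step is done.

    Raises ValueError if `completed` is not a prefix of MIGRATION_PATH,
    which would mean a step was skipped or taken out of order.
    """
    done = list(completed)
    if done != MIGRATION_PATH[: len(done)]:
        raise ValueError(f"steps are out of order or unknown: {done}")
    if len(done) == len(MIGRATION_PATH):
        return None
    return MIGRATION_PATH[len(done)]


print(next_migration_step(["audit_role_boundaries"]))
```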
14472+
14473+
## Reading The Topology Manifest
14474+
14475+
Use `/api/cluster/info` to read the node's current role assignment:
14476+
14477+
```bash
14478+
curl -sS "$DURABLE_WORKFLOW_SERVER_URL/api/cluster/info" \
14479+
-H "Authorization: Bearer $DURABLE_WORKFLOW_AUTH_TOKEN" \
14480+
-H "X-Namespace: default" \
14481+
| jq '.topology | {schema, version, current_shape, current_roles, execution_mode, matching_role}'
14482+
```
14483+
14484+
Key fields to inspect:
14485+
14486+
- `schema` and `version` tell you which topology manifest schema you are
14487+
parsing. Treat `topology.version` as the manifest version, not as the server
14488+
build version.
14489+
- `current_shape` and `current_roles` tell you what the responding node owns
14490+
right now.
14491+
- `execution_mode` distinguishes `local_queue_worker` embedded execution from
14492+
`remote_worker_protocol` standalone-server execution.
14493+
- `matching_role` tells you whether the node still owns the in-worker wake path
14494+
or expects a dedicated repair/matching loop to do that work.
14495+
- `shape_assignments`, `authority_boundaries`, `failure_domains`,
14496+
`scaling_boundaries`, and `migration_path` are the fields operators should
14497+
read before changing topology, rollout posture, or process ownership.
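
For automation that prefers Python to `curl` and `jq`, the same selection can be sketched with the standard library. The endpoint, headers, and environment variable names come from the example above; `fetch_cluster_info`, `summarize_topology`, and the sample payload (including its `version` value) are hypothetical and only illustrate the shape of the call.

```python
import json
import os
import urllib.request

# Same keys the jq filter above selects from `.topology`.
SUMMARY_KEYS = (
    "schema",
    "version",
    "current_shape",
    "current_roles",
    "execution_mode",
    "matching_role",
)


def fetch_cluster_info():
    """GET /api/cluster/info with the headers used in the curl example."""
    request = urllib.request.Request(
        os.environ["DURABLE_WORKFLOW_SERVER_URL"] + "/api/cluster/info",
        headers={
            "Authorization": "Bearer " + os.environ["DURABLE_WORKFLOW_AUTH_TOKEN"],
            "X-Namespace": "default",
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def summarize_topology(cluster_info):
    """Pick the summary keys; absent keys become None, like jq's null."""
    topology = cluster_info.get("topology", {})
    return {key: topology.get(key) for key in SUMMARY_KEYS}


# Hypothetical response fragment so the sketch runs without a server.
sample = {
    "topology": {
        "schema": "durable-workflow.v2.role-topology",
        "version": 1,  # assumed value, for illustration only
        "current_shape": "standalone_server",
        "current_roles": ["api_ingress", "control_plane", "matching", "history_projection"],
        "execution_mode": "remote_worker_protocol",
    }
}

summary = summarize_topology(sample)
print(json.dumps(summary, indent=2))
```

Against a live server, `summarize_topology(fetch_cluster_info())` would replace the sample payload.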
14498+
14499+
## Related References
14500+
14501+
- [Server](/docs/2.0/polyglot/server) for the general standalone-server guide
14502+
and the inline cluster-info example.
14503+
- [Server API Reference](/docs/2.0/polyglot/server-api-reference) for the
14504+
discovery endpoint, route matrix, and required headers.
14505+
- [Self-Hosting Deployments](/docs/2.0/deployment) for the supported
14506+
self-serve shapes and their operational boundaries.
14507+
- [Rolling Upgrades](/docs/2.0/rolling-upgrades) for mixed-version rollout
14508+
behavior across API nodes, workers, and the scheduler.
14509+
- [Task Matching and Dispatch](/docs/2.0/polyglot/task-matching-dispatch) for
14510+
the matching-role contract that `topology.matching_role` points at.
14511+
- [Worker Compatibility Routing](/docs/2.0/polyglot/worker-compatibility-routing)
14512+
for build-id and compatibility-marker routing semantics across worker fleets.
14513+
1430514514
<!-- Source: docs/polyglot/cli-python-parity.md -->
1430614515

1430714516
# CLI and Python Parity
