@@ -8442,7 +8442,9 @@ self-serve contract is intentionally narrow:
84428442 `DW_V2_MATCHING_ROLE_QUEUE_WAKE=0` to `shape: "dedicated"` and
84438443 `wake_owner: "dedicated_repair_pass"`. The same block should continue to
84448444 advertise `partition_primitives` of `connection`, `queue`, `compatibility`,
8445- and `namespace`, plus the `lease_ownership` `backpressure_model`.
8445+ and `namespace`, plus the `lease_ownership` `backpressure_model`. See
8446+ [Server Role Topology](/docs/2.0/polyglot/server-role-topology) for the
8447+ field-by-field meaning of the topology manifest.
84468448- Scale external SDK workers independently from API nodes. Workers can run on
84478449 separate hosts or processes, but they should talk to the load-balanced API
84488450 endpoint rather than to one sticky node.
@@ -8467,7 +8469,9 @@ product topology. Treat that as one contract with different role assignments,
84678469not as a second server product. If you pilot a more explicit role split later,
84688470keep reading `topology.current_shape`, `topology.current_roles`, and
84698471`topology.matching_role` from `/api/cluster/info` instead of inferring duties
8470- from hostnames or container names.
8472+ from hostnames or container names. The
8473+ [Server Role Topology](/docs/2.0/polyglot/server-role-topology) page explains
8474+ those role assignments and the migration path in one place.
84718475
84728476Every API node should use the same auth tokens or signature keys, app version,
84738477workflow package version, payload-codec configuration, database connection, and
@@ -10353,6 +10357,12 @@ unsupported.
1035310357
1035410358### Role topology and deployment shape
1035510359
10360+ The field-by-field reference for this manifest lives on
10361+ [Server Role Topology](/docs/2.0/polyglot/server-role-topology). Keep this
10362+ section for the inline `cluster/info` example and use the dedicated page when
10363+ you need the supported shapes, authority boundaries, failure domains, scaling
10364+ boundaries, or migration-path contract in one place.
10365+
1035610366`GET /api/cluster/info` also publishes a `topology` manifest. It is the
1035710367machine-readable role map for the node that answered the request, so operators
1035810368and automation can read one contract instead of inferring node duties from
@@ -11409,12 +11419,22 @@ curl -sS "$DURABLE_WORKFLOW_SERVER_URL/api/cluster/info" \
1140911419`/api/cluster/info` intentionally does not require the control-plane version
1141011420header because it is the endpoint that advertises the supported versions.
1141111421
11412- The same response also publishes the live topology and rollout-safety state for
11413- that node:
11422+ ### Cluster topology manifest
11423+
11424+ `/api/cluster/info` also returns the node's `topology` manifest under the
11425+ schema `durable-workflow.v2.role-topology`. That manifest is the supported way
11426+ to discover whether the node is currently acting as `standalone_server`,
11427+ `embedded`, or `split_control_execution`, which roles it owns, and what the
11428+ server expects from `matching_role`, `authority_boundaries`,
11429+ `failure_domains`, `scaling_boundaries`, and `migration_path`. The same
11430+ response also publishes live rollout-safety state for that node.
11431+
11432+ Read the manifest as follows:
1141411433
11415- - `topology.current_process_class`, `topology.current_roles`, and
11416- `topology.execution_mode` tell you which role shape the node is actually
11417- serving.
11434+ - `topology.current_process_class`, `topology.current_shape`,
11435+ `topology.current_roles`, and `topology.execution_mode` tell you which role
11436+ shape the node is actually serving. `current_shape` and `current_roles`
11437+ describe the responding node, not the full fleet.
1141811438- `topology.matching_role.shape`, `topology.matching_role.wake_owner`, and
1141911439 `topology.matching_role.task_dispatch_mode` tell you whether broad ready-task
1142011440 discovery is happening in-worker or through a dedicated matching-role sweep.
@@ -11424,15 +11444,21 @@ that node:
1142411444 reason about.
1142511445- `coordination_health` summarizes fleet-wide rollout and compatibility risk in
1142611446 one machine-readable block.
11447+ - `execution_mode` distinguishes `local_queue_worker` embedded execution from
11448+ `remote_worker_protocol` standalone-server execution.
11449+ - `split_control_execution` is a supported product topology, not a second
11450+ server product or a different API.
1142711451
1142811452Example:
1142911453
1143011454<!-- docs-example id="server.cluster-info.topology.curl" -->
1143111455```bash
1143211456curl -sS "$DURABLE_WORKFLOW_SERVER_URL/api/cluster/info" \
1143311457 -H "Authorization: Bearer $DURABLE_WORKFLOW_AUTH_TOKEN" \
11434- -H "X-Namespace: default" | jq '{
11458+ -H "X-Namespace: default" \
11459+ | jq '{
1143511460 current_process_class: .topology.current_process_class,
11461+ current_shape: .topology.current_shape,
1143611462 current_roles: .topology.current_roles,
1143711463 execution_mode: .topology.execution_mode,
1143811464 matching_role: .topology.matching_role,
@@ -11443,6 +11469,9 @@ curl -sS "$DURABLE_WORKFLOW_SERVER_URL/api/cluster/info" \
1144311469 }'
1144411470```
1144511471
11472+ For the conceptual contract behind those fields, including the role vocabulary
11473+ and migration path, see
11474+ [Server Role Topology](/docs/2.0/polyglot/server-role-topology).
1144611475## Workflow Control Plane
1144711476
1144811477Workflow routes are operator/control-plane routes. They require an operator or
@@ -14302,6 +14331,186 @@ client = Client(
1430214331
1430314332Set the `namespace` argument to whichever tenant namespace the shared server has provisioned for your team, and use the credentials issued for that namespace. The server operator manages namespace creation — see the [server setup guide](/docs/2.0/polyglot/server) for details.
1430414333
14334+ <!-- Source: docs/polyglot/server-role-topology.md -->
14335+
14336+ # Server Role Topology
14337+
14338+ `GET /api/cluster/info` publishes the machine-readable role map for the node
14339+ that answered the request. Use that `topology` manifest when you need to know
14340+ which responsibilities the node owns, which durable write surfaces belong to
14341+ each role, and how the same product contract scales from one process to a split
14342+ control-plane and execution-plane deployment.
14343+
14344+ The manifest uses the schema `durable-workflow.v2.role-topology`. It is a
14345+ product contract, not an internal implementation detail. Operators, SDKs, CLI
14346+ automation, and rollout tooling should read it instead of inferring duties from
14347+ container names, hostnames, or process labels.
14348+
14349+ ## Why This Manifest Exists
14350+
14351+ Durable Workflow keeps one durable engine across embedded mode, the standalone
14352+ server distribution, and more explicit split-role deployments. The topology
14353+ manifest makes that boundary explicit:
14354+
14355+ - It names the legal topology shapes as product terms.
14356+ - It tells you which roles the current node owns.
14357+ - It publishes the durable write boundary for each role.
14358+ - It exposes the first failure signal and main scaling driver per role.
14359+ - It preserves one migration path instead of implying a second engine.
14360+
14361+ This is how you distinguish a supported topology change from a product fork.
14362+
14363+ ## Role Vocabulary
14364+
14365+ The manifest vocabulary is fixed for v2:
14366+
14367+ | Role | What it owns |
14368+ | --- | --- |
14369+ | `api_ingress` | HTTP termination, request authentication, namespace resolution, and handing requests to the right plane. |
14370+ | `control_plane` | Durable workflow commands such as start, signal, update, cancel, terminate, reset, repair, and archive. |
14371+ | `matching` | Ready-task discovery, claim arbitration, dispatch publication, and wake ownership. |
14372+ | `history_projection` | Durable history recording, run summaries, and operator-facing projection surfaces. |
14373+ | `scheduler` | Schedule evaluation and turning schedule fire state into workflow starts. |
14374+ | `execution_plane` | Workflow-task replay, activity execution, task heartbeats, and task completion/failure outcomes. |
14375+
14376+ The role list is intentionally logical rather than process-shaped. One process
14377+ may host multiple roles, and one role may later move to its own process class
14378+ without changing the contract.
14379+
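Because the vocabulary is fixed for v2, tooling can reject a manifest that claims a role outside it. A minimal sketch, assuming the `topology` manifest has already been parsed into a Python dict; the helper name and sample payload are illustrative, not a real server response:

```python
# Fixed v2 role vocabulary from the table above.
V2_ROLES = {
    "api_ingress",
    "control_plane",
    "matching",
    "history_projection",
    "scheduler",
    "execution_plane",
}

def unknown_roles(manifest: dict) -> set[str]:
    """Return any claimed roles that fall outside the fixed v2 vocabulary."""
    return set(manifest.get("current_roles", [])) - V2_ROLES

# Illustrative manifest fragment, not a real /api/cluster/info response.
sample = {"current_roles": ["api_ingress", "control_plane", "matching"]}
print(unknown_roles(sample))  # -> set(): every claimed role is in the vocabulary
```

A non-empty result means the node is claiming a role the v2 contract does not define, which is worth failing loudly on before any topology automation runs.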
14380+ ## Supported Deployment Shapes
14381+
14382+ The server publishes the supported shapes in `topology.supported_shapes`:
14383+
14384+ | Shape | Process classes | Typical use |
14385+ | --- | --- | --- |
14386+ | `embedded` | One `application_process` owns `control_plane`, `matching`, `history_projection`, `scheduler`, and `execution_plane`. | Laravel-app embedding with local queue workers. |
14387+ | `standalone_server` | `server_http_node`, `scheduler_node`, and `worker_node`. | Published server image, self-hosted Compose, and the narrow small-cluster contract. |
14388+ | `split_control_execution` | `ingress_node`, `control_plane_node`, `scheduler_node`, `matching_node`, and `execution_node`. | More explicit role isolation without introducing a second engine. |
14389+
14390+ Two details matter in practice:
14391+
14392+ - The published self-hosted server artifacts start in the
14393+ `standalone_server` shape, even though the manifest also advertises
14394+ `split_control_execution` as a supported product topology.
14395+ - `current_shape` and `current_roles` describe the node you queried right now,
14396+ not the whole fleet. A standalone server API node reports
14397+ `api_ingress`, `control_plane`, `matching`, and `history_projection`; the
14398+ scheduler and workers are separate process classes in the same shape.
14399+
14400+ For rollout planning, keep this page paired with
14401+ [Self-Hosting Deployments](/docs/2.0/deployment) and
14402+ [Rolling Upgrades](/docs/2.0/rolling-upgrades). Those pages tell you which
14403+ shape is self-serve today; the manifest tells you which roles the node
14404+ actually owns.
14405+
14406+ ## Authority Boundaries
14407+
14408+ `topology.authority_boundaries` tells you which durable write surfaces belong
14409+ to each role. Treat it as the first guardrail before splitting a deployment:
14410+
14411+ | Role | Durable write boundary |
14412+ | --- | --- |
14413+ | `control_plane` | `workflow_instances`, `workflow_runs.status`, `workflow_tasks.lifecycle` |
14414+ | `execution_plane` | `workflow_tasks.outcomes`, `activity_attempts`, `worker_compatibility_heartbeats` |
14415+ | `matching` | `workflow_tasks.leases`, `activity_tasks.leases` |
14416+ | `history_projection` | `history_events`, `workflow_run_summaries`, `workflow_history_exports` |
14417+ | `scheduler` | `workflow_schedules.fire_state`, `workflow_starts.scheduled` |
14418+ | `api_ingress` | `worker_registrations` |
14419+
14420+ If a process needs a durable write surface outside the roles it claims, that is
14421+ topology drift and should be treated as a contract problem before you scale or
14422+ split the fleet further.
14423+
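The drift rule above can be expressed as a check that compares the durable write surfaces a process touches against the boundaries of the roles it claims. A hedged sketch, carrying the table above as data; the function name and the observed-writes input are illustrative, not a server API:

```python
# Authority boundaries from the table above (role -> durable write surfaces).
AUTHORITY_BOUNDARIES = {
    "control_plane": {"workflow_instances", "workflow_runs.status", "workflow_tasks.lifecycle"},
    "execution_plane": {"workflow_tasks.outcomes", "activity_attempts", "worker_compatibility_heartbeats"},
    "matching": {"workflow_tasks.leases", "activity_tasks.leases"},
    "history_projection": {"history_events", "workflow_run_summaries", "workflow_history_exports"},
    "scheduler": {"workflow_schedules.fire_state", "workflow_starts.scheduled"},
    "api_ingress": {"worker_registrations"},
}

def topology_drift(claimed_roles: list[str], observed_writes: set[str]) -> set[str]:
    """Return observed durable writes not covered by any claimed role."""
    allowed: set[str] = set().union(
        *(AUTHORITY_BOUNDARIES.get(role, set()) for role in claimed_roles)
    )
    return observed_writes - allowed

# A matching-only node writing history events is topology drift:
print(topology_drift(["matching"], {"workflow_tasks.leases", "history_events"}))
# -> {'history_events'}
```

Anything the check returns is a contract problem to resolve before scaling or splitting the fleet further.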
14424+ ## Failure And Scaling Boundaries
14425+
14426+ The same manifest also publishes the first expected failure signal and the main
14427+ scaling driver for each role:
14428+
14429+ | Role or failure | What operators should expect |
14430+ | --- | --- |
14431+ | `control_plane_down` | Operator commands fail fast; already-claimed work continues only until lease expiry. |
14432+ | `execution_plane_down` | Ready tasks accumulate without loss; queue depth and schedule-to-start lag grow. |
14433+ | `matching_down` | Claim rate falls while ready depth rises; current implementations fall back to direct ready-task discovery. |
14434+ | `history_projection_down` | Durable writes continue, but projection reads may go stale and projection-lag health rises. |
14435+ | `scheduler_down` | Scheduled workflows stop firing and the missed-schedule state becomes visible. |
14436+ | `api_ingress_down` | External HTTP traffic stops at the edge even if embedded in-process calls still exist elsewhere. |
14437+
14438+ | Role | Main scaling driver |
14439+ | --- | --- |
14440+ | `api_ingress` | `incoming_http_request_rate` |
14441+ | `control_plane` | `operator_commands_and_run_lifecycle_transitions` |
14442+ | `matching` | `ready_task_rate_and_poller_count` |
14443+ | `history_projection` | `durable_event_rate` |
14444+ | `scheduler` | `active_schedule_count` |
14445+ | `execution_plane` | `workflow_and_activity_task_rate` |
14446+
14447+ This is the contract behind the public operator guidance. When the deployment
14448+ guide says the scheduler is a singleton or when the rolling-upgrade guide says
14449+ workers and API nodes roll independently, it is relying on these boundaries.
14450+
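When wiring these boundaries into autoscaling or alerting, the scaling table can be carried as data rather than re-derived from hostnames. A minimal sketch that uses the driver identifiers from the table verbatim; the lookup helper itself is illustrative, not a server API:

```python
# Main scaling drivers from the table above (role -> driver identifier).
SCALING_DRIVERS = {
    "api_ingress": "incoming_http_request_rate",
    "control_plane": "operator_commands_and_run_lifecycle_transitions",
    "matching": "ready_task_rate_and_poller_count",
    "history_projection": "durable_event_rate",
    "scheduler": "active_schedule_count",
    "execution_plane": "workflow_and_activity_task_rate",
}

def scaling_driver(role: str) -> str:
    """Return the metric that should drive horizontal scaling for a role."""
    return SCALING_DRIVERS[role]

print(scaling_driver("matching"))  # -> ready_task_rate_and_poller_count
```
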
14451+ ## Migration Path
14452+
14453+ The manifest publishes one ordered `migration_path` so a deployment can evolve
14454+ without inventing a new engine:
14455+
14456+ 1. `audit_role_boundaries` so tooling can detect cross-role writes before
14457+ runtime shape changes.
14458+ 2. `expose_role_bindings` so hosts can swap adapters or run a role out of
14459+ process without patching the package.
14460+ 3. `introduce_dedicated_matching_shape` so matching can move out of the worker
14461+ loop without changing the claim contract.
14462+ 4. `split_history_projection` so history/projection work can move without
14463+ introducing a second writer.
14464+ 5. `split_scheduler` so schedule firing can sit behind explicit ownership while
14465+ single-replica deployments stay legal.
14466+ 6. `optional_execution_partitioning` so workers can partition by namespace,
14467+ connection, queue, and compatibility.
14468+
14469+ Read that sequence literally. `split_control_execution` is a topology that
14470+ keeps the same durable kernel and discovery surface; it is not a new control
14471+ plane, a second protocol, or a hosted-only feature fork.
14472+
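Tooling that validates a node's reported `migration_path` can treat the six steps as a canonical order. A sketch under that assumption; the helper name is illustrative, while the step identifiers come from the list above:

```python
# Canonical ordered migration path from the manifest contract.
CANONICAL_MIGRATION_PATH = [
    "audit_role_boundaries",
    "expose_role_bindings",
    "introduce_dedicated_matching_shape",
    "split_history_projection",
    "split_scheduler",
    "optional_execution_partitioning",
]

def follows_canonical_order(reported: list[str]) -> bool:
    """True when reported steps form an in-order subsequence of the path."""
    remaining = iter(CANONICAL_MIGRATION_PATH)
    # `step in remaining` consumes the iterator, so each reported step must
    # appear after the position where the previous step was found.
    return all(step in remaining for step in reported)

print(follows_canonical_order(["audit_role_boundaries", "split_scheduler"]))  # -> True
print(follows_canonical_order(["split_scheduler", "audit_role_boundaries"]))  # -> False
```
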
14473+ ## Reading The Topology Manifest
14474+
14475+ Use `/api/cluster/info` to read the node's current role assignment:
14476+
14477+ ```bash
14478+ curl -sS "$DURABLE_WORKFLOW_SERVER_URL/api/cluster/info" \
14479+ -H "Authorization: Bearer $DURABLE_WORKFLOW_AUTH_TOKEN" \
14480+ -H "X-Namespace: default" \
14481+ | jq '.topology | {schema, version, current_shape, current_roles, execution_mode, matching_role}'
14482+ ```
14483+
14484+ Key fields to inspect:
14485+
14486+ - `schema` and `version` tell you which topology manifest schema you are
14487+ parsing. Treat `topology.version` as the manifest version, not as the server
14488+ build version.
14489+ - `current_shape` and `current_roles` tell you what the responding node owns
14490+ right now.
14491+ - `execution_mode` distinguishes `local_queue_worker` embedded execution from
14492+ `remote_worker_protocol` standalone-server execution.
14493+ - `matching_role` tells you whether the node still owns the in-worker wake path
14494+ or expects a dedicated repair/matching loop to do that work.
14495+ - `shape_assignments`, `authority_boundaries`, `failure_domains`,
14496+ `scaling_boundaries`, and `migration_path` are the fields operators should
14497+ read before changing topology, rollout posture, or process ownership.
14498+
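Putting those fields together, automation can gate on the manifest before acting. A hedged sketch, assuming a parsed `/api/cluster/info` response body; the sample payload below is illustrative, not a real server response:

```python
import json

# Illustrative response body; a real deployment would read this from
# GET /api/cluster/info rather than a literal string.
raw = """{"topology": {
    "schema": "durable-workflow.v2.role-topology",
    "current_shape": "standalone_server",
    "current_roles": ["api_ingress", "control_plane", "matching", "history_projection"],
    "execution_mode": "remote_worker_protocol"
}}"""

topology = json.loads(raw)["topology"]

# Refuse to interpret a manifest schema this tooling does not understand.
assert topology["schema"] == "durable-workflow.v2.role-topology"

# These fields describe the responding node, not the whole fleet.
owns_matching = "matching" in topology["current_roles"]
print(topology["current_shape"], owns_matching)  # -> standalone_server True
```

Gating on `schema` first means a future manifest revision fails fast instead of being silently misread.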
14499+ ## Related References
14500+
14501+ - [Server](/docs/2.0/polyglot/server) for the general standalone-server guide
14502+ and the inline cluster-info example.
14503+ - [Server API Reference](/docs/2.0/polyglot/server-api-reference) for the
14504+ discovery endpoint, route matrix, and required headers.
14505+ - [Self-Hosting Deployments](/docs/2.0/deployment) for the supported
14506+ self-serve shapes and their operational boundaries.
14507+ - [Rolling Upgrades](/docs/2.0/rolling-upgrades) for mixed-version rollout
14508+ behavior across API nodes, workers, and the scheduler.
14509+ - [Task Matching and Dispatch](/docs/2.0/polyglot/task-matching-dispatch) for
14510+ the matching-role contract that `topology.matching_role` points at.
14511+ - [Worker Compatibility Routing](/docs/2.0/polyglot/worker-compatibility-routing)
14512+ for build-id and compatibility-marker routing semantics across worker fleets.
14513+
1430514514<!-- Source: docs/polyglot/cli-python-parity.md -->
1430614515
1430714516# CLI and Python Parity