| `dw task-queue:build-ids <queue>` | Inspect per-build-id cohort state and rollout status for one queue. | `--json` |
+| `dw task-queue:drain <queue>` | Mark a build-id cohort as draining so it stops claiming new tasks. | `--build-id <value>`, `--unversioned`, `--json` |
+| `dw task-queue:resume <queue>` | Clear a previous drain so the cohort can claim new tasks again. | `--build-id <value>`, `--unversioned`, `--json` |

 The task queue commands are the preferred operator view for throttling,
 capacity, and no-worker diagnoses. See
 [Task Queue Admission](/docs/2.0/polyglot/task-queue-admission) for the
-server-side policy behind those fields.
+server-side policy behind those fields and
+[Worker Build-Id Rollout](/docs/2.0/polyglot/worker-build-id-rollout) for the
+full unversioned-to-versioned cutover, canary, drain, and rollback lifecycle.
+
+`dw task-queue:drain` and `dw task-queue:resume` both require either
+`--build-id <value>` to target a specific build cohort or `--unversioned` to
+target the cohort of workers registered without a `build_id`. Combining the
+two fails fast with an invalid-option error. Both commands are idempotent:
+repeated drains do not shift the recorded `drained_at` timestamp, and
+resuming an already-active cohort is a no-op.
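
The option rule described above (exactly one of `--build-id` or `--unversioned`, never both) can be sketched with Python's `argparse`. This is a hypothetical stand-in written only to illustrate the rule; it is not the `dw` CLI's real parser, and the helper names `build_parser` and `parse` are assumptions.

```python
import argparse
from typing import List, Optional


def build_parser(command: str) -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented drain/resume options."""
    parser = argparse.ArgumentParser(prog=f"dw task-queue:{command}")
    parser.add_argument("queue")
    # Exactly one cohort selector is required; supplying both is rejected.
    target = parser.add_mutually_exclusive_group(required=True)
    target.add_argument("--build-id", dest="build_id")
    target.add_argument("--unversioned", action="store_true")
    parser.add_argument("--json", action="store_true")
    return parser


def parse(command: str, argv: List[str]) -> Optional[argparse.Namespace]:
    """Return the parsed options, or None when the combination is invalid."""
    try:
        return build_parser(command).parse_args(argv)
    except SystemExit:
        # argparse reports the usage error on stderr and exits; treat as invalid.
        return None
```

With this sketch, `parse("drain", ["orders", "--build-id", "v2"])` yields a namespace, while omitting both selectors or passing both returns `None`. The queue name `orders` and build id `v2` are made-up values.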

 ## Worker Protocol Commands

@@ -12765,6 +12785,190 @@ An admission payload has three sections:
Use this reference when you cut over from unversioned workers to build-tagged
+workers, canary a new build onto a task queue, drain an older build before
+decommissioning it, or roll a bad build back. The server records operator
+intent alongside the live worker rows so the next poll, CLI describe, or
+`list_task_queue_build_ids` call reflects the rollout state honestly even if
+the old workers disappear before their backlog drains.
+
+The Durable Workflow server expresses a rollout on one task queue as a set of
+**build-id cohorts**. A cohort groups every worker registration that reported
+the same `build_id` when it called `POST /api/worker/register`. Workers that
+omit `build_id` form the **unversioned cohort**, which is the pre-rollout
+default and the one you migrate away from on the first cutover.
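
As a rough illustration of the grouping described above, the sketch below buckets registrations by their reported `build_id`, with `None` standing in for the unversioned cohort. The registration dictionaries and the `worker_id` field are assumptions made for the example, not the server's actual schema.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional


def group_into_cohorts(registrations: Iterable[dict]) -> Dict[Optional[str], List[str]]:
    """Group hypothetical registration records by build_id.

    Registrations that omit build_id land under the None key, which models
    the unversioned cohort.
    """
    cohorts: Dict[Optional[str], List[str]] = defaultdict(list)
    for registration in registrations:
        cohorts[registration.get("build_id")].append(registration["worker_id"])
    return dict(cohorts)
```

Two workers that both report `build_id: "v2"` end up in one cohort, and any worker that sends no `build_id` joins the `None` bucket.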
+
+## Rollout State The Server Records
+
+Each `(namespace, task_queue, build_id)` cohort carries the aggregated worker
+state (active, draining, stale, total counts) plus operator intent:
+
+| Field | Purpose |
+| --- | --- |
+| `build_id` | The registered build identity. `null` identifies the unversioned cohort. |
+| `rollout_status` | Aggregate view of what the cohort will do with new tasks: `active`, `active_with_draining`, `draining`, `stale_only`, or `no_workers`. |
+| `drain_intent` | Operator intent for the cohort: `active` or `draining`. |
+| `drained_at` | When the cohort was first marked draining. Absent while the cohort is active. Repeated drain calls do not shift this timestamp. |
+| `active_worker_count` | Live workers currently accepting new tasks. |
+| `draining_worker_count` | Live workers that still hold in-flight tasks but no longer claim new work. |
+| `stale_worker_count` | Workers whose last heartbeat is older than the stale cutoff. |
+| `total_worker_count` | Sum of the three cohort populations. |
+| `runtimes`, `sdk_versions` | Distinct runtime and SDK version strings observed across the cohort. |
+| `last_heartbeat_at`, `first_seen_at` | Cohort-wide heartbeat window, useful for confirming quiet cohorts before deleting them. |
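
A client that consumes these fields might want to sanity-check a cohort record before acting on it. The following is a minimal sketch that checks only the invariants stated in the table: the allowed enum values, `total_worker_count` as the sum of the three populations, and `drained_at` being absent while the intent is `active`. The `CohortSummary` type and `find_problems` helper are hypothetical and cover only a subset of the fields.

```python
from dataclasses import dataclass
from typing import List, Optional

ROLLOUT_STATUSES = {"active", "active_with_draining", "draining", "stale_only", "no_workers"}
DRAIN_INTENTS = {"active", "draining"}


@dataclass
class CohortSummary:
    """Hypothetical client-side view of one cohort record."""
    build_id: Optional[str]  # None models the unversioned cohort
    rollout_status: str
    drain_intent: str
    active_worker_count: int
    draining_worker_count: int
    stale_worker_count: int
    total_worker_count: int
    drained_at: Optional[str] = None


def find_problems(cohort: CohortSummary) -> List[str]:
    """Return a list of violated invariants; an empty list means none found."""
    problems = []
    if cohort.rollout_status not in ROLLOUT_STATUSES:
        problems.append("unknown rollout_status")
    if cohort.drain_intent not in DRAIN_INTENTS:
        problems.append("unknown drain_intent")
    expected_total = (
        cohort.active_worker_count
        + cohort.draining_worker_count
        + cohort.stale_worker_count
    )
    if cohort.total_worker_count != expected_total:
        problems.append("total_worker_count is not the sum of the three populations")
    if cohort.drain_intent == "active" and cohort.drained_at is not None:
        problems.append("drained_at present while drain_intent is active")
    return problems
```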
+
+`drain_intent` is persistent: restarting a cohort's workers, stopping every
+worker, or letting the cohort go stale does not silently flip it back to
+`active`. Only an explicit `POST .../build-ids/resume` clears `drain_intent`
+and `drained_at`. This keeps `rollout_status` honest even after a cohort has
+no live workers.
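
The persistence and idempotency rules can be modeled in a few lines. This is a toy in-memory sketch of the documented behavior, not the server's implementation; the `CohortIntent` class and its method names are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class CohortIntent:
    """Toy model of the operator intent recorded for one cohort."""
    drain_intent: str = "active"
    drained_at: Optional[datetime] = None

    def drain(self, now: datetime) -> None:
        # A repeated drain keeps the timestamp of the first drain.
        if self.drain_intent == "draining":
            return
        self.drain_intent = "draining"
        self.drained_at = now

    def resume(self) -> None:
        # Resuming an already-active cohort leaves it unchanged.
        self.drain_intent = "active"
        self.drained_at = None


cohort = CohortIntent()
first = datetime(2024, 1, 1, tzinfo=timezone.utc)
later = datetime(2024, 1, 2, tzinfo=timezone.utc)
cohort.drain(first)
cohort.drain(later)  # no effect: drained_at still equals `first`
```

Nothing in the model changes intent except `drain` and `resume`, which mirrors the rule that worker churn alone never clears a drain.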
+
+## Inspect The Rollout
+
+Before draining or deleting a build, confirm which cohorts are still