docs/tutorials/task_k8s.md
+79 -28 (79 additions & 28 deletions)
@@ -7,22 +7,36 @@ This offloads heavy computation to dedicated pods and keeps the consumer event l
7
7
8
8
The switch is automatic. When `KUBERNETES_SERVICE_HOST` is set (which Kubernetes injects into every pod) and the `k8s` extra is installed, any task declared with `cpu_bound=True` will spawn a Kubernetes Job instead of a subprocess. No code change is required in the task itself.
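
The dispatch rule described above can be sketched as a small predicate. This is only an illustration, not the library's code: `should_use_k8s_job` and its parameters are hypothetical names, and whether the `k8s` extra is installed is passed in as a flag rather than detected.

```python
import os
from typing import Mapping, Optional


def should_use_k8s_job(
    cpu_bound: bool,
    k8s_extra_installed: bool,
    environ: Optional[Mapping[str, str]] = None,
) -> bool:
    """Return True when a task would run as a Kubernetes Job (hypothetical helper)."""
    env = os.environ if environ is None else environ
    # Kubernetes injects KUBERNETES_SERVICE_HOST into every pod it starts.
    in_cluster = bool(env.get("KUBERNETES_SERVICE_HOST"))
    # All three conditions must hold; otherwise a local subprocess is used.
    return cpu_bound and k8s_extra_installed and in_cluster
```

Outside a cluster the variable is absent, so the same task definition falls back to a subprocess without any change.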
9
9
10
-
The Job reuses the **targeted container's spec** from the task consumer deployment (image, resource limits, volume mounts, security context, image pull policy, and everything else). Only that container is included in the Job pod — sidecars and other containers from the deployment are dropped. The **pod-level init containers** and **volumes** are preserved as-is, so any setup performed by init containers (e.g. installing TLS certificates) is reproduced in the Job pod. Only these fields on the main container are overridden:
10
+
The Job pod template is derived from the **task consumer deployment**. The implementation reads the deployment, locates the target container, and builds a Job spec from it. This means the Job inherits most of the container's configuration from the deployment — image, image pull policy, volume mounts, security context, and everything else — while only overriding the fields necessary to run the task.
11
11
12
-
| Field | Value |
12
+
13
+
**Inherited from the deployment container (unchanged):**
14
+
15
+
- Container image and image pull policy
16
+
- Volume mounts (and pod-level volumes)
17
+
- Environment variables (the task's env vars are appended, never replaced)
18
+
- Security context
19
+
- Everything else not listed below
20
+
21
+
**Overridden or cleared:**
22
+
23
+
| Field | Value in the Job |
13
24
|---|---|
14
-
|`command`|same as the deployment container, with any trailing `serve` token removed |
25
+
|`command`|Same as the deployment, but any trailing `serve` token is removed |
|`env`| same as the deployment container, with `TASK_MANAGER_SPAWN=true` appended |
17
-
|`liveness_probe`| cleared — probes are not meaningful for Job pods and could prematurely kill a long-running task |
18
-
|`readiness_probe`| cleared |
27
+
|`env`| Inherited from the deployment, then `TASK_MANAGER_SPAWN=true` appended, then any task-level `env` vars appended (see [Injecting environment variables](#injecting-environment-variables)) |
28
+
|`resources`| Inherited from the deployment unless overridden via [`K8sConfig.resources`](#configuration)|
29
+
|`liveness_probe`| Cleared — probes are not meaningful for Job pods and would prematurely kill long-running tasks |
30
+
|`readiness_probe`| Cleared |
19
31
20
-
The `TASK_MANAGER_SPAWN=true` environment variable signals to the process running inside the Job that it is executing as a CPU-bound worker rather than a long-lived consumer.
32
+
**Other containers** (sidecars) from the deployment are dropped — only the target container runs in the Job pod. **Pod-level init containers** and **volumes** are preserved, so any setup performed at pod startup (e.g. installing TLS certificates) is reproduced in the Job pod.
33
+
34
+
`TASK_MANAGER_SPAWN=true` signals to the process inside the Job that it is a CPU-bound worker rather than a long-lived consumer.
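
To make the override rules concrete, here is a minimal sketch that applies them to a container spec held as a plain dictionary. It assumes snake_case keys matching the table above; `build_job_container` is a hypothetical name and the real implementation may work on Kubernetes client objects instead.

```python
from copy import deepcopy
from typing import Any, Dict, Optional


def build_job_container(
    deployment_container: Dict[str, Any],
    task_env: Optional[Dict[str, str]] = None,
    resources: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Derive the Job container from the deployment container (illustrative only)."""
    # Work on a copy so the deployment spec itself is never mutated.
    container = deepcopy(deployment_container)

    # command: drop a trailing `serve` token, keep everything else.
    command = list(container.get("command") or [])
    if command and command[-1] == "serve":
        command.pop()
    container["command"] = command

    # env: deployment vars first, then TASK_MANAGER_SPAWN, then task-level vars.
    env = list(container.get("env") or [])
    env.append({"name": "TASK_MANAGER_SPAWN", "value": "true"})
    for name, value in (task_env or {}).items():
        env.append({"name": name, "value": value})
    container["env"] = env

    # resources: inherited unless an explicit override is given.
    if resources is not None:
        container["resources"] = resources

    # Probes are cleared so they cannot kill a long-running task.
    container["liveness_probe"] = None
    container["readiness_probe"] = None
    return container
```

Everything not touched here (image, volume mounts, security context) passes through unchanged, which matches the "inherited" list above.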
21
35
22
36
The Job is created in the same namespace as the consumer with:
23
37
24
-
- `backoff_limit: 0`: a failed pod is never retried; the error is propagated back to the task consumer instead
25
-
- `ttlSecondsAfterFinished`: set from `K8sConfig.job_ttl`, the Job and its pods are cleaned up automatically after completion (default 300 s)
38
+
- `backoff_limit: 0` — a failed pod is never retried; the error is propagated back to the task consumer instead
39
+
- `ttlSecondsAfterFinished` — set from [`K8sConfig.job_ttl`](#configuration), the Job and its pods are cleaned up automatically after completion (default 300 s)
26
40
- `restartPolicy: Never` on the pod template
27
41
28
42
The job name is derived from the task name and the first 7 characters of the run ID, slugified and capped at 63 characters to comply with Kubernetes DNS label requirements:
@@ -31,7 +45,7 @@ The job name is derived from the task name and the first 7 characters of the run
31
45
task-<slugified-task-name>-<short-run-id>
32
46
```
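
The naming rule can be sketched as follows. The slug function and the truncation strategy are assumptions made for illustration; the library may slugify or truncate differently.

```python
import re

MAX_LABEL_LENGTH = 63  # upper bound for a Kubernetes DNS label


def slugify(value: str) -> str:
    """Lowercase and collapse anything outside [a-z0-9] into single hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", value.lower()).strip("-")


def job_name(task_name: str, run_id: str) -> str:
    """Build task-<slugified-task-name>-<short-run-id>, capped at 63 characters."""
    name = f"task-{slugify(task_name)}-{slugify(run_id[:7])}"
    # Plain truncation; a trailing hyphen left by the cut would be invalid.
    return name[:MAX_LABEL_LENGTH].rstrip("-")
```

Note that plain truncation can cut off the run ID for very long task names, so a production version may prefer to shorten the task-name part instead.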
33
47
34
-
Once the Job is created, the consumer polls its status every `sleep` seconds until it either succeeds or fails.
48
+
Once the Job is created, the consumer polls its status every [`K8sConfig.sleep`](#configuration) seconds until it either succeeds or fails.
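
A polling loop of this kind might look like the sketch below. `read_status` stands in for whatever call fetches the Job status, and the status strings are placeholders rather than values taken from the library or the Kubernetes API.

```python
import asyncio
from typing import Awaitable, Callable


async def wait_for_job(
    read_status: Callable[[], Awaitable[str]],
    sleep: float = 2.0,
) -> str:
    """Poll until the Job reports a terminal state, pausing `sleep` seconds between polls."""
    while True:
        status = await read_status()
        if status in ("succeeded", "failed"):
            return status
        # Yield to the event loop so the consumer stays responsive while waiting.
        await asyncio.sleep(sleep)
```

A smaller `sleep` detects completion sooner at the cost of more API requests.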
35
49
36
50
## Installation
37
51
@@ -48,27 +62,27 @@ from fluid.scheduler import task, TaskRun
48
62
49
63
@task(cpu_bound=True)
50
64
async def heavy_calculation(ctx: TaskRun) -> None:
51
-
    # heavy CPU work here, runs in a k8s Job when inside a cluster
65
+
    # heavy CPU work here — runs in a k8s Job when inside a cluster,
66
+
    # or in a local subprocess when running outside one
52
67
    ...
53
68
```
54
69
55
70
## Configuration
56
71
57
-
K8s behaviour can be tuned per-task via the `k8s_config` argument which
58
-
accepts a [K8sConfig][fluid.scheduler.K8sConfig] object:
72
+
K8s behaviour can be tuned per-task via the `k8s_config` argument, which accepts a [K8sConfig][fluid.scheduler.K8sConfig] object:
59
73
60
74
```python
61
75
from fluid.scheduler import task, TaskRun, K8sConfig
62
76
63
77
@task(
64
78
    cpu_bound=True,
65
79
    k8s_config=K8sConfig(
66
-
        namespace="workers",  # namespace where the Job is created
67
-
        deployment="fluid-task",  # deployment to copy the container spec from
68
-
        container="main",  # container name inside the deployment
69
-
        job_ttl=600,  # seconds to keep the Job after completion
70
-
        sleep=2.0,  # polling interval while waiting for the Job
71
-
        resources={  # optional Kubernetes resource limits and requests for the container
80
+
        namespace="workers",  # namespace where the Job is created
81
+
        deployment="fluid-task",  # deployment to copy the container spec from
82
+
        container="main",  # container name inside the deployment
83
+
        job_ttl=600,  # seconds to keep the Job after completion (default 300)
84
+
        sleep=2.0,  # polling interval while waiting for the Job (default 2.0)
85
+
        resources={  # override the container's resource spec (default: inherited from deployment)
If `k8s_config` is omitted, or any of the optional fields are not provided, the following environment variables are used:
95
+
### K8sConfig fields
96
+
97
+
| Field | Type | Default | Description |
98
+
|---|---|---|---|
99
+
|`namespace`|`str`|`FLUID_TASK_CONSUMER_K8S_NAMESPACE` or `"default"`| Kubernetes namespace where the Job is created |
100
+
|`deployment`|`str`|`FLUID_TASK_CONSUMER_K8S_DEPLOYMENT` or `"fluid-task"`| Deployment to read the container spec from |
101
+
|`container`|`str`|`FLUID_TASK_CONSUMER_K8S_CONTAINER` or `"main"`| Container name within the deployment |
102
+
|`resources`|`K8sResourceRequirements \| None`|`None`| Resource limits/requests for the Job container. If `None`, the deployment's existing resource spec is used unchanged |
103
+
|`job_ttl`|`int`|`FLUID_TASK_CONSUMER_K8S_JOB_TTL` or `300`| Seconds to retain the Job after completion before automatic cleanup |
104
+
|`sleep`|`float`|`FLUID_TASK_CONSUMER_K8S_SLEEP` or `2.0`| Polling interval in seconds while waiting for the Job to finish |
105
+
106
+
All `K8sConfig` fields have defaults drawn from environment variables, so a minimal deployment only needs to set those variables rather than hard-coding values per task.
107
+
108
+
If `k8s_config` is omitted entirely, a [K8sConfig][fluid.scheduler.K8sConfig] instance with all defaults is used.
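
One way to get environment-backed defaults like those in the table is a dataclass whose default factories read the variables at construction time. The class below is a hypothetical stand-in named `K8sConfigSketch`; the real `K8sConfig` may be implemented differently.

```python
import os
from dataclasses import dataclass, field
from typing import Any, Dict, Optional

ENV_PREFIX = "FLUID_TASK_CONSUMER_K8S_"


def _env(suffix: str, default: str) -> str:
    """Read FLUID_TASK_CONSUMER_K8S_<suffix>, falling back to the documented default."""
    return os.environ.get(ENV_PREFIX + suffix, default)


@dataclass
class K8sConfigSketch:
    # default_factory defers the lookup until an instance is created.
    namespace: str = field(default_factory=lambda: _env("NAMESPACE", "default"))
    deployment: str = field(default_factory=lambda: _env("DEPLOYMENT", "fluid-task"))
    container: str = field(default_factory=lambda: _env("CONTAINER", "main"))
    resources: Optional[Dict[str, Any]] = None  # None keeps the deployment's spec
    job_ttl: int = field(default_factory=lambda: int(_env("JOB_TTL", "300")))
    sleep: float = field(default_factory=lambda: float(_env("SLEEP", "2.0")))
```

With this shape, an explicit argument always wins, an environment variable comes next, and the hard-coded default applies last.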
109
+
110
+
### Resource overrides
111
+
112
+
The `resources` field accepts a [K8sResourceRequirements][fluid.scheduler.K8sResourceRequirements] dict with optional `limits` and `requests` keys:
113
+
114
+
```python
115
+
resources={
116
+
    "limits": {"cpu": "4", "memory": "8Gi"},
117
+
    "requests": {"cpu": "500m", "memory": "1Gi"},
118
+
}
119
+
```
120
+
121
+
When not provided (the default), the Job container inherits the resource spec from the deployment container unchanged. This is useful for tasks that need more CPU or memory than the consumer pod is allocated.
122
+
123
+
## Injecting environment variables
124
+
125
+
Extra environment variables can be injected into the Job (or subprocess, when running outside a cluster) using the `env` argument on the [`@task`][fluid.scheduler.task] decorator:
|`FLUID_TASK_CONSUMER_K8S_DEPLOYMENT`|`fluid-task`| Deployment name |
87
-
|`FLUID_TASK_CONSUMER_K8S_CONTAINER`|`main`| Container name |
88
-
|`FLUID_TASK_CONSUMER_K8S_JOB_TTL`|`300`| Job TTL in seconds |
89
-
|`FLUID_TASK_CONSUMER_K8S_SLEEP`|`2.0`| Polling interval in seconds |
140
+
These variables are appended to the environment after the deployment's existing env vars and `TASK_MANAGER_SPAWN=true`, so they can override anything set in the deployment if needed.
90
141
91
-
If `resources` is not provided, the container's resource spec from the deployment is used as-is. If provided, it must be a [K8sResourceRequirements][fluid.scheduler.K8sResourceRequirements] with optional `limits` and `requests` keys, each mapping resource names (e.g. `"cpu"`, `"memory"`) to their string values.
142
+
For subprocess execution (outside a cluster), they are merged into the spawned process's environment the same way, making task definitions portable across both runtimes without any conditional logic.
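
The ordering matters because of how duplicates resolve. The sketch below assumes that when a name appears twice in the container's env list the later entry takes effect, which is what makes the override described above work; all function names are hypothetical.

```python
import os
from typing import Dict, List, Mapping, Optional

EnvList = List[Dict[str, str]]


def append_task_env(deployment_env: EnvList, task_env: Mapping[str, str]) -> EnvList:
    """Job path: deployment vars, then TASK_MANAGER_SPAWN, then task-level vars."""
    env = list(deployment_env)
    env.append({"name": "TASK_MANAGER_SPAWN", "value": "true"})
    env.extend({"name": name, "value": value} for name, value in task_env.items())
    return env


def resolve(env: EnvList) -> Dict[str, str]:
    """Collapse the list so a later entry wins over an earlier one with the same name."""
    return {entry["name"]: entry["value"] for entry in env}


def subprocess_env(
    task_env: Mapping[str, str],
    base: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Subprocess path: task-level vars are layered over the parent environment."""
    parent = os.environ if base is None else base
    return {**parent, **task_env}
```

Both paths give the task-level value the final say, so a task that sets, say, a log level behaves the same in a Job and in a local subprocess.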
docs/tutorials/task_queue.md
+42 -26 (42 additions & 26 deletions)
@@ -94,30 +94,6 @@ if is_in_cpu_process():
94
94
95
95
Stdout and stderr from the subprocess are streamed back to the consumer in real time, so logs produced by the task appear in the consumer's output.
96
96
97
-
#### Timeout
98
-
99
-
CPU bound tasks respect the `timeout_seconds` parameter. If the subprocess has not finished within the timeout, it is killed and the task run transitions to the `failure` state.
100
-
101
-
```python
102
-
@task(cpu_bound=True, timeout_seconds=300)
103
-
async def slow_calculation(ctx: TaskRun) -> None:
104
-
    ...
105
-
```
106
-
107
-
The default timeout is **60 seconds**. For long-running tasks make sure to raise this to an appropriate value.
108
-
109
-
#### Concurrency control
110
-
111
-
Use `max_concurrency` to limit how many instances of a CPU bound task can run simultaneously. This is useful to prevent exhausting system resources when many tasks are queued at the same time.
112
-
113
-
```python
114
-
@task(cpu_bound=True, max_concurrency=2)
115
-
async def heavy_calculation(ctx: TaskRun) -> None:
116
-
    ...
117
-
```
118
-
119
-
A value of `0` (the default) means no limit.
120
-
121
97
#### Kubernetes
122
98
123
99
When the consumer is running inside a Kubernetes cluster, CPU bound tasks can be dispatched as Kubernetes Jobs instead of local subprocesses. See [K8s Jobs](task_k8s.md) for more details.
All tasks, both IO and CPU bound, respect the `timeout_seconds` parameter (default **60 seconds**). The timeout is measured from when the task starts executing.
132
+
133
+
For IO bound tasks, `asyncio` raises a `TimeoutError` if the coroutine has not completed within the timeout, and the task run transitions to the `failure` state. For CPU bound tasks, the subprocess (or Kubernetes Job) is killed and the run likewise transitions to `failure`.
134
+
135
+
```python
136
+
from fluid.scheduler import task, TaskRun
137
+
138
+
@task(timeout_seconds=300)
139
+
async def slow_io_task(ctx: TaskRun) -> None:
140
+
    ...
141
+
142
+
@task(cpu_bound=True, timeout_seconds=300)
143
+
async def slow_cpu_task(ctx: TaskRun) -> None:
144
+
    ...
145
+
```
146
+
147
+
For long-running tasks make sure to raise `timeout_seconds` to an appropriate value.
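
For the IO-bound case, the timeout behaviour can be reproduced with the standard library alone. The wrapper below is a sketch of the idea, not the scheduler's code; `run_with_timeout` and the returned state strings are illustrative.

```python
import asyncio
from typing import Any, Coroutine


async def run_with_timeout(
    coro: Coroutine[Any, Any, Any],
    timeout_seconds: float = 60.0,
) -> str:
    """Await `coro` and map a timeout onto the `failure` state."""
    try:
        # wait_for cancels the coroutine and raises TimeoutError on expiry.
        await asyncio.wait_for(coro, timeout=timeout_seconds)
    except asyncio.TimeoutError:
        return "failure"
    return "success"
```

The clock starts when the coroutine starts executing, which is consistent with the text above: time spent waiting in the queue does not count.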
148
+
149
+
## Concurrency control
150
+
151
+
Use `max_concurrency` to limit how many instances of a task can run simultaneously. This applies to both IO and CPU bound tasks, and is useful to avoid overwhelming downstream services or exhausting system resources when many tasks are queued at once.
152
+
153
+
```python
154
+
from fluid.scheduler import task, TaskRun
155
+
156
+
@task(max_concurrency=5)
157
+
async def fetch_data(ctx: TaskRun) -> None:
158
+
    ...
159
+
160
+
@task(cpu_bound=True, max_concurrency=2)
161
+
async def heavy_calculation(ctx: TaskRun) -> None:
162
+
    ...
163
+
```
164
+
165
+
A value of `0` (the default) means no limit.
166
+
167
+
When the limit is reached the task run transitions to the `rate_limited` state. To automatically retry rate-limited tasks, combine `max_concurrency` with `rate_limit_retry`. See [Task Retry](task_retry.md) for details.
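
The admission rule can be pictured with an in-memory counter. This is a deliberately simplified sketch: `ConcurrencyGate` is a hypothetical class, the `running` state name is a placeholder, and a real consumer most likely tracks running counts in shared storage rather than in process memory.

```python
from collections import Counter


class ConcurrencyGate:
    """Track running instances per task name and refuse starts above the limit."""

    def __init__(self) -> None:
        self._running: Counter = Counter()

    def try_start(self, task_name: str, max_concurrency: int = 0) -> str:
        # A limit of 0 means unlimited, matching the documented default.
        if max_concurrency > 0 and self._running[task_name] >= max_concurrency:
            return "rate_limited"
        self._running[task_name] += 1
        return "running"

    def finish(self, task_name: str) -> None:
        # Free a slot so a queued or retried run can be admitted.
        if self._running[task_name] > 0:
            self._running[task_name] -= 1
```

A run rejected this way is not an error; it simply needs to be submitted again once a slot frees up.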
168
+
153
169
## Aborting a task
154
170
155
171
Any task — IO or CPU bound — can signal a deliberate, non-error cancellation by calling [ctx.abort()][fluid.scheduler.TaskRun.abort]:
@@ -174,8 +190,8 @@ When this happens the task run transitions to the `aborted` [TaskState][fluid.sc
174
190
175
191
For CPU-bound tasks (subprocess or Kubernetes Job) the task function runs in a **separate process**, so the abort signal must be relayed back to the consumer. The mechanism works as follows:
176
192
177
-
1. The inner process calls `ctx.abort()`, which raises [TaskAbortedError][fluid.scheduler.TaskAbortedError].
193
+
1. The inner process calls `ctx.abort()`, which raises [TaskAbortedError][fluid.scheduler.errors.TaskAbortedError].
178
194
2. The consumer running *inside* that process catches the error and writes the reason to a short-lived Redis key (60-second TTL).
179
-
3. After the subprocess or k8s Job exits, the outer consumer reads the Redis key. If an abort reason is found it re-raises [TaskAbortedError][fluid.scheduler.TaskAbortedError], marking the run as `aborted` instead of `success`.
195
+
3. After the subprocess or k8s Job exits, the outer consumer reads the Redis key. If an abort reason is found it re-raises [TaskAbortedError][fluid.scheduler.errors.TaskAbortedError], marking the run as `aborted` instead of `success`.
180
196
181
197
This means a CPU-bound task that aborts itself is always correctly reflected as `aborted` in the task run state, regardless of whether it ran locally or as a Kubernetes Job.
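
The three steps above can be sketched with an in-memory store standing in for Redis. Everything here is illustrative: `ExpiringStore`, the key format and the function names are assumptions, and `TaskAbortedError` is a local stand-in for the library's exception.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TaskAbortedError(Exception):
    """Local stand-in for the library's abort exception."""


class ExpiringStore:
    """Tiny in-memory substitute for a Redis key with a TTL."""

    def __init__(self) -> None:
        self._data: Dict[str, Tuple[str, float]] = {}

    def set(self, key: str, value: str, ttl: float) -> None:
        self._data[key] = (value, time.monotonic() + ttl)

    def pop(self, key: str) -> Optional[str]:
        item = self._data.pop(key, None)
        if item is None or item[1] < time.monotonic():
            return None  # missing or expired
        return item[0]


def abort_key(run_id: str) -> str:
    return f"task-abort:{run_id}"  # hypothetical key format


def run_inner_process(store: ExpiringStore, run_id: str, task: Callable[[], None]) -> None:
    """Steps 1 and 2: catch the abort inside the worker and record the reason."""
    try:
        task()
    except TaskAbortedError as exc:
        store.set(abort_key(run_id), str(exc), ttl=60)


def finish_outer_consumer(store: ExpiringStore, run_id: str) -> str:
    """Step 3: after the worker exits, re-raise if an abort reason was recorded."""
    reason = store.pop(abort_key(run_id))
    if reason is not None:
        raise TaskAbortedError(reason)
    return "success"
```

The short TTL keeps stale abort markers from lingering if the outer consumer never reads them.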
docs/tutorials/workers.md
+17 -0 (17 additions & 0 deletions)
@@ -23,3 +23,20 @@ To shut down a worker there are few possibilities.
23
23
24
24
* Direct call to the async [Worker.shutdown][fluid.utils.worker.Worker.shutdown] method which will trigger the graceful shutdown and wait for the worker to finish its work.
25
25
* Call the [Worker.gracefully_stop][fluid.utils.worker.Worker.gracefully_stop] method, which triggers the graceful shutdown. Importantly, this method does not wait for the worker to finish its work; it simply transitions the worker from the [WorkerState.RUNNING][fluid.utils.worker.WorkerState.RUNNING] state to the [WorkerState.STOPPING][fluid.utils.worker.WorkerState.STOPPING] state. To wait for the worker to exit, call the async [Worker.wait_for_shutdown][fluid.utils.worker.Worker.wait_for_shutdown] method (as in the example above).
26
+
27
+
## Async Context Manager
28
+
29
+
[Worker][fluid.utils.worker.Worker] implements the async context manager protocol. Entering the context calls [Worker.startup][fluid.utils.worker.Worker.startup] and exiting it calls [Worker.shutdown][fluid.utils.worker.Worker.shutdown], so the `async with` pattern is the most concise way to manage the full lifecycle:
30
+
31
+
```python
32
+
async with MyWorker() as worker:
33
+
    # worker is running here
34
+
    ...
35
+
# worker is fully shut down here
36
+
```
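
As a rough picture of what the protocol does, the sketch below wires `__aenter__` and `__aexit__` to `startup` and `shutdown` on a toy class. `SketchWorker` and its string states are hypothetical and much simpler than the real `Worker`.

```python
import asyncio
from typing import Tuple


class SketchWorker:
    """Toy worker that records its lifecycle state (not the real Worker class)."""

    def __init__(self) -> None:
        self.state = "init"

    async def startup(self) -> None:
        self.state = "running"

    async def shutdown(self) -> None:
        self.state = "stopped"

    async def __aenter__(self) -> "SketchWorker":
        await self.startup()
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        # Runs on normal exit and on exceptions, so shutdown is never skipped.
        await self.shutdown()


async def main() -> Tuple[str, str]:
    async with SketchWorker() as worker:
        inside = worker.state
    return inside, worker.state
```

Because `__aexit__` also runs when the body raises, the worker is shut down even if the code inside the `async with` block fails.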
37
+
38
+
Resources that the worker needs for its entire lifetime can be opened and closed inside [Worker.run][fluid.utils.worker.Worker.run] using normal `async with` statements — no subclassing of lifecycle hooks is required. The example below subclasses [QueueConsumer][fluid.utils.worker.QueueConsumer] to build a worker that accepts text items via [send][fluid.utils.worker.QueueConsumer.send], opens an [AsyncAnthropic](https://github.com/anthropics/anthropic-sdk-python) client for the duration of `run`, and streams a one-sentence summary from Claude for each item: