Commit a863f6d: [Blog] Orchestrating GPU workloads on Kubernetes (#3161)

docs/blog/posts/kubernetes-beta.md
---
title: Orchestrating GPUs on Kubernetes clusters
date: 2025-10-08
description: "TBA"
slug: kubernetes-beta
image: https://dstack.ai/static-assets/static-assets/images/dstack-kubernetes.png
categories:
  - Changelog
---

# Orchestrating GPUs on Kubernetes clusters

`dstack` gives teams a unified way to run and manage GPU-native containers across clouds and on-prem environments — without requiring Kubernetes.
At the same time, many organizations rely on Kubernetes as the foundation of their infrastructure.

To support these users, `dstack` is releasing the beta of its native Kubernetes integration.

<img src="https://dstack.ai/static-assets/static-assets/images/dstack-kubernetes.png" width="630"/>

<!-- more -->

This update allows `dstack` to orchestrate dev environments, distributed training, and inference workloads directly on Kubernetes clusters — combining the best of both worlds: an ML-tailored interface for ML teams together with the full Kubernetes ecosystem.

Read below to learn how to use `dstack` with Kubernetes clusters.
25+
26+
## Creating a Kubernetes cluster
27+
28+
A major advantage of Kubernetes is its portability. Whether you’re using managed Kubernetes on a GPU cloud or an on-prem cluster, you can connect it to `dstack` and use it to orchestrate your GPU workloads.
29+
30+
!!! info "NVIDIA GPU Operator"
31+
For `dstack` to correctly detect GPUs in your Kubernetes cluster, the cluster must have the
32+
[NVIDIA GPU Operator :material-arrow-top-right-thin:{ .external }](https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/index.html){:target="_blank"} pre-installed.
33+
34+
### Nebius example
35+
36+
If you're using [Nebius :material-arrow-top-right-thin:{ .external }](https://nebius.com/){:target="_blank"}, the process of creating a Kubernetes cluster is straightforward.
37+
38+
Select the region of interest and click `Create cluster`.
39+
Once the cluster is created, switch to `Applications` and install the `nvidia-device-plugin` application — this can be done in one click.
40+
41+
<img src="https://dstack.ai/static-assets/static-assets/images/dstack-nebius-cluster-ui.png" width="750"/>
42+
43+
Next, go to `Node groups` and click `Create node group`. Choose the GPU type and count, disk size, and other options.
44+
If `dstack` doesn't run in the same network, enable public IPs so that `dstack` can access the nodes.
45+
46+
<img src="https://dstack.ai/static-assets/static-assets/images/dstack-nebius-node-group.png" width="750"/>
47+
48+
## Setting up the backend

Once the cluster is ready, you need to configure the `kubernetes` backend in the `dstack` server.
To do this, add the corresponding configuration to your `~/.dstack/server/config.yml` file:

<div editor-title="~/.dstack/server/config.yml">

```yaml
projects:
- name: main
  backends:
  - type: kubernetes
    kubeconfig:
      filename: ~/.kube/config
    proxy_jump:
      hostname: 204.12.171.137
      port: 32000
```

</div>

The configuration includes two main parts: the path to the kubeconfig file and the proxy-jump configuration.

If your cluster is on Nebius, click `How to connect` in the console — it will guide you through setting up the kubeconfig file.

!!! info "Proxy jump"
    To allow `dstack` to forward SSH traffic, it needs one node to act as a proxy jump.
    Choose any node in the cluster and specify its IP address and an accessible port in the backend configuration.

Now that the backend is configured, go ahead and restart the `dstack` server.

That’s it — you can now use all of `dstack`’s features, including [dev environments](../../docs/concepts/dev-environments.md), [tasks](../../docs/concepts/tasks.md), [services](../../docs/concepts/services.md), and [fleets](../../docs/concepts/fleets.md).
## Running a dev environment

A dev environment lets you provision an instance and connect to it from your desktop IDE.

<div editor-title="examples/.dstack.yml">

```yaml
type: dev-environment
# The name is optional; if not specified, it's generated randomly
name: vscode

python: "3.11"

# Uncomment to use a custom Docker image
#image: huggingface/trl-latest-gpu

ide: vscode

resources:
  gpu: H200
```

</div>

To run a dev environment, pass the configuration to [`dstack apply`](../../docs/reference/cli/dstack/apply.md):

<div class="termy">

```shell
$ dstack apply -f examples/.dstack.yml

 #  BACKEND         RESOURCES                                     INSTANCE TYPE                       PRICE
 1  kubernetes (-)  cpu=127 mem=1574GB disk=871GB H200:141GB:8    computeinstance-u00hwk32d0xemhxhvj  $0
 2  kubernetes (-)  cpu=127 mem=1574GB disk=871GB H200:141GB:8    computeinstance-u00n24fb4q85yavc9z  $0

Submit the run vscode? [y/n]: y

Launching `vscode`...
---> 100%

To open in VS Code Desktop, use this link:
  vscode://vscode-remote/ssh-remote+vscode/workflow
```

</div>
Dev environments support many [different options](../../docs/concepts/dev-environments.md), including a custom Docker image, mounted repositories, an idle timeout, minimum GPU utilization, and more.
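For illustration, several of these options might appear together in one configuration. This sketch is not from the post itself; the property names marked below (`inactivity_duration`, `utilization_policy` and its fields) are assumptions to verify against the dev environments reference:

```yaml
type: dev-environment
name: vscode-custom

# A custom Docker image instead of the default one
image: huggingface/trl-latest-gpu

ide: vscode

# Assumed option names; check the dev environments reference.
# Stop the environment after a period of inactivity:
inactivity_duration: 2h
# Stop the run if GPU utilization stays below a threshold:
utilization_policy:
  min_gpu_utilization: 10
  time_window: 1h

resources:
  gpu: H200
```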

## Running distributed training

Distributed training can be performed in `dstack` using [distributed tasks](../../docs/concepts/tasks.md#distributed-tasks).
The configuration is similar to a dev environment, except it runs across multiple nodes.
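In a distributed task, `dstack` injects `DSTACK_*` environment variables on every node (you'll see them used in the examples below). A minimal sketch of the per-node branching they enable, with hard-coded stand-in values:

```shell
# Stand-in values; in a real run, dstack injects these on each node.
DSTACK_NODES_NUM=2
DSTACK_NODE_RANK=0
DSTACK_MASTER_NODE_IP=10.0.0.1

# Rank 0 typically acts as the launcher/master; other ranks join it.
if [ "$DSTACK_NODE_RANK" -eq 0 ]; then
  role=master
else
  role=worker
fi
echo "node $DSTACK_NODE_RANK/$DSTACK_NODES_NUM: $role (master at $DSTACK_MASTER_NODE_IP)"
# prints: node 0/2: master (master at 10.0.0.1)
```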
133+
134+
### Creating a cluster fleet
135+
136+
Before running a distributed task, create a fleet with `placement` set to `cluster`:
137+
138+
<div editor-title="examples/misc/fleets/.dstack.yml">
139+
140+
```yaml
141+
type: fleet
142+
# The name is optional; if not specified, one is generated automatically
143+
name: my-k8s-fleet
144+
145+
# For `kubernetes`, `min` should be set to `0` since it can't pre-provision VMs.
146+
# Optionally, you can set the maximum number of nodes to limit scaling.
147+
nodes: 0..
148+
149+
placement: cluster
150+
151+
backends: [kuberenetes]
152+
153+
resources:
154+
# Specify requirements to filter nodes
155+
gpu: 1..8
156+
```
157+
158+
</div>
159+
160+
Then, create the fleet using the `dstack apply` command:
161+
162+
<div class="termy">
163+
164+
```shell
165+
$ dstack apply -f examples/misc/fleets/.dstack.yml
166+
167+
Provisioning...
168+
---> 100%
169+
170+
FLEET INSTANCE BACKEND GPU PRICE STATUS CREATED
171+
```
172+
173+
</div>
174+
175+
Once the fleet is created, you can run distributed tasks on it.
176+
177+
### NCCL tests example

Below is an example of using distributed tasks to run NCCL tests.
It also demonstrates how to use `mpirun` with `dstack`:

<div editor-title="examples/clusters/nccl-tests/.dstack.yml">

```yaml
type: task
name: nccl-tests

nodes: 2

# The `startup_order` and `stop_criteria` properties are required for `mpirun`
startup_order: workers-first
stop_criteria: master-done

env:
  - NCCL_DEBUG=INFO
commands:
  - |
    if [ $DSTACK_NODE_RANK -eq 0 ]; then
      mpirun \
        --allow-run-as-root \
        --hostfile $DSTACK_MPI_HOSTFILE \
        -n $DSTACK_GPUS_NUM \
        -N $DSTACK_GPUS_PER_NODE \
        --bind-to none \
        /opt/nccl-tests/build/all_reduce_perf -b 8 -e 8G -f 2 -g 1
    else
      sleep infinity
    fi

# Required by the `kubernetes` backend
privileged: true

resources:
  gpu: nvidia:1..8
  shm_size: 16GB
```

</div>

To run the configuration, use the [`dstack apply`](../../docs/reference/cli/dstack/apply.md) command:

<div class="termy">

```shell
$ dstack apply -f examples/clusters/nccl-tests/.dstack.yml --fleet my-k8s-fleet

 #  BACKEND         RESOURCES                                     INSTANCE TYPE                       PRICE
 1  kubernetes (-)  cpu=127 mem=1574GB disk=871GB H200:141GB:8    computeinstance-u00hwk32d0xemhxhvj  $0
 2  kubernetes (-)  cpu=127 mem=1574GB disk=871GB H200:141GB:8    computeinstance-u00n24fb4q85yavc9z  $0

Submit the run nccl-tests? [y/n]: y
```

</div>
### Distributed training example

Below is a minimal example of a distributed training configuration:

<div editor-title="examples/distributed-training/torchrun/.dstack.yml">

```yaml
type: task
name: train-distrib

nodes: 2

python: 3.12
env:
  - NCCL_DEBUG=INFO
commands:
  - git clone https://github.com/pytorch/examples.git pytorch-examples
  - cd pytorch-examples/distributed/ddp-tutorial-series
  - uv pip install -r requirements.txt
  - |
    torchrun \
      --nproc-per-node=$DSTACK_GPUS_PER_NODE \
      --node-rank=$DSTACK_NODE_RANK \
      --nnodes=$DSTACK_NODES_NUM \
      --master-addr=$DSTACK_MASTER_NODE_IP \
      --master-port=12345 \
      multinode.py 50 10

resources:
  gpu: 1..8
  shm_size: 16GB
```

</div>

To run the configuration, use the [`dstack apply`](../../docs/reference/cli/dstack/apply.md) command:

<div class="termy">

```shell
$ dstack apply -f examples/distributed-training/torchrun/.dstack.yml --fleet my-k8s-fleet

 #  BACKEND         RESOURCES                                     INSTANCE TYPE                       PRICE
 1  kubernetes (-)  cpu=127 mem=1574GB disk=871GB H200:141GB:8    computeinstance-u00hwk32d0xemhxhvj  $0
 2  kubernetes (-)  cpu=127 mem=1574GB disk=871GB H200:141GB:8    computeinstance-u00n24fb4q85yavc9z  $0

Submit the run train-distrib? [y/n]: y
```

</div>
For more examples, explore the [distributed training](../../examples.md#distributed-training) section in the docs.

## FAQ

### VM-based backends vs Kubernetes backend

While the `kubernetes` backend is preferred if your team depends on the Kubernetes ecosystem,
the [VM-based](../../docs/concepts/backends.md#vm-based) backends leverage native integration with top GPU clouds (including Nebius and others) and may be a better choice if Kubernetes isn’t required.

VM-based backends also offer more granular control over cluster provisioning.

> Note that `dstack` doesn’t yet support Kubernetes clusters with auto-scaling enabled (coming soon), which can be another reason to use VM-based backends.

### SSH fleets vs Kubernetes backend

If you’re using on-prem servers and Kubernetes isn’t a requirement, [SSH fleets](../../docs/concepts/fleets.md#ssh) may be simpler.
They provide a lightweight and flexible alternative.
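For reference, an SSH fleet is defined by pointing `dstack` at existing hosts over SSH. A minimal sketch; the host IPs and key path are placeholders, and the exact `ssh_config` schema should be checked against the fleets documentation:

```yaml
type: fleet
name: my-ssh-fleet

# Placeholder hosts and credentials; see the fleets documentation
# for the exact `ssh_config` schema.
ssh_config:
  user: ubuntu
  identity_file: ~/.ssh/id_rsa
  hosts:
    - 192.168.100.1
    - 192.168.100.2
```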

### AMD GPUs

Support for AMD GPUs is coming soon — our team is actively working on it right now.

!!! info "What's next"
    1. Check [Quickstart](../../docs/quickstart.md)
    2. Explore [dev environments](../../docs/concepts/dev-environments.md),
       [tasks](../../docs/concepts/tasks.md), [services](../../docs/concepts/services.md),
       and [fleets](../../docs/concepts/fleets.md)
    3. Read the [clusters](../../docs/guides/clusters.md) guide
    4. Join [Discord :material-arrow-top-right-thin:{ .external }](https://discord.gg/u8SmfwPpMd){:target="_blank"}
