Commit 0373268

[Docs] Kubernetes guide
1 parent daa3d03 commit 0373268

2 files changed

Lines changed: 115 additions & 3 deletions

File tree

docs/docs/guides/kubernetes.md

Lines changed: 111 additions & 0 deletions
@@ -0,0 +1,111 @@
# Kubernetes

While `dstack` can run natively without Kubernetes on both cloud (via cloud [backends](../concepts/backends.md)) and on-prem (via [SSH fleets](../concepts/fleets.md#ssh)), it also supports running dev environments, tasks, and services directly on Kubernetes clusters through its native integration — the `kubernetes` backend.
## Setting up the backend

To use the `kubernetes` backend with `dstack`, you need to configure it with the path to the kubeconfig file, the IP address of any node in the cluster, and the port that `dstack` will use for proxying SSH traffic.
This configuration is defined in the `~/.dstack/server/config.yml` file:

<div editor-title="~/.dstack/server/config.yml">

```yaml
projects:
- name: main
  backends:
  - type: kubernetes
    kubeconfig:
      filename: ~/.kube/config
    proxy_jump:
      hostname: 204.12.171.137
      port: 32000
```

</div>

### Proxy jump

To allow the `dstack` server and CLI to access runs via SSH, `dstack` requires a node that acts as a jump host to proxy SSH traffic into containers.

To configure this node, specify `hostname` and `port` under the `proxy_jump` property:

- `hostname` — the IP address of any cluster node selected as the jump host. Both the `dstack` server and CLI must be able to reach it. This node can be either a GPU node or a CPU-only node — it makes no difference.
- `port` — any accessible port on that node, which `dstack` uses to forward SSH traffic.

No additional setup is required — `dstack` configures and manages the proxy automatically.

### NVIDIA GPU Operator

> For `dstack` to correctly detect GPUs in your Kubernetes cluster, the cluster must have the
> [NVIDIA GPU Operator :material-arrow-top-right-thin:{ .external }](https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/index.html){:target="_blank"} pre-installed.

After the backend is set up, you interact with `dstack` just as you would with other backends or SSH fleets. You can run dev environments, tasks, and services.

## Fleets

### Clusters

If you’d like to run [distributed tasks](../concepts/tasks.md#distributed-tasks) with the `kubernetes` backend, you first need to create a fleet with `placement` set to `cluster`:

<div editor-title="examples/misc/fleets/.dstack.yml">

```yaml
type: fleet
# The name is optional; if not specified, one is generated automatically
name: my-k8s-fleet

# For `kubernetes`, `min` should be set to `0` since it can't pre-provision VMs.
# Optionally, you can set the maximum number of nodes to limit scaling.
nodes: 0..

placement: cluster

backends: [kubernetes]

resources:
  # Specify requirements to filter nodes
  gpu: 1..8
```

</div>

Then, create the fleet using the `dstack apply` command:

<div class="termy">

```shell
$ dstack apply -f examples/misc/fleets/.dstack.yml

Provisioning...
---> 100%

 FLEET         INSTANCE  BACKEND     GPU  PRICE  STATUS  CREATED
```

</div>

Once the fleet is created, you can run [distributed tasks](../concepts/tasks.md#distributed-tasks). `dstack` takes care of orchestration automatically.

For more details on clusters, see the [corresponding guide](clusters.md).

> Fleets with `placement` set to `cluster` can be used not only for distributed tasks, but also for dev environments, single-node tasks, and services.
> Since Kubernetes clusters are interconnected by default, you can always set `placement` to `cluster`.

!!! info "Fleets"
    It’s generally recommended to create [fleets](../concepts/fleets.md) even if you don’t plan to run distributed tasks.
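To illustrate what runs on such a fleet, here is a minimal distributed task sketch (the name, command, and node count are illustrative, not prescriptive):

```yaml
type: task
# Hypothetical name for illustration
name: train-distrib
# Run the task across two nodes of the cluster fleet
nodes: 2
commands:
  - python train.py
resources:
  gpu: 1..8
```

Submitting it with `dstack apply` schedules the task across the fleet’s nodes.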
## FAQ

??? info "Is managed Kubernetes with auto-scaling supported?"
    Managed Kubernetes is supported. However, the `kubernetes` backend can only run on pre-provisioned nodes.
    Support for auto-scalable Kubernetes clusters is coming soon — you can track progress in the corresponding [issue :material-arrow-top-right-thin:{ .external }](https://github.com/dstackai/dstack/issues/3126){:target="_blank"}.

    If on-demand provisioning is important, we recommend using [cloud backends](../concepts/backends.md) instead of the `kubernetes` backend, as cloud backends already support auto-scaling.

??? info "When should I use the Kubernetes backend?"
    Choose the `kubernetes` backend if your GPUs already run on Kubernetes and your team depends on its ecosystem and tooling.

    If your priority is orchestrating cloud GPUs and Kubernetes isn’t a must, [cloud backends](../concepts/backends.md) are a better fit thanks to their native cloud integration.

    For on-prem GPUs where Kubernetes is optional, [SSH fleets](../concepts/fleets.md#ssh) provide a simpler and more lightweight alternative.

mkdocs.yml

Lines changed: 4 additions & 3 deletions
@@ -229,12 +229,13 @@ nav:
       - Projects: docs/concepts/projects.md
       - Gateways: docs/concepts/gateways.md
     - Guides:
-      - Protips: docs/guides/protips.md
-      - Metrics: docs/guides/metrics.md
       - Clusters: docs/guides/clusters.md
+      - Kubernetes: docs/guides/kubernetes.md
       - Server deployment: docs/guides/server-deployment.md
-      - Plugins: docs/guides/plugins.md
       - Troubleshooting: docs/guides/troubleshooting.md
+      - Metrics: docs/guides/metrics.md
+      - Protips: docs/guides/protips.md
+      - Plugins: docs/guides/plugins.md
     - Reference:
       - .dstack.yml:
         - dev-environment: docs/reference/dstack.yml/dev-environment.md
