# Kubernetes

While `dstack` can run natively without Kubernetes on both cloud (via cloud [backends](../concepts/backends.md)) and on-prem
(via [SSH fleets](../concepts/fleets.md#ssh)), it also supports running dev environments, tasks, and services directly on Kubernetes clusters through its native integration — the `kubernetes` backend.

## Setting up the backend

To use the `kubernetes` backend with `dstack`, you need to configure it with the path to the kubeconfig file, the IP address of any node in the cluster, and the port that `dstack` will use for proxying SSH traffic.
This configuration is defined in the `~/.dstack/server/config.yml` file:

<div editor-title="~/.dstack/server/config.yml">

```yaml
projects:
- name: main
  backends:
  - type: kubernetes
    kubeconfig:
      filename: ~/.kube/config
    proxy_jump:
      hostname: 204.12.171.137
      port: 32000
```

</div>

### Proxy jump

To allow the `dstack` server and CLI to access runs via SSH, `dstack` requires a node that acts as a jump host to proxy SSH traffic into containers.

To configure this node, specify `hostname` and `port` under the `proxy_jump` property:

- `hostname` — the IP address of any cluster node selected as the jump host. Both the `dstack` server and CLI must be able to reach it. This node can be either a GPU node or a CPU-only node — it makes no difference.
- `port` — any accessible port on that node, which `dstack` uses to forward SSH traffic.

No additional setup is required — `dstack` configures and manages the proxy automatically.
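
If SSH access to runs fails, a quick sanity check is to verify that the proxy port on the jump node is reachable from the machine running the `dstack` server or CLI. The address and port below are the example values from the `config.yml` above; substitute your own:

```shell
# Check that the jump host's proxy port accepts TCP connections
# from the machine running the dstack server or CLI.
nc -zv 204.12.171.137 32000
```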

### NVIDIA GPU Operator

> For `dstack` to correctly detect GPUs in your Kubernetes cluster, the cluster must have the
> [NVIDIA GPU Operator :material-arrow-top-right-thin:{ .external }](https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/index.html){:target="_blank"} pre-installed.
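
If the operator isn't installed yet, it is typically deployed via Helm. The release name and namespace below are common conventions from NVIDIA's installation guide, not `dstack` requirements:

```shell
# Add NVIDIA's Helm repository and install the GPU Operator
# into its own namespace.
helm repo add nvidia https://helm.ngc.nvidia.com/nvidia
helm repo update
helm install gpu-operator nvidia/gpu-operator \
  --namespace gpu-operator --create-namespace
```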

After the backend is set up, you interact with `dstack` just as you would with other backends or SSH fleets. You can run dev environments, tasks, and services.

## Fleets

### Clusters

If you’d like to run [distributed tasks](../concepts/tasks.md#distributed-tasks) with the `kubernetes` backend, you first need to create a fleet with `placement` set to `cluster`:

<div editor-title="examples/misc/fleets/.dstack.yml">

```yaml
type: fleet
# The name is optional; if not specified, one is generated automatically
name: my-k8s-fleet

# For `kubernetes`, `min` should be set to `0` since it can't pre-provision VMs.
# Optionally, you can set the maximum number of nodes to limit scaling.
nodes: 0..

placement: cluster

backends: [kubernetes]

resources:
  # Specify requirements to filter nodes
  gpu: 1..8
```

</div>

Then, create the fleet using the `dstack apply` command:

<div class="termy">

```shell
$ dstack apply -f examples/misc/fleets/.dstack.yml

Provisioning...
---> 100%

 FLEET         INSTANCE  BACKEND  GPU  PRICE  STATUS  CREATED
```

</div>

Once the fleet is created, you can run [distributed tasks](../concepts/tasks.md#distributed-tasks). `dstack` takes care of orchestration automatically.
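
As a sketch, a distributed task targeting this fleet might look as follows. The task name, script, and resource values are placeholders; the `DSTACK_*` environment variables are the ones `dstack` injects into distributed runs:

```yaml
type: task
name: train-distrib

# Run across two nodes of the fleet
nodes: 2

commands:
  - torchrun
    --nnodes=$DSTACK_NODES_NUM
    --node-rank=$DSTACK_NODE_RANK
    --nproc-per-node=$DSTACK_GPUS_PER_NODE
    --master-addr=$DSTACK_MASTER_NODE_IP
    train.py

resources:
  gpu: 8
```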

For more details on clusters, see the [corresponding guide](clusters.md).

> Fleets with `placement` set to `cluster` can be used not only for distributed tasks, but also for dev environments, single-node tasks, and services.
> Since Kubernetes clusters are interconnected by default, you can always set `placement` to `cluster`.

!!! info "Fleets"
    It’s generally recommended to create [fleets](../concepts/fleets.md) even if you don’t plan to run distributed tasks.

## FAQ

??? info "Is managed Kubernetes with auto-scaling supported?"
    Managed Kubernetes is supported. However, the `kubernetes` backend can only run on pre-provisioned nodes.
    Support for auto-scalable Kubernetes clusters is coming soon — you can track progress in the corresponding [issue :material-arrow-top-right-thin:{ .external }](https://github.com/dstackai/dstack/issues/3126){:target="_blank"}.

    If on-demand provisioning is important, we recommend using [cloud backends](../concepts/backends.md) instead of the `kubernetes` backend, as cloud backends already support auto-scaling.

??? info "When should I use the Kubernetes backend?"
    Choose the `kubernetes` backend if your GPUs already run on Kubernetes and your team depends on its ecosystem and tooling.

    If your priority is orchestrating cloud GPUs and Kubernetes isn’t a must, [cloud backends](../concepts/backends.md) are a better fit thanks to their native cloud integration.

    For on-prem GPUs where Kubernetes is optional, [SSH fleets](../concepts/fleets.md#ssh) provide a simpler and more lightweight alternative.