docs/blog/posts/amd-mi300x-inference-benchmark.md: 3 additions & 3 deletions
@@ -10,7 +10,7 @@ categories:
 
 # Benchmarking Llama 3.1 405B on 8x AMD MI300X GPUs
 
-At `dstack`, we've been adding support for AMD GPUs with [SSH fleets](../../docs/concepts/fleets.md#ssh),
+At `dstack`, we've been adding support for AMD GPUs with [SSH fleets](../../docs/concepts/fleets.md#ssh-fleets),
 so we saw this as a great chance to test our integration by benchmarking AMD GPUs. Our friends at
 [Hot Aisle :material-arrow-top-right-thin:{ .external }](https://hotaisle.xyz/){:target="_blank"}, who build top-tier
 bare metal compute for AMD GPUs, kindly provided the hardware for the benchmark.
@@ -34,7 +34,7 @@ Here is the spec of the bare metal machine we got:
 ??? info "Set up an SSH fleet"
 
 Hot Aisle provided us with SSH access to the machine. To make it accessible via `dstack`,
-we created an [SSH fleet](../../docs/concepts/fleets.md#ssh) using the following configuration:
+we created an [SSH fleet](../../docs/concepts/fleets.md#ssh-fleets) using the following configuration:
 
 <div editor-title="hotaisle.dstack.yml">
 
@@ -215,7 +215,7 @@ If you have questions, feedback, or want to help improve the benchmark, please r
 is the primary sponsor of this benchmark, and we are sincerely grateful for their hardware and support.
 
 If you'd like to use top-tier bare metal compute with AMD GPUs, we recommend going
-with Hot Aisle. Once you gain access to a cluster, it can be easily accessed via `dstack`'s [SSH fleet](../../docs/concepts/fleets.md#ssh) easily.
+with Hot Aisle. Once you gain access to a cluster, it can be easily accessed via `dstack`'s [SSH fleet](../../docs/concepts/fleets.md#ssh-fleets) easily.
 
 ### RunPod
 If you’d like to use on-demand compute with AMD GPUs at affordable prices, you can configure `dstack` to
docs/docs/concepts/backends.md: 2 additions & 3 deletions
@@ -9,9 +9,8 @@ They can be configured via `~/.dstack/server/config.yml` or through the [project
 * [Container-based](#container-based) – use either `dstack`'s native integration with cloud providers or Kubernetes to orchestrate container-based runs; provisioning in this case is delegated to the cloud provider or Kubernetes.
 * [On-prem](#on-prem) – use `dstack`'s native support for on-prem servers without needing Kubernetes.
 
-??? info "dstack Sky"
-If you're using [dstack Sky :material-arrow-top-right-thin:{ .external }](https://sky.dstack.ai){:target="_blank"},
-you can either configure your own backends or use the pre-configured backend that gives you access to compute from the GPU marketplace.
+!!! info "dstack Sky"
+If you're using [dstack Sky :material-arrow-top-right-thin:{ .external }](https://sky.dstack.ai){:target="_blank"}, backend configuration is optional. dstack Sky lets you use pre-configured backends to access GPU marketplace.
docs/docs/concepts/fleets.md: 46 additions & 55 deletions
@@ -4,36 +4,41 @@ Fleets act both as pools of instances and as templates for how those instances a
 
 `dstack` supports two kinds of fleets:
 
-* [Standard fleets](#standard) – dynamically provisioned through configured backends; they are supported with any type of backends: [VM-based](backends.md#vm-based), [container-based](backends.md#container-based), and [Kubernetes](backends.md#kubernetes)
+* [Backend fleets](#backend) – dynamically provisioned through configured backends; they are supported with any type of backends: [VM-based](backends.md#vm-based) and [container-based](backends.md#container-based) (incl. [`kubernetes`](backends.md#kubernetes))
 * [SSH fleets](#ssh) – created using on-prem servers; do not require backends
 
-## Standard fleets { #standard }
+When you run `dstack apply` to start a dev environment, task, or service, `dstack` will reuse idle instances from an existing fleet whenever available.
 
-When you run `dstack apply` to start a dev environment, task, or service, `dstack` will reuse idle instances
-from an existing fleet whenever available.
+## Backend fleets { #backend-fleets }
 
-If no fleet meets the requirements or has idle capacity, `dstack` can create a new fleet on the fly.
-However, it’s generally better to define fleets explicitly in configuration files for greater control.
+If you configured [backends](backends.md), `dstack` can provision fleets on the fly.
+However, it’s recommended to define fleets explicitly.
 
 ### Apply a configuration
 
-Define a fleet configuration as a YAML file in your project directory. The file must have a
+To create a backend fleet, define a configuration as a YAML file in your project directory. The file must have a
 `.dstack.yml` extension (e.g. `.dstack.yml` or `fleet.dstack.yml`).
my-fleet 0 gcp (europe-west-1) L4:24GB (spot) $0.1624 idle 3 mins ago
- 1 gcp (europe-west-1) L4:24GB (spot) $0.1624 idle 3 mins ago
+FLEET INSTANCE BACKEND GPU PRICE STATUS CREATED
+my-fleet - - - - - -
 ```
 
 </div>
 
-Once the status of instances changes to `idle`, they can be used by dev environments, tasks, and services.
+`dstack` always keeps the minimum number of nodes provisioned. Additional instances, up to the maximum limit, are provisioned on demand.
 
-??? info "Container-based backends"
-[Container-based](backends.md#container-based) backends don’t support pre-provisioning,
-so `nodes` can only be set to a range starting with `0`.
-
-This means instances are created only when a run starts, and once it finishes, they’re terminated and released back to the provider (either a cloud service or Kubernetes).
+!!! info "Container-based backends"
+For [container-based](backends.md#container-based) backends (such as `kubernetes`, `runpod`, etc), `nodes` must be defined as a range starting with `0`. In these cases, instances are provisioned on demand as needed.
 
-<div editor-title=".dstack.yml">
+<!-- TODO: Ensure the user sees the error or warning otherwise -->
 
-```yaml
-type: fleet
-# The name is optional, if not specified, generated randomly
-name: my-fleet
-
-# Specify the number of instances
-nodes: 0..2
-# Uncomment to ensure instances are inter-connected
-#placement: cluster
-
-resources:
-  gpu: 24GB
-```
+??? info "Target number of nodes"
 
-</div>
+If `nodes` is defined as a range, you can start with more than the minimum number of instances by using the `target` parameter when creating the fleet.
 
-### Configuration options
+<div editor-title=".dstack.yml">
 
-#### Nodes { #nodes }
+```yaml
+type: fleet
 
-The `nodes` property controls how many instances to provision and maintain in the fleet:
+name: my-fleet
 
-<div editor-title=".dstack.yml">
+nodes:
+  min: 0
+  max: 2
 
-```yaml
-type: fleet
+  # Provision 2 instances initially
+  target: 2
 
-name: my-fleet
+# Deprovision instances above the minimum if they remain idle
+idle_duration: 1h
+```
 
-nodes:
-  min: 1 # Always maintain at least 1 idle instance. Can be 0.
-  max: 3 # (Optional) Do not allow more than 3 instances
-```
+</div>
 
-</div>
+By default, when you submit a [dev environment](dev-environments.md), [task](tasks.md), or [service](services.md), `dstack` tries all available fleets. However, you can explicitly specify the [`fleets`](../reference/dstack.yml/dev-environment.md#fleets) in your run configuration
+or via [`--fleet`](../reference/cli/dstack/apply.md#fleet) with `dstack apply`.
 
-`dstack` ensures the fleet always has at least `nodes.min` instances, creating new instances in the background if necessary. If you don't need to keep instances in the fleet forever, you can set `nodes.min` to `0`. By default, `dstack apply` also provisions `nodes.min` instances. The `nodes.target` property allows provisioning more instances initially than needs to be maintained.
+### Configuration options
 
-#### Placement { #standard-placement }
+#### Placement { #backend-placement }
 
 To ensure instances are interconnected (e.g., for
 [distributed tasks](tasks.md#distributed-tasks)), set `placement` to `cluster`.
@@ -190,9 +181,9 @@ and their quantity. Examples: `nvidia` (one NVIDIA GPU), `A100` (one A100), `A10
 > If you’re unsure which offers (hardware configurations) are available from the configured backends, use the
 > [`dstack offer`](../reference/cli/dstack/offer.md#list-gpu-offers) command to list them.
 
-#### Blocks { #standard-blocks }
+#### Blocks { #backend-blocks }
 
-For standard fleets, `blocks` function the same way as in SSH fleets.
+For backend fleets, `blocks` function the same way as in SSH fleets.
 See the [`Blocks`](#ssh-blocks) section under SSH fleets for details on the blocks concept.
 
 <div editor-title=".dstack.yml">
@@ -214,10 +205,10 @@ blocks: 4
 #### Idle duration
 
 By default, fleet instances stay `idle` for 3 days and can be reused within that time.
-If the fleet is not reused within this period, it is automatically terminated.
+If an instance is not reused within this period, it is automatically terminated.
 
 To change the default idle duration, set
-[`idle_duration`](../reference/dstack.yml/fleet.md#idle_duration) in the run configuration (e.g., `0s`, `1m`, or `off` for
+[`idle_duration`](../reference/dstack.yml/fleet.md#idle_duration) in the fleet configuration (e.g., `0s`, `1m`, or `off` for