Commit eee869d

Add ARM64 support (#2595)
* Support NVIDIA Grace superchips on Lambda
* Support ARM with SSH fleets

Closes: #2101
1 parent 004b91e commit eee869d

34 files changed, +622 -92 lines changed

docs/docs/concepts/dev-environments.md

Lines changed: 8 additions & 0 deletions
@@ -175,6 +175,8 @@ name: vscode
 ide: vscode
 
 resources:
+  # 16 or more x86_64 cores
+  cpu: 16..
   # 200GB or more RAM
   memory: 200GB..
   # 4 GPUs from 40GB to 80GB
@@ -187,10 +189,16 @@ resources:
 
 </div>
 
+The `cpu` property also allows you to specify the CPU architecture, `x86` or `arm`. Examples:
+`x86:16` (16 x86-64 cores), `arm:8..` (at least 8 ARM64 cores).
+If the architecture is not specified, `dstack` tries to infer it from the `gpu` specification
+using `x86` as the fallback value.
+
 The `gpu` property allows specifying not only memory size but also GPU vendor, names
 and their quantity. Examples: `nvidia` (one NVIDIA GPU), `A100` (one A100), `A10G,A100` (either A10G or A100),
 `A100:80GB` (one A100 of 80GB), `A100:2` (two A100), `24GB..40GB:2` (two GPUs between 24GB and 40GB),
 `A100:40GB:2` (two A100 GPUs of 40GB).
+If the vendor is not specified, `dstack` tries to infer it from the GPU name using `nvidia` as the fallback value.
 
 ??? info "Google Cloud TPU"
     To use TPUs, specify its architecture via the `gpu` property.
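Note: the `cpu` syntax added in this hunk can be illustrated with a small parser. The following is a minimal sketch under stated assumptions, not `dstack`'s `CPUSpec` implementation: `CpuSpec` and `parse_cpu` are hypothetical names, and the `min..max` range form is assumed to mirror the `gpu` examples in the same paragraph.

```python
import re
from typing import NamedTuple, Optional


class CpuSpec(NamedTuple):
    """Hypothetical result type, not dstack's CPUSpec model."""
    arch: Optional[str]       # "x86", "arm", or None when omitted
    min_count: Optional[int]  # lower bound on cores, None if open
    max_count: Optional[int]  # upper bound on cores, None if open


# Matches "16", "16..", "8..16" and "..16".
_COUNT = re.compile(r"^(\d+)?(\.\.)?(\d+)?$")


def parse_cpu(value: str) -> CpuSpec:
    # Split an optional "<arch>:" prefix from the core count or range.
    arch, sep, count = value.strip().rpartition(":")
    if sep and arch not in ("x86", "arm"):
        raise ValueError(f"unknown architecture: {arch!r}")
    match = _COUNT.match(count)
    if match is None or (match.group(1) is None and match.group(3) is None):
        raise ValueError(f"invalid core count: {count!r}")
    low, dots, high = match.groups()
    if dots is None:
        # A bare number means exactly that many cores.
        return CpuSpec(arch or None, int(low), int(low))
    return CpuSpec(
        arch or None,
        int(low) if low else None,
        int(high) if high else None,
    )
```

With this sketch, `parse_cpu("arm:8..")` yields an `arm` spec with a lower bound of 8 and no upper bound, while `parse_cpu("16..")` leaves the architecture unset so that it can be inferred later.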

docs/docs/concepts/services.md

Lines changed: 8 additions & 0 deletions
@@ -316,6 +316,8 @@ commands:
 port: 8000
 
 resources:
+  # 16 or more x86_64 cores
+  cpu: 16..
   # 2 GPUs of 80GB
   gpu: 80GB:2
 
@@ -325,10 +327,16 @@ resources:
 
 </div>
 
+The `cpu` property also allows you to specify the CPU architecture, `x86` or `arm`. Examples:
+`x86:16` (16 x86-64 cores), `arm:8..` (at least 8 ARM64 cores).
+If the architecture is not specified, `dstack` tries to infer it from the `gpu` specification
+using `x86` as the fallback value.
+
 The `gpu` property allows specifying not only memory size but also GPU vendor, names
 and their quantity. Examples: `nvidia` (one NVIDIA GPU), `A100` (one A100), `A10G,A100` (either A10G or A100),
 `A100:80GB` (one A100 of 80GB), `A100:2` (two A100), `24GB..40GB:2` (two GPUs between 24GB and 40GB),
 `A100:40GB:2` (two A100 GPUs of 40GB).
+If the vendor is not specified, `dstack` tries to infer it from the GPU name using `nvidia` as the fallback value.
 
 ??? info "Google Cloud TPU"
     To use TPUs, specify its architecture via the `gpu` property.
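Note: the fallback rule described in this hunk (explicit architecture first, then inference from the `gpu` specification, then `x86`) could look roughly like the sketch below. Everything here is an assumption for illustration: `infer_arch` and `ARM_ONLY_GPUS` are hypothetical names, and GH200 is used only because it is a Grace-based superchip of the kind the commit message mentions. The actual inference logic in `dstack` may differ.

```python
from typing import Iterable, Optional

# Assumption: GPU names that are only available alongside ARM64 CPUs.
ARM_ONLY_GPUS = {"GH200"}


def infer_arch(explicit_arch: Optional[str], gpu_names: Iterable[str]) -> str:
    # An explicitly requested architecture always wins.
    if explicit_arch is not None:
        return explicit_arch
    names = {name.upper() for name in gpu_names}
    # Infer ARM only when every requested GPU implies an ARM64 host.
    if names and names <= ARM_ONLY_GPUS:
        return "arm"
    # Fallback value named in the documentation.
    return "x86"
```

Requiring every requested GPU name to imply ARM is a deliberate choice in this sketch: a spec such as `GH200,A100` can be satisfied by an x86 host, so it falls back to `x86`.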

docs/docs/concepts/tasks.md

Lines changed: 8 additions & 0 deletions
@@ -192,6 +192,8 @@ commands:
   - python fine-tuning/qlora/train.py
 
 resources:
+  # 16 or more x86_64 cores
+  cpu: 16..
   # 200GB or more RAM
   memory: 200GB..
   # 4 GPUs from 40GB to 80GB
@@ -204,10 +206,16 @@ resources:
 
 </div>
 
+The `cpu` property also allows you to specify the CPU architecture, `x86` or `arm`. Examples:
+`x86:16` (16 x86-64 cores), `arm:8..` (at least 8 ARM64 cores).
+If the architecture is not specified, `dstack` tries to infer it from the `gpu` specification
+using `x86` as the fallback value.
+
 The `gpu` property allows specifying not only memory size but also GPU vendor, names
 and their quantity. Examples: `nvidia` (one NVIDIA GPU), `A100` (one A100), `A10G,A100` (either A10G or A100),
 `A100:80GB` (one A100 of 80GB), `A100:2` (two A100), `24GB..40GB:2` (two GPUs between 24GB and 40GB),
 `A100:40GB:2` (two A100 GPUs of 40GB).
+If the vendor is not specified, `dstack` tries to infer it from the GPU name using `nvidia` as the fallback value.
 
 ??? info "Google Cloud TPU"
     To use TPUs, specify its architecture via the `gpu` property.
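Note: the `gpu` examples in this hunk combine up to four kinds of colon-separated tokens (vendor, names, memory, count). A rough way to classify them is sketched below. This is not `dstack`'s `GPUSpec` parser: `parse_gpu` is a hypothetical helper, values are kept as plain strings, and the vendor list is an assumption (the documentation above only names `nvidia`).

```python
import re

# Assumption: a short vendor list for illustration only.
VENDORS = {"nvidia", "amd"}
# Memory such as "80GB", "24GB..40GB", "24GB.." or "..40GB".
_MEMORY = re.compile(r"\d+GB(\.\.(\d+GB)?)?|\.\.\d+GB")
# Count such as "2", "2..4", "2.." or "..4".
_COUNT = re.compile(r"\d+(\.\.(\d+)?)?|\.\.\d+")


def parse_gpu(spec: str) -> dict:
    # One GPU is the default, as in the `A100` (one A100) example.
    result = {"vendor": None, "names": None, "memory": None, "count": "1"}
    for token in spec.split(":"):
        if token.lower() in VENDORS:
            result["vendor"] = token.lower()
        elif _COUNT.fullmatch(token):
            result["count"] = token
        elif _MEMORY.fullmatch(token):
            result["memory"] = token
        else:
            # Anything else is treated as a comma-separated list of names.
            result["names"] = token.split(",")
    return result
```

For example, `parse_gpu("A100:40GB:2")` separates the name, the memory size and the count, and `parse_gpu("A10G,A100")` returns both names with the default count.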

docs/docs/reference/api/python/index.md

Lines changed: 11 additions & 0 deletions
@@ -136,10 +136,21 @@ finally:
       show_root_toc_entry: false
       heading_level: 4
       item_id_mapping:
+        cpu: dstack.api.CPU
         gpu: dstack.api.GPU
         memory: dstack.api.Memory
         Range: dstack.api.Range
 
+### `dstack.api.CPU` { #dstack.api.CPU data-toc-label="CPU" }
+
+#SCHEMA# dstack.api.CPU
+    overrides:
+      show_root_heading: false
+      show_root_toc_entry: false
+      heading_level: 4
+      item_id_mapping:
+        Range: dstack.api.Range
+
 ### `dstack.api.GPU` { #dstack.api.GPU data-toc-label="GPU" }
 
 #SCHEMA# dstack.api.GPU

docs/docs/reference/dstack.yml/dev-environment.md

Lines changed: 8 additions & 0 deletions
@@ -35,6 +35,14 @@ The `dev-environment` configuration type allows running [dev environments](../..
         required: true
       item_id_prefix: resources-
 
+#### `resources.cpu` { #resources-cpu data-toc-label="cpu" }
+
+#SCHEMA# dstack._internal.core.models.resources.CPUSpec
+    overrides:
+      show_root_heading: false
+      type:
+        required: true
+
 #### `resources.gpu` { #resources-gpu data-toc-label="gpu" }
 
 #SCHEMA# dstack._internal.core.models.resources.GPUSpec

docs/docs/reference/dstack.yml/fleet.md

Lines changed: 10 additions & 2 deletions
@@ -46,15 +46,23 @@ The `fleet` configuration type allows creating and updating fleets.
         required: true
       item_id_prefix: resources-
 
-#### `resouces.gpu` { #resources-gpu data-toc-label="gpu" }
+#### `resources.cpu` { #resources-cpu data-toc-label="cpu" }
+
+#SCHEMA# dstack._internal.core.models.resources.CPUSpec
+    overrides:
+      show_root_heading: false
+      type:
+        required: true
+
+#### `resources.gpu` { #resources-gpu data-toc-label="gpu" }
 
 #SCHEMA# dstack._internal.core.models.resources.GPUSpec
     overrides:
       show_root_heading: false
       type:
         required: true
 
-#### `resouces.disk` { #resources-disk data-toc-label="disk" }
+#### `resources.disk` { #resources-disk data-toc-label="disk" }
 
 #SCHEMA# dstack._internal.core.models.resources.DiskSpec
     overrides:

docs/docs/reference/dstack.yml/service.md

Lines changed: 10 additions & 2 deletions
@@ -129,15 +129,23 @@ The `service` configuration type allows running [services](../../concepts/servic
         required: true
       item_id_prefix: resources-
 
-#### `resouces.gpu` { #resources-gpu data-toc-label="gpu" }
+#### `resources.cpu` { #resources-cpu data-toc-label="cpu" }
+
+#SCHEMA# dstack._internal.core.models.resources.CPUSpec
+    overrides:
+      show_root_heading: false
+      type:
+        required: true
+
+#### `resources.gpu` { #resources-gpu data-toc-label="gpu" }
 
 #SCHEMA# dstack._internal.core.models.resources.GPUSpec
     overrides:
       show_root_heading: false
       type:
         required: true
 
-#### `resouces.disk` { #resources-disk data-toc-label="disk" }
+#### `resources.disk` { #resources-disk data-toc-label="disk" }
 
 #SCHEMA# dstack._internal.core.models.resources.DiskSpec
     overrides:

docs/docs/reference/dstack.yml/task.md

Lines changed: 10 additions & 2 deletions
@@ -35,15 +35,23 @@ The `task` configuration type allows running [tasks](../../concepts/tasks.md).
         required: true
       item_id_prefix: resources-
 
-#### `resouces.gpu` { #resources-gpu data-toc-label="gpu" }
+#### `resources.cpu` { #resources-cpu data-toc-label="cpu" }
+
+#SCHEMA# dstack._internal.core.models.resources.CPUSpec
+    overrides:
+      show_root_heading: false
+      type:
+        required: true
+
+#### `resources.gpu` { #resources-gpu data-toc-label="gpu" }
 
 #SCHEMA# dstack._internal.core.models.resources.GPUSpec
     overrides:
       show_root_heading: false
       type:
         required: true
 
-#### `resouces.disk` { #resources-disk data-toc-label="disk" }
+#### `resources.disk` { #resources-disk data-toc-label="disk" }
 
 #SCHEMA# dstack._internal.core.models.resources.DiskSpec
     overrides:

docs/docs/reference/environment-variables.md

Lines changed: 5 additions & 2 deletions
@@ -117,8 +117,11 @@ For more details on the options below, refer to the [server deployment](../guide
 * `DSTACK_SERVER_MAX_OFFERS_TRIED` - Sets how many instance offers to try when starting a job.
     Setting a high value can degrade server performance.
 * `DSTACK_RUNNER_VERSION` – Sets exact runner version for debug. Defaults to `latest`. Ignored if `DSTACK_RUNNER_DOWNLOAD_URL` is set.
-* `DSTACK_RUNNER_DOWNLOAD_URL` – Overrides `dstack-runner` binary download URL.
-* `DSTACK_SHIM_DOWNLOAD_URL` – Overrides `dstack-shim` binary download URL.
+* `DSTACK_RUNNER_DOWNLOAD_URL` – Overrides `dstack-runner` binary download URL. The URL can contain `{version}` and/or `{arch}` placeholders,
+    where `{version}` is `dstack` version in the `X.Y.Z` format or `latest`, and `{arch}` is either `amd64` or `arm64`, for example,
+    `https://dstack.example.com/{arch}/{version}/dstack-runner`.
+* `DSTACK_SHIM_DOWNLOAD_URL` – Overrides `dstack-shim` binary download URL. The URL can contain `{version}` and/or `{arch}` placeholders,
+    see `DSTACK_RUNNER_DOWNLOAD_URL` for the details.
 * `DSTACK_DEFAULT_CREDS_DISABLED` – Disables default credentials detection if set. Defaults to `None`.
 * `DSTACK_LOCAL_BACKEND_ENABLED` – Enables local backend for debug if set. Defaults to `None`.
 
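Note: the `{version}` and `{arch}` placeholders described for `DSTACK_RUNNER_DOWNLOAD_URL` and `DSTACK_SHIM_DOWNLOAD_URL` amount to simple template substitution. The sketch below shows one way such a template might be rendered. `render_download_url` is a hypothetical helper, not a function from the `dstack` codebase, and the server may substitute the values differently.

```python
def render_download_url(template: str, version: str, arch: str) -> str:
    # The documentation lists exactly two architecture values.
    if arch not in ("amd64", "arm64"):
        raise ValueError(f"unsupported arch: {arch!r}")
    # Both placeholders are optional; a URL without them is returned unchanged.
    return template.replace("{version}", version).replace("{arch}", arch)


url = render_download_url(
    "https://dstack.example.com/{arch}/{version}/dstack-runner",
    version="latest",
    arch="arm64",
)
```

Plain `str.replace` is used instead of `str.format` so that a URL containing other braces does not raise an error.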

src/dstack/_internal/cli/services/args.py

Lines changed: 2 additions & 2 deletions
@@ -19,8 +19,8 @@ def port_mapping(v: str) -> PortMapping:
     return PortMapping.parse(v)
 
 
-def cpu_spec(v: str) -> resources.Range[int]:
-    return parse_obj_as(resources.Range[int], v)
+def cpu_spec(v: str) -> dict:
+    return resources.CPUSpec.parse(v)
 
 
 def memory_spec(v: str) -> resources.Range[resources.Memory]:
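Note: converters such as `cpu_spec` are typically passed to `argparse` as the `type=` callable, so whatever they return (here a `dict` instead of a `Range[int]`) is what ends up on the parsed namespace. The sketch below illustrates that wiring with a stand-in converter. The `--cpu` option name, the `dest`, and the dictionary keys are assumptions for illustration, and the stand-in does not reproduce `resources.CPUSpec.parse`.

```python
import argparse


def cpu_spec(v: str) -> dict:
    # Stand-in for resources.CPUSpec.parse: split an optional "<arch>:" prefix.
    arch, sep, count = v.rpartition(":")
    return {"arch": arch if sep else None, "count": count}


parser = argparse.ArgumentParser()
# argparse calls the `type` callable on the raw string and stores its result.
parser.add_argument("--cpu", type=cpu_spec, dest="cpu_spec")

args = parser.parse_args(["--cpu", "arm:8.."])
```

After parsing, `args.cpu_spec` holds the dictionary returned by the converter, which is why code that consumes it has to change together with the converter's return type.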
