Releases: dstackai/dstack-enterprise
0.18.10-v1
0.18.10
The update includes all the features and bug fixes from version 0.18.10.
Environment variables interpolation
Previously, it wasn't possible to use environment variables to configure credentials for a private Docker registry. With this update, you can now use the following interpolation syntax to avoid hardcoding credentials in the configuration.
type: dev-environment
name: train
env:
  - DOCKER_USER
  - DOCKER_USERPASSWORD
image: dstackai/base:py3.10-0.4-cuda-12.1
registry_auth:
  username: ${{ env.DOCKER_USER }}
  password: ${{ env.DOCKER_USERPASSWORD }}
Network interfaces for port forwarding
When you run a dev environment or a task with dstack apply, it automatically forwards the remote ports to localhost. However, these ports are, by default, bound to 127.0.0.1. If you'd like to make a port available on an arbitrary host, you can now specify the host using the --host option.
For example, this command will make the port available on all network interfaces:
dstack apply --host 0.0.0.0 -f my-task.dstack.yml
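To make the difference between the two addresses concrete, here is a minimal Python sketch of what binding to 127.0.0.1 versus 0.0.0.0 means at the socket level. It is not dstack's port-forwarding code; the helper name bind_listener is hypothetical and only illustrates the operating-system behavior that the --host option relies on.

```python
import socket


def bind_listener(host: str) -> socket.socket:
    """Open a TCP listener on `host`; port 0 lets the OS pick a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, 0))
    sock.listen()
    return sock


# Loopback only: reachable from this machine, like the default forwarding.
local_only = bind_listener("127.0.0.1")
# Wildcard address: reachable on every IPv4 interface, like --host 0.0.0.0.
all_interfaces = bind_listener("0.0.0.0")

# Record the bound addresses before releasing the sockets.
local_addr = local_only.getsockname()
wildcard_addr = all_interfaces.getsockname()
print(local_addr[0], wildcard_addr[0])

local_only.close()
all_interfaces.close()
```

A listener on the wildcard address accepts connections from other machines on the network, so it is worth using only when the forwarded port is meant to be shared.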
Major bugfixes
- [Bugfix] Fix http services running on 443 in the logs by @r4victor in dstackai/dstack#1522
- [Bugfix] Ensure dstack CLI exits with non-zero exit code on errors by @r4victor in dstackai/dstack#1529
- [Bugfix] Force the use of the root user in custom Docker images by @jvstme in dstackai/dstack#1538
- [Bugfix] Update Docker to 27.1.1 in dstack VM images by @jvstme in dstackai/dstack#1536
Other
- [Feature] Add --host HOST arg to dstack apply command by @un-def in dstackai/dstack#1531
- [Feature] Interpolate env in registry_auth by @r4victor in dstackai/dstack#1540
- [Docs] Document the nvcc property by @peterschmidt85 in dstackai/dstack#1526
- [Internal] Fix unlocking on transaction rollback by @r4victor in dstackai/dstack#1537
- [Internal] Bump base dstack image version to 0.5 by @jvstme in dstackai/dstack#1541
All changes: dstackai/dstack@0.18.9...0.18.10
0.18.9-v1
0.18.9
The update includes all the features and bug fixes from version 0.18.9.
Base Docker image with nvcc
If you don't specify a custom Docker image, dstack uses its own base image with essential CUDA drivers, Python, pip, and conda (Miniforge). Previously, this image didn't include nvcc, which is needed for compiling custom CUDA kernels (e.g., Flash Attention).
With version 0.18.9, you can now include nvcc.
type: task
python: "3.10"
# This line ensures `nvcc` is included into the base Docker image
nvcc: true
commands:
  - pip install -r requirements.txt
  - python train.py
resources:
  gpu: 24GB
Environment variables for on-prem fleets
When you create an on-prem fleet, it's now possible to pre-configure environment variables. These variables will be used when installing the dstack-shim service on hosts and running workloads.
For example, these environment variables can be used to configure dstack to use a proxy:
type: fleet
name: my-fleet
placement: cluster
env:
  - HTTP_PROXY=http://proxy.example.com:80
  - HTTPS_PROXY=http://proxy.example.com:80
  - NO_PROXY=localhost,127.0.0.1
ssh_config:
  user: ubuntu
  identity_file: ~/.ssh/id_rsa
  hosts:
    - 3.255.177.51
    - 3.255.177.52
Examples
New examples include:
Other
- [Bugfix] Fix filtering offers by disk size by @jvstme in dstackai/dstack#1517
- [Bugfix] Run containers as root for all images by @r4victor in dstackai/dstack#1499
- [Docs] Document GCP permissions for volumes by @r4victor in dstackai/dstack#1501
- [Docs] Another batch of docs improvements #1497 by @peterschmidt85 in dstackai/dstack#1498
- [Bugfix] Fix creating TensorDock instances by @jvstme in dstackai/dstack#1506
- [Bugfix] Launch TensorDock instances with correct disk size by @jvstme in dstackai/dstack#1508
- [Bugfix] Set timeouts to TensorDock API requests by @jvstme in dstackai/dstack#1509
- [Docs] Update TensorDock setup instructions by @jvstme in dstackai/dstack#1512
- [Internal] Implement API endpoint for listing volumes across projects by @r4victor in dstackai/dstack#1519
- [Internal] Include Volume.deleted in the API by @r4victor in dstackai/dstack#1520
- [Docs] Update the Axolotl example #1493 by @peterschmidt85 in dstackai/dstack#1494
- [Internal] Print docker image pulling errors to shim.log by @jvstme in dstackai/dstack#1503
- [Feature] Add env setting to fleet config for on-prem fleets by @un-def in dstackai/dstack#1505
Full changelog: https://github.com/dstackai/dstack/releases/0.18.9
0.18.8-v1
0.18.8
The update includes all the features and bug fixes from version 0.18.8.
GCP volumes
Now, volumes are also supported for the gcp backend:
type: volume
name: my-gcp-volume
backend: gcp
region: europe-west1
size: 100GB
Previously, volumes were only supported for aws and runpod.
Major bugfixes
The update fixes a major bug introduced in 0.18.7 that could prevent instances from being terminated in the cloud.
Other
- [Docs] Updated Alignment Handbook example by @peterschmidt85 in dstackai/dstack#1475
- [Fleets] Ensure on-prem fleets' service is initialized after the network goes online by @un-def in dstackai/dstack#1480
- [Fleets] Ensure on-prem fleets update the previous configuration by @un-def in dstackai/dstack#1479
- [UI] Fixed not-working user token rotation by @r4victor in dstackai/dstack#1487
Full changelog: https://github.com/dstackai/dstack/releases/0.18.8
0.18.7-v1
0.18.7
The update brings all the features and bug fixes introduced in version 0.18.7.
Fleets
With fleets, you can now describe clusters declaratively and create them in both cloud and on-prem with a single command. Once a fleet is created, it can be used with dev environments, tasks, and services.
Cloud fleets
To provision a fleet in the cloud, specify the required resources, number of nodes, and other optional parameters.
type: fleet
name: my-fleet
placement: cluster
nodes: 2
resources:
  gpu: 24GB
On-prem fleets
To create a fleet from on-prem servers, specify their hosts along with the user, port, and SSH key for connection via SSH.
type: fleet
name: my-fleet
placement: cluster
ssh_config:
  user: ubuntu
  identity_file: ~/.ssh/id_rsa
  hosts:
    - 3.255.177.51
    - 3.255.177.52
To create or update the fleet, simply call the dstack apply command:
dstack apply -f examples/fleets/my-fleet.dstack.yml
Learn more about fleets in the documentation.
Deprecating dstack run
Now that we support dstack apply for gateways, volumes, and fleets, we have extended this support to dev environments, tasks, and services. Instead of using dstack run WORKING_DIR -f CONFIG_FILE, you can now use dstack apply -f CONFIG_FILE.
Also, it's now possible to specify a name for dev environments, tasks, and services, just like for gateways, volumes, and fleets.
type: dev-environment
name: my-ide
python: "3.11"
ide: vscode
resources:
  gpu: 80GB
This name is used as a run name and is more convenient than a random name. However, if you don't specify a name, dstack will assign a random name as before.
Major bugfixes
Important
This update fixes the kubernetes backend, which had been non-functional for the past few updates.
Other
- [UX] Make --gpu override YAML's gpu by @r4victor in dstackai/dstack#1455 dstackai/dstack#1431
- [Performance] Speed up listing runs for Python API and CLI by @r4victor in dstackai/dstack#1430
- [Performance] Speed up project loading by @r4victor in dstackai/dstack#1425
- [Bugfix] Remove busy offers from the top of offers list by @jvstme in dstackai/dstack#1452
- [Bugfix] Prioritize cheaper offers from the pool by @jvstme in dstackai/dstack#1453
- [Bugfix] Fix spot offers suggested for on-demand dev envs by @jvstme in dstackai/dstack#1450
- [Feature] Implement dstack volume delete by @r4victor in dstackai/dstack#1434
- [UX] Instances were always shown as provisioning for container backends by @r4victor
- [Docs] Fix typos by @jvstme in dstackai/dstack#1426
- [Docs] Fix a bad link by @tamanobi in dstackai/dstack#1422
- [Internal] Add DSTACK_SENTRY_PROFILES_SAMPLE_RATE by @r4victor in dstackai/dstack#1428
- [Internal] Update ruff to 0.5.3 by @jvstme in dstackai/dstack#1421
- [Internal] Update GitHub Actions dependencies by @jvstme in dstackai/dstack#1436
- [Bugfix] Respect regions for runpod by @r4victor in dstackai/dstack#1460
Full changelog: 0.18.7
0.18.6-v1
0.18.6
The update brings all the features and bug fixes introduced in version 0.18.6.
Major fixes
- Support for GitLab's authorization when the repo is using HTTP/HTTPS by @jvstme in dstackai/dstack#1412
- Add a multi-node example to the Hugging Face Alignment Handbook example by @deep-diver in dstackai/dstack#1409
- Fix the issue where idle instances weren't offered (occurred when a GPU name was in upper case) by @jvstme in dstackai/dstack#1417
- Fix the issue where an exception is thrown for non-standard Git repo host URLs using HTTP/HTTPS by @jvstme in dstackai/dstack#1410
- Support H100 with the gcp backend by @jvstme in dstackai/dstack#1405
Warning
If you have idle instances in your pool, it is recommended to re-create them after upgrading to version 0.18.6. Otherwise, there is a risk that these instances won't be able to execute jobs.
Other
- [Internal] Add script for checking OCI images by @jvstme in dstackai/dstack#1408
- Fix repos migration on PostgreSQL by @jvstme in dstackai/dstack#1414
- [Internal] Fix dstack-runner repo tests by @jvstme in dstackai/dstack#1418
- Fix OCI listing not found errors by @jvstme in dstackai/dstack#1407
For more details, check https://github.com/dstackai/dstack/releases/0.18.6
Logs
The logs in the control plane are now displayed properly, even if they include special characters (e.g., progress bars, color codes).

0.18.5rc1-v1
This is a release candidate of the upcoming GA release 0.18.5-v1. This update brings the features introduced with dstack==0.18.5rc1:
0.18.5rc1
Volumes
When you run anything with dstack, it allows you to configure the disk size. However, once the run is finished, if you haven't stored your data in any external storage, all the data on disk will be erased. With 0.18.5, we're adding support for network volumes that allow data to persist across runs.
Once you've created a volume (e.g. named my-new-volume), you can attach it to a dev environment, task, or service.
type: dev-environment
ide: vscode
volumes:
  - name: my-new-volume
    path: /volume_data
The data stored in the volume will persist across runs.
dstack allows you to create new volumes and register existing ones. To learn more about how volumes work, check out the docs.
Important
Volumes are currently experimental and only work with the aws backend. Support for other backends is coming soon.
PostgreSQL
By default, dstack stores its state in /root/.dstack/server/data using SQLite. With this update, it's now possible to configure dstack to store its state in PostgreSQL. Just pass the DSTACK_DATABASE_URL environment variable.
docker run -it -p 3000:3000 \
  -v $HOME/.dstack-enterprise/server/:/root/.dstack/server \
  -e DSTACK_DATABASE_URL="postgresql+asyncpg://myuser:mypassword@myhostname:5432/mydatabase" \
  ghcr.io/dstackai/dstack-enterprise:latest
Note
If you are using PostgreSQL, mounting the /root/.dstack/server folder is optional and is only required if you plan to use a pre-configured /root/.dstack/server/config.yml.
Important
Despite PostgreSQL support, dstack still requires that you run only one instance of the dstack server. However, this requirement will be lifted in a future update.
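Before starting the server, it can help to check that the connection string is well formed. The sketch below splits the example URL into its parts using only the Python standard library. The helper name describe_database_url is hypothetical, and reading the scheme as dialect+driver is an assumption based on the common SQLAlchemy-style URL shape, not something these release notes state.

```python
from urllib.parse import urlsplit


def describe_database_url(url: str) -> dict:
    """Split a dialect+driver://user:password@host:port/database URL into parts."""
    parts = urlsplit(url)
    # Assumption: the scheme has the form "<dialect>+<driver>".
    dialect, _, driver = parts.scheme.partition("+")
    return {
        "dialect": dialect,
        "driver": driver,
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
    }


info = describe_database_url(
    "postgresql+asyncpg://myuser:mypassword@myhostname:5432/mydatabase"
)
print(info)
```

The password is deliberately left out of the returned dictionary so that the result can be logged safely.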
On-prem clusters
Previously, dstack didn't allow the use of on-prem clusters (added via dstack pool add-ssh) if there were no backends configured. This update fixes that bug. Now, you don't have to configure any backends if you only plan to use on-prem clusters.
Supported GPUs
Previously, dstack didn't support L4 and H100 GPUs with AWS. Now you can use them.
Full changelog: 0.18.5rc1
0.18.4-v1
Here's the list of what has changed in this update:
- Gateways are now always read-only in the project settings UI. To create or update a gateway, you must now use dstack apply and dstack delete.
- The UI now loads a lot faster and is more responsive.
- Global administrators can now see running instances for all users across projects via Administration > Pools.
Additionally, the update includes all the new features introduced in versions 0.18.3 and 0.18.4. Please read below for more details.
0.18.3
Configuring VPCs for GCP
The gcp backend now also allows configuring VPCs:
projects:
  - name: main
    backends:
      - type: gcp
        project_id: my-awesome-project
        creds:
          type: default
        vpc_name: my-custom-vpc
The VPC should belong to the same project. If you would like to use a shared VPC from another project, you can also specify vpc_project_id.
Per-region VPC configuration with AWS
Last but not least, for the aws backend, it is now possible to configure VPCs for selected regions:
projects:
  - name: main
    backends:
      - type: aws
        creds:
          type: default
        vpc_ids:
          us-east-1: vpc-0a2b3c4d5e6f7g8h
        default_vpcs: true
To use the default VPC in the other regions, set default_vpcs to true.
Retry policy
We have reworked how to configure the retry policy and how it is applied to runs. Here's an example:
type: task
commands:
  - python train.py
retry:
  on_events: [no-capacity]
  duration: 2h
Now, if you run such a task, dstack will keep looking for capacity for up to 2 hours. Once capacity is found, dstack will run the task.
The on_events property also supports error (in case the run fails with an error) and interruption (if the run is using a spot instance and it was interrupted).
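As a rough mental model of how on_events and duration interact, the following Python sketch retries only on the listed events and only while the time budget lasts. It is an illustration under stated assumptions, not dstack's scheduler: run_with_retry, the "done" outcome, and the delay_seconds parameter are all hypothetical names introduced here.

```python
import time
from typing import Callable


def run_with_retry(
    attempt: Callable[[], str],
    on_events: set[str],
    duration_seconds: float,
    delay_seconds: float = 0.0,
) -> str:
    """Call `attempt` until it succeeds, hits a non-retryable event, or times out."""
    deadline = time.monotonic() + duration_seconds
    while True:
        outcome = attempt()
        # Stop on success or on an event the policy does not cover.
        if outcome == "done" or outcome not in on_events:
            return outcome
        # Stop once the retry budget (the `duration` setting) is used up.
        if time.monotonic() >= deadline:
            return outcome
        time.sleep(delay_seconds)


# Two capacity misses followed by a successful run.
outcomes = iter(["no-capacity", "no-capacity", "done"])
result = run_with_retry(
    lambda: next(outcomes),
    on_events={"no-capacity"},
    duration_seconds=2 * 60 * 60,  # mirrors `duration: 2h`
)
print(result)

# An event that is not listed in on_events is not retried.
not_retried = run_with_retry(
    lambda: "error", on_events={"no-capacity"}, duration_seconds=60
)
print(not_retried)
```

Adding "error" or "interruption" to on_events in this model would make those outcomes retryable as well, which matches the intent of listing them in the configuration.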
Oracle Cloud Infrastructure
With the new update, it is now possible to run workloads with your Oracle Cloud Infrastructure (OCI) account. The backend is called oci and can be configured as follows:
projects:
  - name: main
    backends:
      - type: oci
        creds:
          type: default
The supported credential types include default and client. In case default is used, dstack automatically picks the default OCI credentials from ~/.oci/config.
Just like other backends, oci supports dev environments, tasks, and services.
Important
Currently, the oci backend does not support creating gateways. This feature is coming soon.
Full changelog: 0.18.3
0.18.4
Private subnets with GCP
Additionally, the update allows configuring the gcp backend to use only private subnets. To achieve this, set public_ips to false.
projects:
  - name: main
    backends:
      - type: gcp
        creds:
          type: default
        public_ips: false
Google Cloud TPU
To request a TPU, specify the TPU architecture prefixed by tpu- (in gpu under resources):
type: task
python: "3.11"
commands:
  - pip install torch~=2.3.0 torch_xla[tpu]~=2.3.0 torchvision -f https://storage.googleapis.com/libtpu-releases/index.html
  - git clone --recursive https://github.com/pytorch/xla.git
  - python3 xla/test/test_train_mp_imagenet.py --fake_data --model=resnet50 --num_epochs=1
resources:
  gpu: tpu-v2-8
Important
Currently, you can only request 8 TPU cores. This means only single TPU device workloads are supported. Support for multiple TPU devices is coming soon.
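The naming scheme can be read as the tpu- prefix, then the architecture, then the core count. The Python sketch below splits a value such as tpu-v2-8 along those lines. It is only an illustration: parse_tpu_spec is a hypothetical helper, and treating the trailing number as the core count is an assumption drawn from the note above, not dstack's actual parsing logic.

```python
def parse_tpu_spec(gpu: str) -> tuple[str, int]:
    """Split a value like "tpu-v2-8" into (architecture, cores)."""
    prefix = "tpu-"
    if not gpu.startswith(prefix):
        raise ValueError(f"not a TPU spec: {gpu!r}")
    # Assumption: the last dash-separated field is the number of cores.
    architecture, _, cores = gpu[len(prefix):].rpartition("-")
    if not architecture or not cores.isdigit():
        raise ValueError(f"malformed TPU spec: {gpu!r}")
    return architecture, int(cores)


architecture, cores = parse_tpu_spec("tpu-v2-8")
# Per the note above, only 8-core (single-device) requests are supported.
is_single_device = cores == 8
print(architecture, cores, is_single_device)
```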
Full changelog: 0.18.4
0.18.2-v2
0.18.2
This update brings all the new features introduced with 0.18.2:
- On-prem clusters: The dstack pool add-ssh command now supports the --network argument. This argument allows you to configure the private network. If you add multiple on-prem instances sharing the same private network, you'll be able to use these instances as a cluster to run multi-node tasks.
- Private subnets: The aws backend now allows setting public_ips to false. In this case, instances will be created in private subnets only to ensure maximum security.
- Gateways: It's now possible to define a gateway configuration via YAML and create or update it using the dstack apply command. For AWS, gateways now support the public_ips and certificate properties. Use them to run gateways in a private network under the load balancer. The certificate property lets you specify the ARN of a certificate from AWS Certificate Manager.
Refer to the 0.18.2 release notes for more details.
Editing backend settings via UI
Additionally, the update introduces a YAML-based code editor to create and edit backend settings via the UI.
It's now possible to add backends using the same YAML syntax as in ~/.dstack/server/config.yml.