Releases: dstackai/dstack-enterprise

0.19.7-v1

01 May 14:07
89d5de7

Plugins

Run configurations have many options. While dstack aims to simplify them and provide sensible defaults, teams may sometimes want to enforce their own defaults and configurations across projects.

To support this, we're introducing a plugin system that allows such enforcements to be defined programmatically. You can now define a plugin using dstack's Python SDK and bundle it with the dstack server.

For example, you can create your own plugin to override run configuration options—e.g., to prepend commands, set policies, and more.
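
To make the idea concrete, here is a minimal sketch of the kind of enforcement a plugin could apply. It is plain Python over a dictionary and deliberately does not use dstack's plugin API; the function name `enforce_team_policy`, the setup command, and the price cap are hypothetical and chosen only for illustration.

```python
from copy import deepcopy

# Hypothetical team-wide settings; these are not part of dstack.
SETUP_COMMANDS = ["pip install -r requirements.txt"]
MAX_PRICE_CAP = 2.0


def enforce_team_policy(run_config: dict) -> dict:
    """Return a copy of a run configuration with team defaults enforced."""
    config = deepcopy(run_config)
    # Prepend the team's setup commands to whatever the user specified.
    config["commands"] = SETUP_COMMANDS + config.get("commands", [])
    # Cap the price: keep the user's value only if it does not exceed the cap.
    user_price = config.get("max_price")
    if user_price is None or user_price > MAX_PRICE_CAP:
        config["max_price"] = MAX_PRICE_CAP
    return config


original = {"type": "task", "commands": ["python train.py"], "max_price": 5.0}
enforced = enforce_team_policy(original)
print(enforced["commands"])
print(enforced["max_price"])
```

A real plugin would apply the same kind of transformation to the run specification it receives from the server; the actual classes and hooks are described in the plugin documentation.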

For more information on plugin development, see the documentation and example.

Note

Plugins are currently an experimental feature. Backward compatibility is not guaranteed between releases.

Tenstorrent

The new update introduces initial support for Tenstorrent's Wormhole accelerators.

Now, if you create SSH fleets with hosts that have N150 or N300 PCIe boards, dstack will automatically detect them and allow you to use such a fleet for running dev environments, tasks, and services.

Dedicated examples for using dstack with Tenstorrent's accelerators will be published soon.

What's changed

Full changelog: dstackai/dstack@0.19.5...0.19.7

0.19.5-v1

23 Apr 10:41
89d5de7

CLI

Offers

You can now list available offers (hardware configurations) from the configured backends using the CLI—without needing to define a run configuration. Just run dstack offer and specify the resource requirements. The CLI will output available offers, including backend, region, instance type, resources, spot availability, and pricing:

$ dstack offer --gpu H100:1.. --max-offers 10

 #   BACKEND     REGION     INSTANCE TYPE          RESOURCES                                     SPOT  PRICE   
 1   datacrunch  FIN-01     1H100.80S.30V          30xCPU, 120GB, 1xH100 (80GB), 100.0GB (disk)  no    $2.19   
 2   datacrunch  FIN-02     1H100.80S.30V          30xCPU, 120GB, 1xH100 (80GB), 100.0GB (disk)  no    $2.19   
 3   datacrunch  FIN-02     1H100.80S.32V          32xCPU, 185GB, 1xH100 (80GB), 100.0GB (disk)  no    $2.19   
 4   datacrunch  ICE-01     1H100.80S.32V          32xCPU, 185GB, 1xH100 (80GB), 100.0GB (disk)  no    $2.19   
 5   runpod      US-KS-2    NVIDIA H100 PCIe       16xCPU, 251GB, 1xH100 (80GB), 100.0GB (disk)  no    $2.39   
 6   runpod      CA         NVIDIA H100 80GB HBM3  24xCPU, 251GB, 1xH100 (80GB), 100.0GB (disk)  no    $2.69   
 7   nebius      eu-north1  gpu-h100-sxm           16xCPU, 200GB, 1xH100 (80GB), 100.0GB (disk)  no    $2.95   
 8   runpod      AP-JP-1    NVIDIA H100 80GB HBM3  20xCPU, 251GB, 1xH100 (80GB), 100.0GB (disk)  no    $2.99   
 9   runpod      CA-MTL-1   NVIDIA H100 80GB HBM3  28xCPU, 251GB, 1xH100 (80GB), 100.0GB (disk)  no    $2.99   
 10  runpod      CA-MTL-2   NVIDIA H100 80GB HBM3  26xCPU, 125GB, 1xH100 (80GB), 100.0GB (disk)  no    $2.99   
     ...                                                                                                                
 Shown 10 of 99 offers, $127.816 max

Learn more about how the new command works in the CLI reference.

Configuration

Resource tags

It's now possible to set custom resource-level tags using the new tags property:

type: dev-environment
ide: vscode
tags:
  my_custom_tag: some_value
  another_tag: another_value_123

The tags property is supported by all configuration types: runs, fleets, volumes, gateways, and profiles. The tags are propagated to the underlying cloud resources on backends that support tags. Currently, these are AWS, Azure, and GCP.

Shell configuration

With the new shell property, you can specify the shell used to run commands (or init for dev environments):

type: task
image: ubuntu

shell: bash
commands:
  # now we can use Bash features, e.g., arrays:
  - words=(dstack is)
  - words+=(awesome)
  - echo ${words[@]}  # prints "dstack is awesome"

GCP

A3 High and A3 Edge

dstack now automatically sets up GCP A3 High and A3 Edge instances with GPUDirect-TCPX optimized NCCL communication.

An example on how to provision an A3 High cluster and run NCCL tests on it using dstack is coming soon!

Volumes

Total cost

The UI now shows each volume's total cost and termination date alongside its price. Previously, only the price was shown.

What's changed

Full changelog: dstackai/dstack@0.19.4...0.19.5

0.19.4-v1

17 Apr 10:36
89d5de7

Rate limits for services

You can now configure rate limits for your services running behind a gateway.

type: service
image: my-app:latest
port: 80

rate_limits:
# For /api/auth/* - 1 request per second, no bursts
- prefix: /api/auth/
  rps: 1
# For other URLs - 4 requests per second + bursts of up to 9 requests
- rps: 4
  burst: 9
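
As a rough illustration of how such a rule list could be resolved for an incoming request, the sketch below picks the rule with the longest matching prefix. Both the longest-prefix behavior and the treatment of a rule without prefix as matching every path are assumptions made for this example; `select_rule` is a hypothetical helper, not part of dstack.

```python
from typing import Optional

# Rules mirroring the configuration above.
RATE_LIMITS = [
    {"prefix": "/api/auth/", "rps": 1},
    {"rps": 4, "burst": 9},
]


def select_rule(path: str, rules: list) -> Optional[dict]:
    """Pick the rule for a path; the longest matching prefix wins (assumption)."""
    # A rule without an explicit prefix is treated as matching "/" (assumption).
    matching = [r for r in rules if path.startswith(r.get("prefix", "/"))]
    if not matching:
        return None
    return max(matching, key=lambda r: len(r.get("prefix", "/")))


print(select_rule("/api/auth/login", RATE_LIMITS))
print(select_rule("/index.html", RATE_LIMITS))
```

With the rules above, a request to /api/auth/login falls under the 1 rps limit, while any other path falls under the 4 rps limit with bursts of up to 9 requests.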

Examples: TensorRT-LLM and Llama 4

We added a new example on TensorRT-LLM that shows how to deploy both DeepSeek R1 and its distilled version using TensorRT-LLM and dstack.

The Llama example was updated to demonstrate the deployment of Llama 4 Scout using dstack.

Improved contributing experience

We continue to make contributing to dstack easier and to improve the dev experience. Since the last release, we moved from pip to uv in CI and dev pipelines. Dependency installation time dropped from ~70 seconds to less than 10 seconds. The Development guide was updated to show how to set up a dstack development environment with uv. The CI Build pipeline triggered on pull requests was optimized from 9 minutes to 4 minutes.

We also documented uv as one of the recommended installation options for dstack.

What's Changed

Full Changelog: dstackai/dstack@0.19.3...0.19.4

0.19.3-v1

10 Apr 10:49
89d5de7

Optimized networking for GCP H100 clusters

dstack now automatically sets up GCP A3 Mega instances with GPUDirect-TCPXO optimized NCCL communication to take advantage of the 1800Gbps maximum network bandwidth. Here are the NCCL test results on an A3 Mega cluster provisioned with dstack:

✗ dstack apply -f examples/misc/a3mega-clusters/nccl-tests.dstack.yml 

nccl-tests provisioning completed (running)
nThread 1 nGpus 1 minBytes 8388608 maxBytes 8589934592 step: 2(factor) warmup iters: 5 iters: 200 agg iters: 1 validation: 0 graph: 0

                                                             out-of-place                       in-place          
      size         count      type   redop    root     time   algbw   busbw #wrong     time   algbw   busbw #wrong
       (B)    (elements)                               (us)  (GB/s)  (GB/s)            (us)  (GB/s)  (GB/s)       
     8388608        131072     float    none      -1    166.6   50.34   47.19    N/A    164.1   51.11   47.92    N/A
    16777216        262144     float    none      -1    204.6   82.01   76.89    N/A    203.8   82.30   77.16    N/A
    33554432        524288     float    none      -1    284.0  118.17  110.78    N/A    281.7  119.12  111.67    N/A
    67108864       1048576     float    none      -1    447.4  150.00  140.62    N/A    443.5  151.31  141.86    N/A
   134217728       2097152     float    none      -1    808.3  166.05  155.67    N/A    801.9  167.38  156.92    N/A
   268435456       4194304     float    none      -1   1522.1  176.36  165.34    N/A   1518.7  176.76  165.71    N/A
   536870912       8388608     float    none      -1   2892.3  185.62  174.02    N/A   2894.4  185.49  173.89    N/A
  1073741824      16777216     float    none      -1   5532.7  194.07  181.94    N/A   5530.7  194.14  182.01    N/A
  2147483648      33554432     float    none      -1    10863  197.69  185.34    N/A    10837  198.17  185.78    N/A
  4294967296      67108864     float    none      -1    21481  199.94  187.45    N/A    21466  200.08  187.58    N/A
  8589934592     134217728     float    none      -1    42713  201.11  188.54    N/A    42701  201.16  188.59    N/A
Out of bounds values : 0 OK
Avg bus bandwidth    : 146.948 

Done

For more information on how to provision and use A3 Mega clusters with GPUDirect-TCPXO, see the A3 Mega example.
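
The "Avg bus bandwidth" line in the output above appears to be the mean of the busbw values across both the out-of-place and in-place columns. The sketch below recomputes it from the table; that this is how the figure is derived is an inference from the numbers shown here, not something the release notes state.

```python
# busbw columns (GB/s) copied from the NCCL test output above.
OUT_OF_PLACE = [47.19, 76.89, 110.78, 140.62, 155.67, 165.34,
                174.02, 181.94, 185.34, 187.45, 188.54]
IN_PLACE = [47.92, 77.16, 111.67, 141.86, 156.92, 165.71,
            173.89, 182.01, 185.78, 187.58, 188.59]

# Average over every measurement, regardless of message size or mode.
values = OUT_OF_PLACE + IN_PLACE
avg_busbw = sum(values) / len(values)
print(f"Avg bus bandwidth: {avg_busbw:.3f} GB/s")
```

The result agrees with the reported 146.948 to within the rounding of the table values. Because the average includes small message sizes, it sits well below the bandwidth reached at the largest sizes.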

H200 and B200 support on DataCrunch

You can now provision H200 and B200 instances on DataCrunch. DataCrunch is the first dstack backend to support B200:

✗ dstack apply --gpu B200
 Project              main                                   
 User                 admin                                  
 Configuration        .dstack.yml                            
 Type                 dev-environment                        
 Resources            1..xCPU, 2GB.., 1xB200, 100GB.. (disk) 
 Max price            -                                      
 Max duration         -                                      
 Inactivity duration  -                                      
 Spot policy          auto                                   
 Retry policy         -                                      
 Creation policy      reuse-or-create                        
 Idle duration        5m                                     
 Reservation          -                                      

 #  BACKEND     REGION  INSTANCE   RESOURCES                                      SPOT  PRICE                
 1  datacrunch  FIN-03  1B200.31V  31xCPU, 250GB, 1xB200 (180GB), 100.0GB (disk)  yes   $1.3                 
 2  datacrunch  FIN-03  1B200.31V  31xCPU, 250GB, 1xB200 (180GB), 100.0GB (disk)  no    $4.49
 3  datacrunch  FIN-01  1B200.31V  31xCPU, 250GB, 1xB200 (180GB), 100.0GB (disk)  yes   $1.3   not available 
    ...                                                                                                      
 Shown 3 of 8 offers, $4.49 max

Submit a new run? [y/n]:                        

CUDO improvements

The CUDO backend is updated to support H100, A100, A40 and all other GPUs currently offered by CUDO.

fleets configuration property

With the new fleets property and the --fleet option of dstack apply, it's now possible to restrict the set of fleets considered for reuse:

type: task

fleets: [my-fleet-1, my-fleet-2]

or

dstack apply --fleet my-fleet-1 --fleet my-fleet-2

What's Changed

Full Changelog: dstackai/dstack@0.19.2...0.19.3

0.19.2-v1

03 Apr 14:04
89d5de7

Nebius

This update introduces an integration with Nebius, a cloud provider offering top-tier NVIDIA GPUs at competitive prices.

$ dstack apply
 #  BACKEND  REGION     RESOURCES                        SPOT  PRICE
 1  nebius   eu-north1  8xCPU, 32GB, 1xL40S (48GB)       no    $1.5484
 2  nebius   eu-north1  16xCPU, 200GB, 1xH100 (80GB)     no    $2.95
 3  nebius   eu-north1  16xCPU, 200GB, 1xH200 (141GB)    no    $3.5
 4  nebius   eu-north1  64xCPU, 384GB, 2xL40S (48GB)     no    $4.5688
 5  nebius   eu-north1  128xCPU, 768GB, 4xL40S (48GB)    no    $9.1376
 6  nebius   eu-north1  128xCPU, 1600GB, 8xH100 (80GB)   no    $23.6
 7  nebius   eu-north1  128xCPU, 1600GB, 8xH200 (141GB)  no    $28

The new nebius backend supports CPU and GPU instances, fleets, distributed tasks, and more. Support for network volumes and enhanced inter-node connectivity is coming in future releases. See the docs for instructions on configuring Nebius in your dstack project.

Metrics

This release brings a long-awaited feature: the Metrics page in the UI.

In addition, the dstack stats command was renamed to dstack metrics and updated. Previously, the maximum value of CPU utilization depended on the number of CPUs (for example, it was 400% for a 4-core CPU); now it is normalized to 100%.

$ dstack metrics nccl-tests
 NAME        CPU  MEMORY            GPU
 nccl-tests  81%  2754MB/1638400MB  #0 100740MB/144384MB 100% Util
                                    #1 100740MB/144384MB 100% Util
                                    #2 100740MB/144384MB 99% Util
                                    #3 100740MB/144384MB 99% Util
                                    #4 100740MB/144384MB 99% Util
                                    #5 100740MB/144384MB 99% Util
                                    #6 100740MB/144384MB 99% Util
                                    #7 100740MB/144384MB 100% Util
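
The normalization described above amounts to dividing the summed per-core utilization by the number of CPUs. A minimal sketch, where `normalize_cpu_percent` is a hypothetical helper and not the dstack implementation:

```python
def normalize_cpu_percent(raw_percent: float, cpu_count: int) -> float:
    """Scale a CPU percentage summed over cores to a 0-100% range."""
    if cpu_count <= 0:
        raise ValueError("cpu_count must be positive")
    return raw_percent / cpu_count


# A fully loaded 4-core CPU: 400% before the change, 100% after.
print(normalize_cpu_percent(400.0, 4))
# Two of four cores busy: 200% before the change, 50% after.
print(normalize_cpu_percent(200.0, 4))
```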

What's Changed

Full Changelog: dstackai/dstack@0.19.1...0.19.2

0.19.1-v1

26 Mar 12:36
89d5de7

Metrics

With this update, we've added more metrics that you can export to Prometheus. The new metrics allow tracking job CPU and system memory utilization, user and project usage stats, success/error rate, and more.

Runs

Name                               Type     Description                    Examples
dstack_run_count_total             counter  The total number of runs       537
dstack_run_count_terminated_total  counter  The number of terminated runs  118
dstack_run_count_failed_total      counter  The number of failed runs      27
dstack_run_count_done_total        counter  The number of successful runs  218

Run jobs

Name                                 Type     Description                                          Examples
dstack_job_cpu_count                 gauge    Job CPU count                                        32.0
dstack_job_cpu_time_seconds_total    counter  Total CPU time consumed by the job, seconds          11.727975
dstack_job_memory_total_bytes        gauge    Total memory allocated for the job, bytes            4009754624.0
dstack_job_memory_usage_bytes        gauge    Memory used by the job (including cache), bytes      339017728.0
dstack_job_memory_working_set_bytes  gauge    Memory used by the job (not including cache), bytes  147251200.0

For more details on metrics, see the Metrics documentation.
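
Going by the descriptions of the job memory metrics, the difference between usage and working set should correspond to cache. The sketch below works this out for the example values listed above; it is plain arithmetic on those sample numbers, and the variable names are illustrative.

```python
# Example values from the "Run jobs" metrics above, in bytes.
memory_total = 4009754624
memory_usage = 339017728        # includes cache
memory_working_set = 147251200  # does not include cache

# Cache is what remains after subtracting the working set from usage.
cache_bytes = memory_usage - memory_working_set
# Share of the allocated memory that the job actively uses.
working_set_fraction = memory_working_set / memory_total

print(f"cache: {cache_bytes} bytes")
print(f"working set: {working_set_fraction:.1%} of allocated memory")
```

For alerting on memory pressure, the working set is usually the more meaningful of the two, since cache can typically be reclaimed.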

Major bugfixes

Fixed a bug introduced in 0.19.0 where the working directory in the container was incorrectly set by default to / instead of /workflow.

What's changed

Full changelog: dstackai/dstack@0.19.0...0.19.1

0.19.0-v1

20 Mar 10:52
89d5de7

Simplified backend integration

To provide the best multi-cloud experience and GPU availability, dstack integrates with many cloud GPU providers, including AWS, Azure, GCP, RunPod, Lambda, Vultr, and others. As we'd like to see even more GPU providers supported by dstack, this release comes with a major internal refactoring aimed at simplifying the process of adding new integrations. See the Backend integration guide for more details. Join our Discord if you have any questions about the integration process.

MPI workloads and NCCL tests

dstack now configures internode SSH connectivity for distributed tasks. You can log in to any node from any node via SSH with a simple ssh <node_ip> command. The out-of-the-box SSH connectivity also allows running mpirun. See the NCCL Tests example.

Cost and usage metrics

In addition to DCGM metrics, dstack now exports a set of Prometheus metrics for cost and usage tracking. Here's how it may look in the Grafana dashboard:

[Image: example Grafana dashboard showing dstack cost and usage metrics]

See the documentation for a full list of metrics and labels.

Cursor IDE support

dstack can now launch Cursor dev environments. Just specify ide: cursor in the run configuration:

type: dev-environment
ide: cursor

Deprecations

  • The Python API methods get_plan(), exec_plan(), and submit() are deprecated in favor of get_run_plan(), apply_plan(), and apply_configuration(). The deprecated methods had clumsy signatures with many top-level parameters. The new signatures align better with the CLI and HTTP API.

Breaking changes

The 0.19.0 release drops several previously deprecated or undocumented features. There are no other significant breaking changes. The 0.19.0 server continues to support 0.18.x CLI versions. But the 0.19.0 CLI does not work with older 0.18.x servers, so you should update the server first or the server and the clients simultaneously.

  • Drop the dstack run CLI command.
  • Drop the --attach mode for the dstack logs CLI command.
  • Drop Pools functionality:
    • The dstack pool CLI commands.
    • The /api/project/{project_name}/runs/get_offers, /api/project/{project_name}/runs/create_instance, /api/pools/list_instances, and /api/project/{project_name}/pool/* API endpoints.
    • pool_name and instance_name parameters in profiles and run configurations.
  • Remove retry_policy from profiles.
  • Remove termination_idle_time and termination_policy from profiles and fleet configurations.
  • Drop RUN_NAME and REPO_ID run environment variables.
  • Drop the /api/backends/config_values endpoint used for interactive configuration.
  • The API accepts and returns azure_config["regions"] instead of azure_config["locations"] (unified with server/config.yml).

What's Changed

Full Changelog: dstackai/dstack@0.18.44...0.19.0

0.18.44-v1

05 Mar 14:08
233bdeb

Single Sign-On via Microsoft Entra ID

dstack Enterprise now supports Single Sign-On via Microsoft Entra ID (formerly Azure AD). When Entra ID integration is configured, the dstack login page will display the Sign in with Microsoft Entra ID button. Users can log in to dstack using their Entra ID account without entering any dstack-specific credentials. See the Entra ID integration guide for more information.

GPU utilization policy

To avoid wasting resources, you can now specify a minimum required GPU utilization for the run. If any GPU has utilization below the threshold in all samples within a time window, the run is terminated.

type: task

utilization_policy:
  min_gpu_utilization: 30
  time_window: 30m

resources:
  gpu: nvidia:8:24GB

In this example, if any of the 8 GPUs has utilization below 30% in all samples during the last 30 minutes, the run will be terminated.
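
The termination rule can be sketched as follows. This is an illustration of the logic as described here, not dstack's implementation; `should_terminate` and the sample data are hypothetical, and the samples are assumed to be already restricted to the time window.

```python
def should_terminate(samples_by_gpu: dict, min_gpu_utilization: int) -> bool:
    """True if any GPU stayed below the threshold in every sample of the window."""
    return any(
        # A GPU with no samples is not treated as idle (assumption).
        len(samples) > 0 and all(s < min_gpu_utilization for s in samples)
        for samples in samples_by_gpu.values()
    )


# Utilization samples (%) per GPU index within the time window.
window = {
    0: [95, 90, 97],  # busy
    1: [10, 45, 12],  # one sample above 30%, so this GPU does not trigger
    2: [5, 8, 3],     # below 30% in all samples, so this GPU triggers
}
print(should_terminate(window, 30))
```

Note that a single sample at or above the threshold is enough to keep a GPU from triggering termination, so short bursts of activity reset the condition for that window.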

DCGM metrics

dstack can now collect and export NVIDIA DCGM metrics from running jobs on supported backends (AWS, Azure, GCP, OCI) and SSH fleets.

Metrics are disabled by default. See the documentation for how to enable and scrape them.

RunPod Community Cloud

In addition to Secure Cloud, dstack will now use Community Cloud offers in the runpod backend. Community Cloud offers are usually cheaper and can be identified by a two-letter region code.

$ dstack apply -f .dstack.yml -b runpod
 #  BACKEND  REGION    INSTANCE               SPOT  PRICE
 1  runpod   CA        NVIDIA A100 80GB PCIe  yes   $0.6
 2  runpod   CA-MTL-3  NVIDIA A100 80GB PCIe  yes   $0.82

It is possible to opt out of using Community Cloud in the backend settings.
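
Based on the two-letter convention mentioned above, Community Cloud offers could be told apart from Secure Cloud ones with a simple check on the region code. This is a heuristic sketch; `is_community_cloud_region` is a hypothetical helper and relies only on the naming pattern described here.

```python
def is_community_cloud_region(region: str) -> bool:
    """Heuristic: Community Cloud regions use a two-letter code such as "CA"."""
    return len(region) == 2 and region.isalpha()


for region in ["CA", "CA-MTL-3"]:
    kind = "Community Cloud" if is_community_cloud_region(region) else "Secure Cloud"
    print(f"{region}: {kind}")
```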

Note

If you've previously configured the runpod backend via the dstack UI, your backend settings will likely contain a fixed set of regions. Previous dstack versions used to add it automatically. You can remove the regions property to allow all regions, including two-letter Community Cloud regions.

What's Changed

Full Changelog: dstackai/dstack@0.18.43...0.18.44

0.18.43-v1

26 Feb 11:34

CLI autocompletion

The dstack CLI now supports shell autocompletion for bash and zsh. It suggests completions for subcommands:

✗ dstack s
server  -- Start a server
stats   -- Show run stats
stop    -- Stop a run

and dynamic completions for resource names:

✗ dstack logs m
mighty-chicken-1  mighty-crab-1  my-dev  --

To set up the CLI autocompletion for your shell, follow the Installation guide.

max_duration set to off by default

The max_duration parameter that controls how long a run is allowed to run before stopping automatically is now set to off by default for all run configuration types. This means that dstack won't stop runs automatically unless max_duration is specified explicitly.

Previously, the max_duration defaults were 72h for tasks, 6h for dev environments, and off for services. This led to unintended run terminations and confused users who were unaware of max_duration. The new default makes max_duration opt-in and, thus, predictable.

If you relied on the previous max_duration defaults, ensure you've added max_duration to your run configurations.
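
If you manage many run configurations, the previous defaults can be pinned explicitly with a small script. The sketch below operates on configurations already loaded as dictionaries; `pin_previous_max_duration` is a hypothetical helper, and loading and saving the YAML files is left out.

```python
# Defaults that applied before this release, keyed by configuration type.
PREVIOUS_DEFAULTS = {"task": "72h", "dev-environment": "6h", "service": "off"}


def pin_previous_max_duration(config: dict) -> dict:
    """Return a copy with max_duration set explicitly if it was left unset."""
    pinned = dict(config)
    if "max_duration" not in pinned:
        pinned["max_duration"] = PREVIOUS_DEFAULTS[pinned["type"]]
    return pinned


print(pin_previous_max_duration({"type": "task", "commands": ["python train.py"]}))
print(pin_previous_max_duration({"type": "dev-environment", "max_duration": "2h"}))
```

Configurations that already set max_duration are left untouched, so the script is safe to run repeatedly.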

GCP Logging for run logs

The dstack server requires storing run logs externally for multi-replica server deployments. Previously, the only supported external storage was AWS CloudWatch, which limited production server deployments to AWS. Now the dstack server also supports GCP Logging for storing run logs. See the Server deployment guide for more information.

Custom IAM instance profile for AWS

The AWS backend config gets a new iam_instance_profile parameter that specifies the IAM instance profile to associate with provisioned EC2 instances. You can also specify the IAM role name for roles created via the AWS console, as AWS automatically creates an instance profile and gives it the same name as the role:

projects:
- name: main
  backends:
  - type: aws
    iam_instance_profile: dstack-test-role
    creds:
      type: default

This can be used to access AWS resources from runs without passing credentials explicitly.

Oracle Cloud spot instances

The oci backend can now provision interruptible spot instances, providing more cost-effective GPUs for workloads that can recover from interruptions.

> dstack apply --gpu 1.. --spot -b oci
 #  BACKEND  REGION          INSTANCE   RESOURCES                                    SPOT  PRICE     
 1  oci      eu-frankfurt-1  VM.GPU2.1  24xCPU, 72GB, 1xP100 (16GB), 50.0GB (disk)   yes   $0.6375   
 2  oci      eu-frankfurt-1  VM.GPU3.1  12xCPU, 90GB, 1xV100 (16GB), 50.0GB (disk)   yes   $1.475    
 3  oci      eu-frankfurt-1  VM.GPU3.2  24xCPU, 180GB, 2xV100 (16GB), 50.0GB (disk)  yes   $2.95

Breaking changes

  • Dropped support for python: 3.8 in run configuration.
  • Set max_duration to off by default for all run configuration types.

What's Changed

Full Changelog: dstackai/dstack@0.18.42...0.18.43

0.18.42-v1

17 Feb 10:16

Volume attachments

It's now possible to see volume attachments when listing volumes. The dstack volume -v command shows which fleets the volumes are attached to in the ATTACHED column:

✗ dstack volume -v
 NAME             BACKEND  REGION                       STATUS  ATTACHED  CREATED      ERROR 
 my-gcp-volume-1  gcp      europe-west4                 active  my-dev    1 weeks ago        
                           (europe-west4-c)                                                  
 my-aws-volume-1  aws      eu-west-1 (eu-west-1a)       active  -         3 days ago         

This can help you decide if you should use an existing volume for a run or create a new volume if all volumes are occupied.

You can also check which volumes are currently attached and which are not via the API:

import os
import requests

# Server URL, user token, and project name are read from the environment.
url = os.environ["DSTACK_URL"]
token = os.environ["DSTACK_TOKEN"]
project = os.environ["DSTACK_PROJECT"]

print("Getting volumes...")
resp = requests.post(
    url=f"{url}/api/project/{project}/volumes/list",
    headers={"Authorization": f"Bearer {token}"},
)
resp.raise_for_status()  # fail early on HTTP errors instead of parsing an error body
volumes = resp.json()

print("Checking volumes attachments...")
for volume in volumes:
    # A volume counts as attached if its attachments list is non-empty.
    is_attached = len(volume["attachments"]) > 0
    print(f"Volume {volume['name']} attached: {is_attached}")

✗ python check_attachments.py
Getting volumes...
Checking volumes attachments...
Volume my-gcp-volume-1 attached: True
Volume my-aws-volume-1 attached: False

Bugfixes

This release contains several important bugfixes including a bugfix for fleets with placement: cluster (#2302).

What's Changed

Full Changelog: dstackai/dstack@0.18.41...0.18.42