Releases · dstackai/dstack-enterprise

01 May 14:07

un-def

0.19.7-v1

89d5de7

0.19.7-v1

Plugins

Run configurations have many options. While dstack aims to simplify them and provide rational defaults, teams may sometimes want to enforce their own defaults and configurations across projects.

To support this, we're introducing a plugin system that allows such enforcements to be defined programmatically. You can now define a plugin using dstack's Python SDK and bundle it with the dstack server.

For example, you can create your own plugin to override run configuration options—e.g., to prepend commands, set policies, and more.

For more information on plugin development, see the documentation and example.
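
As a rough illustration of the kind of enforcement a plugin might perform, the sketch below prepends commands and applies a default policy to a run configuration represented as a plain dictionary. It deliberately does not use dstack's plugin classes: the `enforce_team_defaults` function, the constants, and the dictionary layout are hypothetical stand-ins, so consult the plugin documentation for the real interface.

```python
import copy
from typing import Any, Dict, List

# Hypothetical team-wide settings; these are not part of dstack itself.
REQUIRED_PREFIX_COMMANDS: List[str] = ["echo 'Running under team policy'"]
DEFAULT_MAX_PRICE = 2.0  # illustrative price cap, dollars per hour


def enforce_team_defaults(spec: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a run configuration with team defaults applied."""
    result = copy.deepcopy(spec)  # never mutate the caller's configuration
    # Prepend mandatory commands to configurations that run commands.
    if result.get("type") == "task":
        result["commands"] = REQUIRED_PREFIX_COMMANDS + result.get("commands", [])
    # Apply the default only when the user did not choose a value.
    result.setdefault("max_price", DEFAULT_MAX_PRICE)
    return result


original = {"type": "task", "commands": ["python train.py"]}
enforced = enforce_team_defaults(original)
print(enforced["commands"])
print(enforced["max_price"])
```

Returning a modified copy rather than editing the input keeps the user's submitted configuration available for comparison with what was actually enforced.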

Note

Plugins are currently an experimental feature. Backward compatibility is not guaranteed between releases.

Tenstorrent

The new update introduces initial support for Tenstorrent's Wormhole accelerators.

Now, if you create SSH fleets with hosts that have N150 or N300 PCIe boards, dstack will automatically detect them and allow you to use such a fleet for running dev environments, tasks, and services.

Dedicated examples for using dstack with Tenstorrent's accelerators will be published soon.

What's changed

Fix client backward compatibility when reapplying runs by @r4victor in dstackai/dstack#2558
Add A3 High example by @r4victor in dstackai/dstack#2559
Document GCP firewall allowing inter-VPC traffic by @r4victor in dstackai/dstack#2563
[CI] Build dstack-{shim,runner} for ARM64 by @un-def in dstackai/dstack#2561
Implement /api/project/{project_name}/fleets/apply by @r4victor in dstackai/dstack#2577
Introduce effective_spec for runs and fleets by @r4victor in dstackai/dstack#2579
Support Nebius tenancies with multiple projects by @jvstme in dstackai/dstack#2575
[UX] Shorter resource syntax for dstack apply, dstack offer, and dstack ps by @peterschmidt85 in dstackai/dstack#2572
Fix missing /fleets/apply for old servers by @r4victor in dstackai/dstack#2582
Updated runner and shim contributing guide by @peterschmidt85 in dstackai/dstack#2534
Mount volumes at /mnt/disks by @r4victor in dstackai/dstack#2584
Use gVNIC for GCP A3 VMs by @r4victor in dstackai/dstack#2585
[Bug] Several issues with vastai provider #142 #2566 by @peterschmidt85 in dstackai/dstack#2567
[Feature] Support Tenstorrent's Wormhole accelerators #2573 by @peterschmidt85 in dstackai/dstack#2574
Implement plugins by @r4victor in dstackai/dstack#2581
[Feature] Support Tenstorrent's Wormhole accelerators #2573 by @peterschmidt85 in dstackai/dstack#2589

Full changelog: dstackai/dstack@0.19.5...0.19.7

Contributors

un-def, r4victor, and 2 other contributors


23 Apr 10:41

r4victor

0.19.5-v1

89d5de7

0.19.5-v1

CLI

Offers

You can now list available offers (hardware configurations) from the configured backends using the CLI—without needing to define a run configuration. Just run dstack offer and specify the resource requirements. The CLI will output available offers, including backend, region, instance type, resources, spot availability, and pricing:

$ dstack offer --gpu H100:1.. --max-offers 10

 #   BACKEND     REGION     INSTANCE TYPE          RESOURCES                                     SPOT  PRICE   
 1   datacrunch  FIN-01     1H100.80S.30V          30xCPU, 120GB, 1xH100 (80GB), 100.0GB (disk)  no    $2.19   
 2   datacrunch  FIN-02     1H100.80S.30V          30xCPU, 120GB, 1xH100 (80GB), 100.0GB (disk)  no    $2.19   
 3   datacrunch  FIN-02     1H100.80S.32V          32xCPU, 185GB, 1xH100 (80GB), 100.0GB (disk)  no    $2.19   
 4   datacrunch  ICE-01     1H100.80S.32V          32xCPU, 185GB, 1xH100 (80GB), 100.0GB (disk)  no    $2.19   
 5   runpod      US-KS-2    NVIDIA H100 PCIe       16xCPU, 251GB, 1xH100 (80GB), 100.0GB (disk)  no    $2.39   
 6   runpod      CA         NVIDIA H100 80GB HBM3  24xCPU, 251GB, 1xH100 (80GB), 100.0GB (disk)  no    $2.69   
 7   nebius      eu-north1  gpu-h100-sxm           16xCPU, 200GB, 1xH100 (80GB), 100.0GB (disk)  no    $2.95   
 8   runpod      AP-JP-1    NVIDIA H100 80GB HBM3  20xCPU, 251GB, 1xH100 (80GB), 100.0GB (disk)  no    $2.99   
 9   runpod      CA-MTL-1   NVIDIA H100 80GB HBM3  28xCPU, 251GB, 1xH100 (80GB), 100.0GB (disk)  no    $2.99   
 10  runpod      CA-MTL-2   NVIDIA H100 80GB HBM3  26xCPU, 125GB, 1xH100 (80GB), 100.0GB (disk)  no    $2.99   
     ...                                                                                                                
 Shown 10 of 99 offers, $127.816 max

Learn more about how the new CLI command works in the reference.
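
The `H100:1..` argument above combines a GPU name with a count range. The sketch below shows one way such a spec could be parsed, assuming `1..` means "one or more", `..8` means "up to eight", and a bare number means an exact count. The `parse_range` and `parse_gpu_spec` helpers are hypothetical and not part of the dstack CLI, and the default of exactly one GPU when no count is given is an assumption of this sketch.

```python
from typing import Optional, Tuple

Range = Tuple[Optional[int], Optional[int]]  # (min, max); None means unbounded


def parse_range(text: str) -> Range:
    """Parse "2", "1..", "..8", or "2..4" into a (min, max) pair."""
    if ".." not in text:
        value = int(text)
        return value, value
    low, high = text.split("..", 1)
    return (int(low) if low else None, int(high) if high else None)


def parse_gpu_spec(spec: str) -> Tuple[str, Range]:
    """Split "NAME:COUNT" into a GPU name and a count range."""
    name, _, count = spec.partition(":")
    # Assumption of this sketch: a missing count means exactly one GPU.
    count_range = parse_range(count) if count else (1, 1)
    return name, count_range


print(parse_gpu_spec("H100:1.."))
```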

Configuration

Resource tags

It's now possible to set custom resource-level tags using the new tags property:

type: dev-environment
ide: vscode
tags:
  my_custom_tag: some_value
  another_tag: another_value_123

The tags property is supported by all configuration types: runs, fleets, volumes, gateways, and profiles. The tags are propagated to the underlying cloud resources on backends that support tags. Currently, these are AWS, Azure, and GCP.

Shell configuration

With the new shell property you can specify the shell used to run commands (or init for dev environments):

type: task
image: ubuntu

shell: bash
commands:
  # now we can use Bash features, e.g., arrays:
  - words=(dstack is)
  - words+=(awesome)
  - echo ${words[@]}  # prints "dstack is awesome"

GCP

A3 High and A3 Edge

dstack now automatically sets up GCP A3 High and A3 Edge instances with GPUDirect-TCPX optimized NCCL communication.

An example of how to provision an A3 High cluster and run NCCL tests on it using dstack is coming soon!

Volumes

Total cost

The UI now shows each volume's total cost and termination date alongside its price. Previously, only the price was shown.

What's changed

Update Axolotl Examples by @Bihan in dstackai/dstack#2502
Update TGI Example with Llama 4 Scout by @Bihan in dstackai/dstack#2529
Implement custom per-resource tags by @r4victor in dstackai/dstack#2533
Add try_advisory_lock_ctx by @r4victor in dstackai/dstack#2537
[chore]: Drop is_core_model_instance by @jvstme in dstackai/dstack#2536
[runner] Rework env variables exporting by @un-def in dstackai/dstack#2535
Fix ruff version discrepancy by @jvstme in dstackai/dstack#2539
Add volume cost by @r4victor in dstackai/dstack#2541
Add shell run property by @un-def in dstackai/dstack#2542
[Feature]: Support dstack offer #2142 by @peterschmidt85 in dstackai/dstack#2540
Update Llama4 Readme with Axolotl fine-tuning example by @Bihan in dstackai/dstack#2545
[Docs] Document dstack offer by @peterschmidt85 in dstackai/dstack#2546
[Docs]: Replace vRAM -> VRAM by @jvstme in dstackai/dstack#2548
Include statics as artifacts in both wheel and sdist by @r4victor in dstackai/dstack#2544
Support A3 High/Edge GCP clusters with GPUDirect-TCPX by @r4victor in dstackai/dstack#2549

Full changelog: dstackai/dstack@0.19.4...0.19.5

Contributors

un-def, Bihan, and 3 other contributors


17 Apr 10:36

r4victor

0.19.4-v1

89d5de7

0.19.4-v1

Rate limits for services

You can now configure rate limits for your services running behind a gateway.

type: service
image: my-app:latest
port: 80

rate_limits:
# For /api/auth/* - 1 request per second, no bursts
- prefix: /api/auth/
  rps: 1
# For other URLs - 4 requests per second + bursts of up to 9 requests
- rps: 4
  burst: 9
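
To build intuition for how rps and burst interact, here is a minimal token-bucket sketch. It is not the gateway's actual implementation. It assumes that a rule admits one request plus up to `burst` extra requests at once, that a rule without a prefix applies to every path, and that the longest matching prefix wins. `TokenBucket` and `select_rule` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TokenBucket:
    """Refills at `rps` tokens per second and holds at most 1 + burst tokens."""

    rps: float
    burst: int = 0
    tokens: float = field(init=False, default=0.0)
    last_seen: float = field(init=False, default=0.0)

    def __post_init__(self) -> None:
        self.tokens = float(1 + self.burst)  # the bucket starts full

    def allow(self, now: float) -> bool:
        """Return True if a request arriving at time `now` (seconds) is accepted."""
        capacity = 1 + self.burst
        refill = (now - self.last_seen) * self.rps
        self.tokens = min(capacity, self.tokens + refill)
        self.last_seen = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def select_rule(rules: List[Dict], path: str) -> Optional[Dict]:
    """Pick the rule with the longest matching prefix (assumed default: "/")."""
    matching = [r for r in rules if path.startswith(r.get("prefix", "/"))]
    return max(matching, key=lambda r: len(r.get("prefix", "/")), default=None)


rules = [{"prefix": "/api/auth/", "rps": 1}, {"rps": 4, "burst": 9}]
bucket = TokenBucket(rps=4, burst=9)
accepted = sum(bucket.allow(now=0.0) for _ in range(20))
print(accepted, select_rule(rules, "/api/auth/login"))
```

Under these assumptions, 20 simultaneous requests against the second rule yield 10 accepted requests, after which one further request is admitted every 0.25 seconds.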

Examples: TensorRT-LLM and Llama 4

We added a new example on TensorRT-LLM that shows how to deploy both DeepSeek R1 and its distilled version using TensorRT-LLM and dstack.

The Llama example was updated to demonstrate the deployment of Llama 4 Scout using dstack.

Improved contributing experience

We continue to make contributing to dstack easier and to improve the development experience. Since the last release, we moved from pip to uv in CI and dev pipelines. Dependency installation time went from ~70 seconds to less than 10 seconds. The Development guide was updated to show how to set up a dstack development environment with uv. The CI build pipeline triggered on pull requests was optimized from 9 minutes to 4 minutes.

We also documented uv as one of the recommended installation options for dstack.

What's Changed

[Landing] Refactoring (WIP) by @peterschmidt85 in dstackai/dstack#2495
Fix CloudWatchLogStorage with sparse logs by @un-def in dstackai/dstack#2501
Migrate to uv by @colinjc in dstackai/dstack#2455
Fix docs build with uv by @r4victor in dstackai/dstack#2504
[Example] Update Llama 4 Examples by @Bihan in dstackai/dstack#2508
Move to uv in dstack-server Docker image by @r4victor in dstackai/dstack#2509
Fix dstack dependency for gateway by @r4victor in dstackai/dstack#2511
[Docs] Add uv to Installation; Minor improvements by @peterschmidt85 in dstackai/dstack#2510
Validate usernames by @r4victor in dstackai/dstack#2514
Run pytest in parallel with pytest-xdist by @r4victor in dstackai/dstack#2515
Add Llama4 AMD example by @Bihan in dstackai/dstack#2513
Use exponentially increasing retry delays for pending runs by @r4victor in dstackai/dstack#2519
Speed up frontend CI by @r4victor in dstackai/dstack#2520
Service rate limits by @jvstme in dstackai/dstack#2517
Set no-guess-dev for dev package versions by @r4victor in dstackai/dstack#2521
Detect dstack version from file instead of git by @r4victor in dstackai/dstack#2524
Add TensorRT-LLM Example by @Bihan in dstackai/dstack#2444
Fix Nginx upstream name conflicts by @jvstme in dstackai/dstack#2526
Fix detaching from dstack attach by @jvstme in dstackai/dstack#2528

New Contributors

@colinjc made their first contribution in dstackai/dstack#2455

Full Changelog: dstackai/dstack@0.19.3...0.19.4

Contributors

un-def, Bihan, and 4 other contributors


10 Apr 10:49

r4victor

0.19.3-v1

89d5de7

0.19.3-v1

Optimized networking for GCP H100 clusters

dstack now automatically sets up GCP A3 Mega instances with GPUDirect-TCPXO optimized NCCL communication to take advantage of the 1800Gbps maximum network bandwidth. Here are NCCL test results from an A3 Mega cluster provisioned with dstack:

✗ dstack apply -f examples/misc/a3mega-clusters/nccl-tests.dstack.yml 

nccl-tests provisioning completed (running)
nThread 1 nGpus 1 minBytes 8388608 maxBytes 8589934592 step: 2(factor) warmup iters: 5 iters: 200 agg iters: 1 validation: 0 graph: 0

                                                             out-of-place                       in-place          
      size         count      type   redop    root     time   algbw   busbw #wrong     time   algbw   busbw #wrong
       (B)    (elements)                               (us)  (GB/s)  (GB/s)            (us)  (GB/s)  (GB/s)       
     8388608        131072     float    none      -1    166.6   50.34   47.19    N/A    164.1   51.11   47.92    N/A
    16777216        262144     float    none      -1    204.6   82.01   76.89    N/A    203.8   82.30   77.16    N/A
    33554432        524288     float    none      -1    284.0  118.17  110.78    N/A    281.7  119.12  111.67    N/A
    67108864       1048576     float    none      -1    447.4  150.00  140.62    N/A    443.5  151.31  141.86    N/A
   134217728       2097152     float    none      -1    808.3  166.05  155.67    N/A    801.9  167.38  156.92    N/A
   268435456       4194304     float    none      -1   1522.1  176.36  165.34    N/A   1518.7  176.76  165.71    N/A
   536870912       8388608     float    none      -1   2892.3  185.62  174.02    N/A   2894.4  185.49  173.89    N/A
  1073741824      16777216     float    none      -1   5532.7  194.07  181.94    N/A   5530.7  194.14  182.01    N/A
  2147483648      33554432     float    none      -1    10863  197.69  185.34    N/A    10837  198.17  185.78    N/A
  4294967296      67108864     float    none      -1    21481  199.94  187.45    N/A    21466  200.08  187.58    N/A
  8589934592     134217728     float    none      -1    42713  201.11  188.54    N/A    42701  201.16  188.59    N/A
Out of bounds values : 0 OK
Avg bus bandwidth    : 146.948 

Done

For more information on how to provision and use A3 Mega clusters with GPUDirect-TCPXO, see the A3 Mega example.
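
The bandwidth columns in the table above can be cross-checked with a little arithmetic. The sketch below recomputes the last row, assuming the nccl-tests convention that algorithm bandwidth is size divided by time and that bus bandwidth for all-gather-style collectives is algorithm bandwidth scaled by (n - 1) / n. The rank count of 16 is not stated in these notes; it is inferred from the ratio 188.54 / 201.11, which is close to 15/16.

```python
def algbw_gbps(size_bytes: int, time_us: float) -> float:
    """Algorithm bandwidth in GB/s: bytes moved divided by elapsed time."""
    return size_bytes / (time_us * 1e-6) / 1e9


def allgather_busbw_gbps(algbw: float, ranks: int) -> float:
    """Bus bandwidth under the (n - 1) / n scaling used for all-gather."""
    return algbw * (ranks - 1) / ranks


# Last out-of-place row of the table: 8589934592 bytes in 42713 us.
algbw = algbw_gbps(8589934592, 42713)
busbw = allgather_busbw_gbps(algbw, ranks=16)  # 16 ranks is an inference
print(f"algbw={algbw:.2f} GB/s busbw={busbw:.2f} GB/s")
```

Both values agree with the reported 201.11 and 188.54 GB/s to two decimal places.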

H200 and B200 support on DataCrunch

You can now provision H200 and B200 instances on DataCrunch. DataCrunch is the first dstack backend to support B200:

✗ dstack apply --gpu B200
 Project              main                                   
 User                 admin                                  
 Configuration        .dstack.yml                            
 Type                 dev-environment                        
 Resources            1..xCPU, 2GB.., 1xB200, 100GB.. (disk) 
 Max price            -                                      
 Max duration         -                                      
 Inactivity duration  -                                      
 Spot policy          auto                                   
 Retry policy         -                                      
 Creation policy      reuse-or-create                        
 Idle duration        5m                                     
 Reservation          -                                      

 #  BACKEND     REGION  INSTANCE   RESOURCES                                      SPOT  PRICE                
 1  datacrunch  FIN-03  1B200.31V  31xCPU, 250GB, 1xB200 (180GB), 100.0GB (disk)  yes   $1.3                 
 2  datacrunch  FIN-03  1B200.31V  31xCPU, 250GB, 1xB200 (180GB), 100.0GB (disk)  no    $4.49
 3  datacrunch  FIN-01  1B200.31V  31xCPU, 250GB, 1xB200 (180GB), 100.0GB (disk)  yes   $1.3   not available 
    ...                                                                                                      
 Shown 3 of 8 offers, $4.49 max

Submit a new run? [y/n]:

CUDO improvements

The CUDO backend is updated to support H100, A100, A40 and all other GPUs currently offered by CUDO.

`fleets` configuration property

With the new fleets property and the --fleet option of dstack apply, it's now possible to restrict the set of fleets considered for reuse:

type: task

fleets: [my-fleet-1, my-fleet-2]

dstack apply --fleet my-fleet-1 --fleet my-fleet-2

What's Changed

[Blog] Built-in UI for monitoring basic GPU metrics by @peterschmidt85 in dstackai/dstack#2470
Fix Nebius project discovery by @jvstme in dstackai/dstack#2473
Support A3 Mega GCP clusters with GPUDirect-TCPXO by @r4victor in dstackai/dstack#2469
Fix Nebius private networks with non-default CIDR by @jvstme in dstackai/dstack#2475
Add region for Lambda by @HSaddiq in dstackai/dstack#2471
Fix relative date in CLI for weeks and months by @jvstme in dstackai/dstack#2481
Fix terminating TensorDock instances by @jvstme in dstackai/dstack#2480
Use all Lambda regions by default by @jvstme in dstackai/dstack#2478
Allow mounting volumes into /workflow by @r4victor in dstackai/dstack#2483
Improve Datacrunch backend by @r4victor in dstackai/dstack#2487
UI improvements by @olgenn in dstackai/dstack#2489
Add fleets property to run configurations and CLI by @un-def in dstackai/dstack#2488
Fix GitIgnore by @un-def in dstackai/dstack#2491
Remove hardcoded cudo regions by @r4victor in dstackai/dstack#2493
Optimize GCP list usable subnets across regions by @r4victor in dstackai/dstack#2494
Make regions filtering case insensitive by @r4victor in dstackai/dstack#2499

New Contributors

@HSaddiq made their first contribution in dstackai/dstack#2471

Full Changelog: dstackai/dstack@0.19.2...0.19.3

Contributors

un-def, olgenn, and 4 other contributors


03 Apr 14:04

un-def

0.19.2-v1

89d5de7

0.19.2-v1

Nebius

This update introduces an integration with Nebius, a cloud provider offering top-tier NVIDIA GPUs at competitive prices.

$ dstack apply
 #  BACKEND  REGION     RESOURCES                        SPOT  PRICE
 1  nebius   eu-north1  8xCPU, 32GB, 1xL40S (48GB)       no    $1.5484
 2  nebius   eu-north1  16xCPU, 200GB, 1xH100 (80GB)     no    $2.95
 3  nebius   eu-north1  16xCPU, 200GB, 1xH200 (141GB)    no    $3.5
 4  nebius   eu-north1  64xCPU, 384GB, 2xL40S (48GB)     no    $4.5688
 5  nebius   eu-north1  128xCPU, 768GB, 4xL40S (48GB)    no    $9.1376
 6  nebius   eu-north1  128xCPU, 1600GB, 8xH100 (80GB)   no    $23.6
 7  nebius   eu-north1  128xCPU, 1600GB, 8xH200 (141GB)  no    $28

The new nebius backend supports CPU and GPU instances, fleets, distributed tasks, and more. Support for network volumes and enhanced inter-node connectivity is coming in future releases. See the docs for instructions on configuring Nebius in your dstack project.

Metrics

This release brings a long-awaited feature: the Metrics page in the UI.

In addition, the dstack stats command was renamed to dstack metrics and updated. Previously, the maximum CPU utilization value depended on the number of CPUs (for example, 400% for a 4-core CPU); now it is normalized to 100%.

$ dstack metrics nccl-tests
 NAME        CPU  MEMORY            GPU
 nccl-tests  81%  2754MB/1638400MB  #0 100740MB/144384MB 100% Util
                                    #1 100740MB/144384MB 100% Util
                                    #2 100740MB/144384MB 99% Util
                                    #3 100740MB/144384MB 99% Util
                                    #4 100740MB/144384MB 99% Util
                                    #5 100740MB/144384MB 99% Util
                                    #6 100740MB/144384MB 99% Util
                                    #7 100740MB/144384MB 100% Util

What's Changed

[Docs] Update the home page to include an updated diagram by @peterschmidt85 in dstackai/dstack#2450
Update ChatCompletionsChunk for Deepseek-R1 response by @Bihan in dstackai/dstack#2452
[Blog] Minor blog refactoring by @peterschmidt85 in dstackai/dstack#2457
[Blog] Accessing dev environments with Cursor by @peterschmidt85 in dstackai/dstack#2456
Move cachetools to base deps by @r4victor in dstackai/dstack#2459
[Blog] Prometheus by @peterschmidt85 in dstackai/dstack#2458
Add optional bearer auth to metrics endpoint by @un-def in dstackai/dstack#2460
[CLI] Rename stats command to metrics by @un-def in dstackai/dstack#2462
Add SgLang Example by @Bihan in dstackai/dstack#2461
Update NIM example with DeepSeek-R1-Distill by @Bihan in dstackai/dstack#2454
[Blog] Supporting MPI and NCCL/RCCL tests by @peterschmidt85 in dstackai/dstack#2465
Add Nebius backend by @jvstme in dstackai/dstack#2463
[Feature] Show Run metrics on the UI by @olgenn in dstackai/dstack#2446
[CLI] Divide CPU util by a number of vCPUs in dstack metrics by @un-def in dstackai/dstack#2466

Full Changelog: dstackai/dstack@0.19.1...0.19.2

Contributors

un-def, olgenn, and 4 other contributors


26 Mar 12:36

un-def

0.19.1-v1

89d5de7

0.19.1-v1

Metrics

With this update, we've added more metrics that you can export to Prometheus. The new metrics allow tracking job CPU and system memory utilization, user and project usage stats, success/error rate, and more.

Runs

Name	Type	Description	Examples
`dstack_run_count_total`	counter	The total number of runs	`537`
`dstack_run_count_terminated_total`	counter	The number of terminated runs	`118`
`dstack_run_count_failed_total`	counter	The number of failed runs	`27`
`dstack_run_count_done_total`	counter	The number of successful runs	`218`

Run jobs

Name	Type	Description	Examples
`dstack_job_cpu_count`	gauge	Job CPU count	`32.0`
`dstack_job_cpu_time_seconds_total`	counter	Total CPU time consumed by the job, seconds	`11.727975`
`dstack_job_memory_total_bytes`	gauge	Total memory allocated for the job, bytes	`4009754624.0`
`dstack_job_memory_usage_bytes`	gauge	Memory used by the job (including cache), bytes	`339017728.0`
`dstack_job_memory_working_set_bytes`	gauge	Memory used by the job (not including cache), bytes	`147251200.0`

For more details on metrics, see the Metrics guide.
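
As an example of what the job metrics make possible, the sketch below derives CPU utilization and memory pressure from raw samples. The helper names are hypothetical. The memory figures are the example values from the table above, while the second CPU-time sample and the 10-second interval are made up for illustration. In practice you would typically compute such ratios with Prometheus queries rather than in application code.

```python
def cpu_utilization(cpu_time_start: float, cpu_time_end: float,
                    wall_seconds: float, cpu_count: float) -> float:
    """Fraction of available CPU time used between two counter samples.

    cpu_time_* are samples of dstack_job_cpu_time_seconds_total and
    cpu_count is the value of dstack_job_cpu_count.
    """
    if wall_seconds <= 0 or cpu_count <= 0:
        raise ValueError("wall_seconds and cpu_count must be positive")
    return (cpu_time_end - cpu_time_start) / (wall_seconds * cpu_count)


def memory_fraction(working_set_bytes: float, total_bytes: float) -> float:
    """Share of allocated memory in use, excluding cache."""
    return working_set_bytes / total_bytes


# Hypothetical pair of counter samples taken 10 seconds apart on 32 CPUs.
cpu = cpu_utilization(11.727975, 27.727975, wall_seconds=10.0, cpu_count=32.0)
# Example values of the working-set and total-memory gauges from the table.
mem = memory_fraction(147251200.0, 4009754624.0)
print(f"cpu={cpu:.1%} mem={mem:.1%}")
```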

Major bugfixes

Fixed a bug introduced in 0.19.0 where the working directory in the container was incorrectly set by default to / instead of /workflow.

What's changed

Fix trying fleet instance offers by @jvstme in dstackai/dstack#2443
Add job system metrics, run metrics by @un-def in dstackai/dstack#2445
Fix default working dir in containers by @jvstme in dstackai/dstack#2449
[Examples] Update nccl-tests by @un-def in dstackai/dstack#2451

Full changelog: dstackai/dstack@0.19.0...0.19.1

Contributors

un-def and jvstme


20 Mar 10:52

r4victor

0.19.0-v1

89d5de7

0.19.0-v1

Simplified backend integration

To provide the best multi-cloud experience and GPU availability, dstack integrates with many cloud GPU providers, including AWS, Azure, GCP, RunPod, Lambda, Vultr, and others. As we'd like to see even more GPU providers supported by dstack, this release comes with a major internal refactoring aimed at simplifying the process of adding new integrations. See the Backend integration guide for more details. Join our Discord if you have any questions about the integration process.

MPI workloads and NCCL tests

dstack now configures internode SSH connectivity for distributed tasks. You can log in to any node from any node via SSH with a simple ssh <node_ip> command. The out-of-the-box SSH connectivity also allows running mpirun. See the NCCL Tests example.

Cost and usage metrics

In addition to DCGM metrics, dstack now exports a set of Prometheus metrics for cost and usage tracking. Here's how it may look in the Grafana dashboard:

See the documentation for a full list of metrics and labels.

Cursor IDE support

dstack can now launch Cursor dev environments. Just specify ide: cursor in the run configuration:

type: dev-environment
ide: cursor

Deprecations

The Python API methods get_plan(), exec_plan(), and submit() are deprecated in favor of get_run_plan(), apply_plan(), and apply_configuration(). The deprecated methods had clumsy signatures with many top-level parameters. The new signatures align better with the CLI and HTTP API.

Breaking changes

The 0.19.0 release drops several previously deprecated or undocumented features. There are no other significant breaking changes. The 0.19.0 server continues to support 0.18.x CLI versions. But the 0.19.0 CLI does not work with older 0.18.x servers, so you should update the server first or the server and the clients simultaneously.

Drop the dstack run CLI command.
Drop the --attach mode for the dstack logs CLI command.
Drop Pools functionality:
- The dstack pool CLI commands.
- /api/project/{project_name}/runs/get_offers, /api/project/{project_name}/runs/create_instance, /api/pools/list_instances, /api/project/{project_name}/pool/* API endpoints.
- pool_name and instance_name parameters in profiles and run configurations.
Remove retry_policy from profiles.
Remove termination_idle_time and termination_policy from profiles and fleet configurations.
Drop RUN_NAME and REPO_ID run environment variables.
Drop the /api/backends/config_values endpoint used for interactive configuration.
The API accepts and returns azure_config["regions"] instead of azure_config["locations"] (unified with server/config.yml).

What's Changed

Fix gateways with a previously used IP address by @jvstme in dstackai/dstack#2388
Simplify backend configurators and models by @r4victor in dstackai/dstack#2389
Store BackendType as string instead of enum in the DB by @r4victor in dstackai/dstack#2393
Introduce ComputeWith classes to detect compute features by @r4victor in dstackai/dstack#2392
Move backend/compute configs from config.py to models.py by @r4victor in dstackai/dstack#2395
Provide default run_job implementation for VM backends by @r4victor in dstackai/dstack#2396
Configure inter-node SSH on multi-node tasks by @un-def in dstackai/dstack#2394
[Blog] Using SSH fleets with TensorWave's private AMD cloud by @peterschmidt85 in dstackai/dstack#2391
Add script to generate boilerplate code for new backend by @r4victor in dstackai/dstack#2397
Add datacenter-gpu-manager-4-proprietary to CUDA images by @un-def in dstackai/dstack#2399
Drop pools by @r4victor in dstackai/dstack#2401
Transition high-level Python runs API to new methods by @r4victor in dstackai/dstack#2403
Drop dstack run by @r4victor in dstackai/dstack#2404
Drop dstack logs --attach by @r4victor in dstackai/dstack#2405
Remove retry_policy from profiles by @r4victor in dstackai/dstack#2406
Remove termination_idle_time and termination_policy by @r4victor in dstackai/dstack#2407
Clean up models backward compatibility code by @r4victor in dstackai/dstack#2408
Restore removed models fields for compatibility with 0.18 clients by @r4victor in dstackai/dstack#2414
Clean up legacy repo fields by @jvstme in dstackai/dstack#2411
Switch AWS gateways from t2.micro to t3.micro by @r4victor in dstackai/dstack#2416
Remove old client excludes by @r4victor in dstackai/dstack#2417
Use new JobTerminationReason values by @r4victor in dstackai/dstack#2418
Drop RUN_NAME and REPO_ID env vars by @r4victor in dstackai/dstack#2419
Drop irrelevant Nebius backend implementation by @jvstme in dstackai/dstack#2421
[Feature]: Support the cursor IDE #2412 by @peterschmidt85 in dstackai/dstack#2413
Simplify implementation of new backends #2372 by @olgenn in dstackai/dstack#2423
Support multiple domains with Entra login by @r4victor in dstackai/dstack#2424
Support setting project members by email by @r4victor in dstackai/dstack#2429
Fix json schema reference and invalid properties errors by @r4victor in dstackai/dstack#2433
[Blog]: DeepSeek R1 inference performance: MI300X vs. H200 by @peterschmidt85 in dstackai/dstack#2425
Add new metrics by @un-def in dstackai/dstack#2434
Add instance and job cost/usage Prometheus metrics by @un-def in dstackai/dstack#2432
[Docker] Add dstackai/efa image by @un-def in dstackai/dstack#2422
Restore fleet termination_policy for 0.18 backward compatibility by @r4victor in dstackai/dstack#2436
[Bug]: Search over users doesn't work by @olgenn in dstackai/dstack#2439
[Feature]: Support activating/deactivating users via the UI by @olgenn in dstackai/dstack#2440
[Feature]: Display Assigned Gateway Information on Run Pages by @olgenn in dstackai/dstack#2438
[Docs]: Update the Metrics guide by @peterschmidt85 in dstackai/dstack#2441
[Examples] Update nccl-tests by @un-def in dstackai/dstack#2415

Full Changelog: dstackai/dstack@0.18.44...0.19.0

Contributors

un-def, olgenn, and 3 other contributors


05 Mar 14:08

un-def

0.18.44-v1

233bdeb

0.18.44-v1

Single Sign-On via Microsoft Entra ID

dstack Enterprise now supports Single Sign-On via Microsoft Entra ID (formerly Azure AD). When Entra ID integration is configured, the dstack login page will display the Sign in with Microsoft Entra ID button. Users can log in to dstack using their Entra ID account without entering any dstack-specific credentials. See the Entra ID integration guide for more information.

GPU utilization policy

To avoid wasting resources, you can now specify a minimum required GPU utilization for the run. If any GPU has utilization below the threshold in all samples within a time window, the run is terminated.

type: task

utilization_policy:
  min_gpu_utilization: 30
  time_window: 30m

resources:
  gpu: nvidia:8:24GB

In this example, if any of the 8 GPUs has utilization below 30% in all samples during the last 30 minutes, the run will be terminated.
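
The rule described above can be expressed as a short predicate. This is a sketch of the stated condition only, not dstack's implementation. The `should_terminate` function and its input layout (GPU index mapped to the utilization percentages sampled within the time window) are hypothetical, and treating a GPU with no samples as not triggering termination is an assumption.

```python
from typing import Dict, Sequence


def should_terminate(samples_by_gpu: Dict[int, Sequence[float]],
                     min_gpu_utilization: float) -> bool:
    """True if any GPU stayed below the threshold in every sample of the window."""
    return any(
        # Assumption: a GPU with no samples does not trigger termination.
        len(samples) > 0 and all(s < min_gpu_utilization for s in samples)
        for samples in samples_by_gpu.values()
    )


busy = {0: [90.0, 95.0], 1: [10.0, 45.0]}  # GPU 1 crossed 30% once
idle = {0: [90.0, 95.0], 1: [10.0, 5.0]}   # GPU 1 never reached 30%
print(should_terminate(busy, 30), should_terminate(idle, 30))
```

Note that a single sample at or above the threshold is enough to keep a GPU from triggering termination, which is why brief dips in utilization do not stop a run.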

DCGM metrics

dstack can now collect and export NVIDIA DCGM metrics from running jobs on supported backends (AWS, Azure, GCP, OCI) and SSH fleets.

Metrics are disabled by default. See the documentation for how to enable and scrape them.

RunPod Community Cloud

In addition to Secure Cloud, dstack will now use Community Cloud offers in the runpod backend. Community Cloud offers are usually cheaper and can be identified by a two-letter region code.

$ dstack apply -f .dstack.yml -b runpod
 #  BACKEND  REGION    INSTANCE               SPOT  PRICE
 1  runpod   CA        NVIDIA A100 80GB PCIe  yes   $0.6
 2  runpod   CA-MTL-3  NVIDIA A100 80GB PCIe  yes   $0.82

It is possible to opt out of using Community Cloud in the backend settings.
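
If you need to tell the two kinds of offers apart in your own tooling, the two-letter convention mentioned above can be checked with a one-line heuristic. The `is_community_cloud_region` helper is hypothetical and relies only on the region-code shape described in this note.

```python
def is_community_cloud_region(region: str) -> bool:
    """Heuristic: Community Cloud regions are two-letter codes such as "CA"."""
    return len(region) == 2 and region.isalpha()


for region in ("CA", "CA-MTL-3"):
    print(region, is_community_cloud_region(region))
```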

Note

If you've previously configured the runpod backend via the dstack UI, your backend settings will likely contain a fixed set of regions. Previous dstack versions used to add it automatically. You can remove the regions property to allow all regions, including two-letter Community Cloud regions.

What's Changed

Show inactivity_duration in run plan in CLI by @jvstme in dstackai/dstack#2366
Minor fixes noticed by aider by @r4victor in dstackai/dstack#2367
Reexport DCGM metrics from instances by @un-def in dstackai/dstack#2364
[Internal]: Update backend contributing docs by @jvstme in dstackai/dstack#2369
Allow global admins to edit user emails via the UI by @olgenn in dstackai/dstack#2377
Support sign-in via Microsoft EntraID for dstack Enterprise #251 by @olgenn in dstackai/dstack#2376
Add utilization_policy by @un-def in dstackai/dstack#2375
Support RunPod Community Cloud by @jvstme in dstackai/dstack#2378
Add ORDER BY when selecting multiple rows with FOR UPDATE by @r4victor in dstackai/dstack#2379
Allow global admins to edit user emails via the UI by @olgenn in dstackai/dstack#2381
Support inactivity_duration in-place update by @jvstme in dstackai/dstack#2380
Improve error message if pulling fails by @jvstme in dstackai/dstack#2382
Fix utilization_policy in profiles by @un-def in dstackai/dstack#2385
Set lower and upper limits of utilization_policy.time_window by @un-def in dstackai/dstack#2386
Try more offers when starting a job by @jvstme in dstackai/dstack#2387

Full Changelog: dstackai/dstack@0.18.43...0.18.44

Contributors

un-def, olgenn, and 2 other contributors


26 Feb 11:34

r4victor

0.18.43-v1

fa33161

0.18.43-v1

CLI autocompletion

The dstack CLI now supports shell autocompletion for bash and zsh. It suggests completions for subcommands:

✗ dstack s
server  -- Start a server
stats   -- Show run stats
stop    -- Stop a run

and dynamic completions for resource names:

✗ dstack logs m
mighty-chicken-1  mighty-crab-1  my-dev  --

To set up the CLI autocompletion for your shell, follow the Installation guide.

`max_duration` set to `off` by default

The max_duration parameter that controls how long a run is allowed to run before stopping automatically is now set to off by default for all run configuration types. This means that dstack won't stop runs automatically unless max_duration is specified explicitly.

Previously, the max_duration defaults were 72h for tasks, 6h for dev environments, and off for services. This led to unintended run terminations and confused users who were unaware of max_duration. The new default makes max_duration opt-in and thus predictable.

If you relied on the previous max_duration defaults, ensure you've added max_duration to your run configurations.

GCP Logging for run logs

The dstack server requires storing run logs externally for multi-replica server deployments. Previously, the only supported external storage was AWS CloudWatch, which limited production server deployments to AWS. The dstack server now supports GCP Logging for storing run logs. See the Server deployment guide for more information.

Custom IAM instance profile for AWS

The AWS backend config now has a new iam_instance_profile parameter that specifies the IAM instance profile to associate with provisioned EC2 instances. You can also specify the IAM role name for roles created via the AWS console, as AWS automatically creates an instance profile and gives it the same name as the role:

projects:
- name: main
  backends:
  - type: aws
    iam_instance_profile: dstack-test-role
    creds:
      type: default

This can be used to access AWS resources from runs without passing credentials explicitly.

Oracle Cloud spot instances

The oci backend can now provision interruptible spot instances, providing more cost-effective GPUs for workloads that can recover from interruptions.

> dstack apply --gpu 1.. --spot -b oci
 #  BACKEND  REGION          INSTANCE   RESOURCES                                    SPOT  PRICE     
 1  oci      eu-frankfurt-1  VM.GPU2.1  24xCPU, 72GB, 1xP100 (16GB), 50.0GB (disk)   yes   $0.6375   
 2  oci      eu-frankfurt-1  VM.GPU3.1  12xCPU, 90GB, 1xV100 (16GB), 50.0GB (disk)   yes   $1.475    
 3  oci      eu-frankfurt-1  VM.GPU3.2  24xCPU, 180GB, 2xV100 (16GB), 50.0GB (disk)  yes   $2.95

Breaking changes

Dropped support for python: 3.8 in run configurations.
Set max_duration to off by default for all run configuration types.

What's Changed

Replace pagination with lazy loading in dstack UI by @olgenn in dstackai/dstack#2309
Dynamic CLI completion by @solovyevt in dstackai/dstack#2285
Remove excessive project_id check for GCP by @r4victor in dstackai/dstack#2312
[Docs] GPU blocks and proxy jump blog post (WIP) by @peterschmidt85 in dstackai/dstack#2307
[Docs] Add blocks description to Concepts/Fleets by @un-def in dstackai/dstack#2308
Replace pagination with lazy loading on Fleet list by @olgenn in dstackai/dstack#2320
Improve GCP creds validation by @r4victor in dstackai/dstack#2322
[UI]: Fix job details for multi-job runs by @jvstme in dstackai/dstack#2321
Fix instance filtering by backend to use base backend by @r4victor in dstackai/dstack#2324
[Docs]: Fix inactivity duration blog post by @jvstme in dstackai/dstack#2327
Fix CLI instance status for instances with blocks by @jvstme in dstackai/dstack#2332
Partially fixes openapi spec by @haringsrob in dstackai/dstack#2330
[Bug]: UI does not show logs of distributed tasks and replicated services by @olgenn in dstackai/dstack#2334
[Feature]: Replace pagination with lazy loading in Instances list by @olgenn in dstackai/dstack#2335
[Feature]: Replace pagination with lazy loading in volume list by @olgenn in dstackai/dstack#2336
[Bug]: Finished jobs included in run price by @olgenn in dstackai/dstack#2338
Fix DSTACK_GPUS_PER_NODE|DSTACK_GPUS_NUM when blocks are used by @un-def in dstackai/dstack#2333
Support storing run logs using GCP Logging by @r4victor in dstackai/dstack#2340
Support OCI spot instances by @jvstme in dstackai/dstack#2337
[Feature]: Replace pagination with lazy loading in models list by @olgenn in dstackai/dstack#2351
[UI] Remember filter settings in local storage by @olgenn in dstackai/dstack#2352
[Internal]: Minor tweaks in packer docs and CI by @jvstme in dstackai/dstack#2356
Use unique names for backend resources by @r4victor in dstackai/dstack#2350
Set max_duration to off by default for all run configurations by @r4victor in dstackai/dstack#2357
Print message on dstack attach exit by @r4victor in dstackai/dstack#2358
Forbid python: 3.8 in run configurations by @jvstme in dstackai/dstack#2354
Fix Fabric Manager in AWS/GCP/Azure/OCI OS images by @jvstme in dstackai/dstack#2355
Install DCGM Exporter on dstack-built OS images by @un-def in dstackai/dstack#2360
Fix volume detachment for runs started before 0.18.41 by @r4victor in dstackai/dstack#2362
Increase Lambda provisioning timeout and refactor by @jvstme in dstackai/dstack#2353
Bump default OS image version by @jvstme in dstackai/dstack#2363
Support iam_instance_profile for AWS by @r4victor in dstackai/dstack#2365

New Contributors

@haringsrob made their first contribution in dstackai/dstack#2330

Full Changelog: dstackai/dstack@0.18.42...0.18.43

Contributors

haringsrob, un-def, and 5 other contributors


17 Feb 10:16

r4victor

0.18.42-v1

fa33161

0.18.42-v1

Volume attachments

It's now possible to see volume attachments when listing volumes. The dstack volume -v command shows which fleets the volumes are attached to in the ATTACHED column:

✗ dstack volume -v
 NAME             BACKEND  REGION                       STATUS  ATTACHED  CREATED      ERROR 
 my-gcp-volume-1  gcp      europe-west4                 active  my-dev    1 weeks ago        
                           (europe-west4-c)                                                  
 my-aws-volume-1  aws      eu-west-1 (eu-west-1a)       active  -         3 days ago

This can help you decide if you should use an existing volume for a run or create a new volume if all volumes are occupied.

You can also check which volumes are currently attached and which are not via the API:

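import os
import requests

# Server URL, user token, and project name are read from the environment.
url = os.environ["DSTACK_URL"]
token = os.environ["DSTACK_TOKEN"]
project = os.environ["DSTACK_PROJECT"]

print("Getting volumes...")
resp = requests.post(
    url=f"{url}/api/project/{project}/volumes/list",
    headers={"Authorization": f"Bearer {token}"},
)
# Fail early on HTTP errors (e.g. a bad token) instead of parsing an error body.
resp.raise_for_status()
volumes = resp.json()

print("Checking volumes attachments...")
for volume in volumes:
    # A volume is attached if its "attachments" list is non-empty.
    is_attached = len(volume["attachments"]) > 0
    print(f"Volume {volume['name']} attached: {is_attached}")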

✗ python check_attachments.py
Getting volumes...
Checking volumes attachments...
Volume my-gcp-volume-1 attached: True
Volume my-aws-volume-1 attached: False
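
To act on the point above about choosing between an existing volume and a new one, the same response can be split into attached and free volumes. The following is a minimal sketch that relies only on the name and attachments fields used in the script above. The helper name split_by_attachment is hypothetical. The release notes do not show what an attachment entry contains, so the sample data uses an empty placeholder dict.

```python
def split_by_attachment(volumes: list[dict]) -> tuple[list[str], list[str]]:
    """Split volume names into (attached, free) lists.

    Hypothetical helper: `volumes` is the parsed JSON list returned by
    /api/project/{project}/volumes/list. Only the "name" and
    "attachments" fields are used.
    """
    attached: list[str] = []
    free: list[str] = []
    for volume in volumes:
        # An empty or missing "attachments" list means the volume is free.
        target = attached if volume.get("attachments") else free
        target.append(volume["name"])
    return attached, free


if __name__ == "__main__":
    # Sample data mirroring the CLI output above; the attachment entry
    # is a placeholder because its real shape is not shown here.
    sample = [
        {"name": "my-gcp-volume-1", "attachments": [{}]},
        {"name": "my-aws-volume-1", "attachments": []},
    ]
    attached, free = split_by_attachment(sample)
    print("Attached:", attached)
    print("Free:", free)
```

If the free list is empty, all volumes are occupied and a new volume is needed for the run.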

Bugfixes

This release contains several important bugfixes including a bugfix for fleets with placement: cluster (#2302).

What's Changed

Add Deepseek and Intel Examples by @Bihan in dstackai/dstack#2291
Add volume attachments info to the API and CLI by @r4victor in dstackai/dstack#2298
Fix and test offers and pool instances filtering by @r4victor in dstackai/dstack#2303

Full Changelog: dstackai/dstack@0.18.41...0.18.42

Contributors

Bihan and r4victor

