dstackai
diff --git a/docker/server/README.md b/docker/server/README.md
Lines changed: 1 addition & 1 deletion
diff --git a/docs/blog/posts/amd-mi300x-inference-benchmark.md b/docs/blog/posts/amd-mi300x-inference-benchmark.md
Lines changed: 3 additions & 3 deletions
diff --git a/docs/docs/concepts/backends.md b/docs/docs/concepts/backends.md
Lines changed: 2 additions & 3 deletions
diff --git a/docs/docs/concepts/fleets.md b/docs/docs/concepts/fleets.md
Lines changed: 46 additions & 55 deletions
diff --git a/docs/docs/concepts/snippets/manage-fleets.ext b/docs/docs/concepts/snippets/manage-fleets.ext
Lines changed: 5 additions & 10 deletions
diff --git a/docs/docs/concepts/tasks.md b/docs/docs/concepts/tasks.md
Lines changed: 2 additions & 6 deletions
diff --git a/docs/docs/guides/protips.md b/docs/docs/guides/protips.md
Lines changed: 7 additions & 11 deletions
@@ -39,7 +39,7 @@ Configuration is updated at ~/.dstack/config.yml
 ## Create SSH fleets
 
 If you want the `dstack` server to run containers on your on-prem servers,
-use [fleets](https://dstack.ai/docs/concepts/fleets#ssh).
+use [fleets](https://dstack.ai/docs/concepts/fleets#ssh-fleets).
 
 ## More information
 
 
@@ -10,7 +10,7 @@ categories:
 
 # Benchmarking Llama 3.1 405B on 8x AMD MI300X GPUs
 
-At `dstack`, we've been adding support for AMD GPUs with [SSH fleets](../../docs/concepts/fleets.md#ssh), 
+At `dstack`, we've been adding support for AMD GPUs with [SSH fleets](../../docs/concepts/fleets.md#ssh-fleets), 
 so we saw this as a great chance to test our integration by benchmarking AMD GPUs. Our friends at 
 [Hot Aisle :material-arrow-top-right-thin:{ .external }](https://hotaisle.xyz/){:target="_blank"}, who build top-tier 
 bare metal compute for AMD GPUs, kindly provided the hardware for the benchmark.
@@ -34,7 +34,7 @@ Here is the spec of the bare metal machine we got:
 ??? info "Set up an SSH fleet"
 
     Hot Aisle provided us with SSH access to the machine. To make it accessible via `dstack`,
-    we created an [SSH fleet](../../docs/concepts/fleets.md#ssh) using the following configuration:
+    we created an [SSH fleet](../../docs/concepts/fleets.md#ssh-fleets) using the following configuration:
 
     <div editor-title="hotaisle.dstack.yml"> 
 
@@ -215,7 +215,7 @@ If you have questions, feedback, or want to help improve the benchmark, please r
 is the primary sponsor of this benchmark, and we are sincerely grateful for their hardware and support.  
 
 If you'd like to use top-tier bare metal compute with AMD GPUs, we recommend going
-with Hot Aisle. Once you gain access to a cluster, it can be easily accessed via `dstack`'s [SSH fleet](../../docs/concepts/fleets.md#ssh) easily.
+with Hot Aisle. Once you gain access to a cluster, you can easily access it via `dstack`'s [SSH fleet](../../docs/concepts/fleets.md#ssh-fleets).
 
 ### RunPod
 If you’d like to use on-demand compute with AMD GPUs at affordable prices, you can configure `dstack` to
 
@@ -9,9 +9,8 @@ They can be configured via `~/.dstack/server/config.yml` or through the [project
   * [Container-based](#container-based) – use either `dstack`'s native integration with cloud providers or Kubernetes to orchestrate container-based runs; provisioning in this case is delegated to the cloud provider or Kubernetes.  
   * [On-prem](#on-prem) – use `dstack`'s native support for on-prem servers without needing Kubernetes.  
 
-??? info "dstack Sky"
-    If you're using [dstack Sky :material-arrow-top-right-thin:{ .external }](https://sky.dstack.ai){:target="_blank"},  
-    you can either configure your own backends or use the pre-configured backend that gives you access to compute from the GPU marketplace.
+!!! info "dstack Sky"
+    If you're using [dstack Sky :material-arrow-top-right-thin:{ .external }](https://sky.dstack.ai){:target="_blank"}, backend configuration is optional. dstack Sky lets you use pre-configured backends to access the GPU marketplace.
 
 See the examples of backend configuration below.
 
 
@@ -4,36 +4,41 @@ Fleets act both as pools of instances and as templates for how those instances a
 
 `dstack` supports two kinds of fleets: 
 
-* [Standard fleets](#standard) – dynamically provisioned through configured backends; they are supported with any type of backends: [VM-based](backends.md#vm-based), [container-based](backends.md#container-based), and [Kubernetes](backends.md#kubernetes)
+* [Backend fleets](#backend-fleets) – dynamically provisioned through configured backends; they are supported with any type of backend: [VM-based](backends.md#vm-based) and [container-based](backends.md#container-based) (incl. [`kubernetes`](backends.md#kubernetes))
 * [SSH fleets](#ssh) – created using on-prem servers; do not require backends
 
-## Standard fleets { #standard }
+When you run `dstack apply` to start a dev environment, task, or service, `dstack` will reuse idle instances from an existing fleet whenever available.
 
-When you run `dstack apply` to start a dev environment, task, or service, `dstack` will reuse idle instances  
-from an existing fleet whenever available.
+## Backend fleets { #backend-fleets }
 
-If no fleet meets the requirements or has idle capacity, `dstack` can create a new fleet on the fly.  
-However, it’s generally better to define fleets explicitly in configuration files for greater control.  
+If you configured [backends](backends.md), `dstack` can provision fleets on the fly.
+However, it’s recommended to define fleets explicitly.
 
 ### Apply a configuration
 
-Define a fleet configuration as a YAML file in your project directory. The file must have a
+To create a backend fleet, define a configuration as a YAML file in your project directory. The file must have a
 `.dstack.yml` extension (e.g. `.dstack.yml` or `fleet.dstack.yml`).
 
 <div editor-title="examples/misc/fleets/.dstack.yml">
 
     ```yaml
     type: fleet
     # The name is optional, if not specified, generated randomly
-    name: my-fleet
+    name: default-fleet
 
     # Can be a range or a fixed number
-    nodes: 2
+    # Allow provisioning of up to 2 instances
+    nodes: 0..2
+
     # Uncomment to ensure instances are inter-connected
     #placement: cluster
+
+    # Deprovision instances above the minimum if they remain idle
+    idle_duration: 1h
 
     resources:
-      gpu: 24GB
+      # Allow provisioning of up to 8 GPUs
+      gpu: 0..8
     ```
 
 </div>
@@ -48,63 +53,49 @@ $ dstack apply -f examples/misc/fleets/.dstack.yml
 Provisioning...
 ---> 100%
 
- FLEET     INSTANCE  BACKEND              GPU             PRICE    STATUS  CREATED 
- my-fleet  0         gcp (europe-west-1)  L4:24GB (spot)  $0.1624  idle    3 mins ago      
-           1         gcp (europe-west-1)  L4:24GB (spot)  $0.1624  idle    3 mins ago    
+ FLEET          INSTANCE  BACKEND  GPU  PRICE  STATUS  CREATED 
+ default-fleet  -         -        -    -      -       -
 ```
 
 </div>
 
-Once the status of instances changes to `idle`, they can be used by dev environments, tasks, and services.
+`dstack` always keeps the minimum number of nodes provisioned. Additional instances, up to the maximum limit, are provisioned on demand.
 
-??? info "Container-based backends"
-    [Container-based](backends.md#container-based) backends don’t support pre-provisioning,
-    so `nodes` can only be set to a range starting with `0`.
-    
-    This means instances are created only when a run starts, and once it finishes, they’re terminated and released back to the provider (either a cloud service or Kubernetes).
+!!! info "Container-based backends"
+    For [container-based](backends.md#container-based) backends (such as `kubernetes`, `runpod`, etc.), `nodes` must be defined as a range starting with `0`. In these cases, instances are provisioned on demand as needed.
 
-    <div editor-title=".dstack.yml">
+    <!-- TODO: Ensure the user sees the error or warning otherwise -->
 
-    ```yaml
-    type: fleet
-    # The name is optional, if not specified, generated randomly
-    name: my-fleet
-    
-    # Specify the number of instances
-    nodes: 0..2
-    # Uncomment to ensure instances are inter-connected
-    #placement: cluster
-    
-    resources:
-      gpu: 24GB
-    ```
+??? info "Target number of nodes"
 
-    </div>
+    If `nodes` is defined as a range, you can start with more than the minimum number of instances by using the `target` parameter when creating the fleet.
 
-### Configuration options
+    <div editor-title=".dstack.yml"> 
 
-#### Nodes { #nodes }
+    ```yaml
+    type: fleet
 
-The `nodes` property controls how many instances to provision and maintain in the fleet:
+    name: my-fleet
 
-<div editor-title=".dstack.yml"> 
+    nodes:
+      min: 0
+      max: 2
 
-```yaml
-type: fleet
+      # Provision 2 instances initially
+      target: 2
 
-name: my-fleet
+    # Deprovision instances above the minimum if they remain idle
+    idle_duration: 1h
+    ```
 
-nodes:
-  min: 1 # Always maintain at least 1 idle instance. Can be 0.
-  target: 2 # (Optional) Provision 2 instances initially
-  max: 3 # (Optional) Do not allow more than 3 instances
-```
+    </div>
 
-</div>
+By default, when you submit a [dev environment](dev-environments.md), [task](tasks.md), or [service](services.md), `dstack` tries all available fleets. However, you can explicitly specify the [`fleets`](../reference/dstack.yml/dev-environment.md#fleets) in your run configuration
+or via [`--fleet`](../reference/cli/dstack/apply.md#fleet) with `dstack apply`.
 
-`dstack` ensures the fleet always has at least `nodes.min` instances, creating new instances in the background if necessary. If you don't need to keep instances in the fleet forever, you can set `nodes.min` to `0`. By default, `dstack apply` also provisions `nodes.min` instances. The `nodes.target` property allows provisioning more instances initially than needs to be maintained.
+### Configuration options
 
-#### Placement { #standard-placement }
+#### Placement { #backend-placement }
 
 To ensure instances are interconnected (e.g., for
 [distributed tasks](tasks.md#distributed-tasks)), set `placement` to `cluster`. 
@@ -190,9 +181,9 @@ and their quantity. Examples: `nvidia` (one NVIDIA GPU), `A100` (one A100), `A10
 > If you’re unsure which offers (hardware configurations) are available from the configured backends, use the
 > [`dstack offer`](../reference/cli/dstack/offer.md#list-gpu-offers) command to list them.
 
-#### Blocks { #standard-blocks }
+#### Blocks { #backend-blocks }
 
-For standard fleets, `blocks` function the same way as in SSH fleets. 
+For backend fleets, `blocks` function the same way as in SSH fleets. 
 See the [`Blocks`](#ssh-blocks) section under SSH fleets for details on the blocks concept.
 
 <div editor-title=".dstack.yml">
@@ -214,10 +205,10 @@ blocks: 4
 #### Idle duration
 
 By default, fleet instances stay `idle` for 3 days and can be reused within that time.
-If the fleet is not reused within this period, it is automatically terminated.
+If an instance is not reused within this period, it is automatically terminated.
 
 To change the default idle duration, set
-[`idle_duration`](../reference/dstack.yml/fleet.md#idle_duration) in the run configuration (e.g., `0s`, `1m`, or `off` for
+[`idle_duration`](../reference/dstack.yml/fleet.md#idle_duration) in the fleet configuration (e.g., `0s`, `1m`, or `off` for
 unlimited).
 
 <div editor-title="examples/misc/fleets/.dstack.yml">
@@ -272,13 +263,13 @@ retry:
 </div>
 
 !!! info "Reference"
-    Standard fleets support many more configuration options,
+    Backend fleets support many more configuration options,
     incl. [`backends`](../reference/dstack.yml/fleet.md#backends), 
     [`regions`](../reference/dstack.yml/fleet.md#regions), 
     [`max_price`](../reference/dstack.yml/fleet.md#max_price), and
     among [others](../reference/dstack.yml/fleet.md).
 
-## SSH fleets { #ssh }
+## SSH fleets { #ssh-fleets }
 
 If you have a group of on-prem servers accessible via SSH, you can create an SSH fleet.
 
 
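The fleets.md change above notes that a run can be restricted to specific fleets through the `fleets` property. A minimal sketch of such a run configuration, assuming a fleet named `default-fleet` already exists and that `fleets` accepts a list of fleet names; the task name and command are hypothetical placeholders:

```yaml
type: task
# Hypothetical run name
name: train

# Only consider instances from this fleet (assumed to exist already)
fleets: [default-fleet]

commands:
  # Placeholder command
  - python train.py

resources:
  gpu: 24GB
```

The same restriction can be applied to a single invocation by passing `--fleet` to `dstack apply`.
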
@@ -1,10 +1,10 @@
 ### Creation policy
 
 By default, when you run `dstack apply` with a dev environment, task, or service,
-if no `idle` instances from the available fleets meet the requirements, `dstack` creates a new fleet 
+if no `idle` instances from the available fleets meet the requirements, `dstack` provisions a new instance
 using configured backends.
 
-To ensure `dstack apply` doesn't create a new fleet but reuses an existing one,
+To ensure `dstack apply` doesn't provision a new instance but reuses an existing one,
 pass `-R` (or `--reuse`) to `dstack apply`.
 
 <div class="termy">
@@ -19,12 +19,7 @@ Or, set [`creation_policy`](../reference/dstack.yml/dev-environment.md#creation_
 
 ### Idle duration
 
-If a fleet is created automatically, it stays `idle` for 5 minutes by default and can be reused within that time.
-If the fleet is not reused within this period, it is automatically terminated.
+If a run provisions a new instance, the instance stays `idle` for 5 minutes by default and can be reused within that time.
+If the instance is not reused within this period, it is automatically terminated.
 To change the default idle duration, set
-[`idle_duration`](../reference/dstack.yml/fleet.md#idle_duration) in the run configuration (e.g., `0s`, `1m`, or `off` for
-unlimited).
-
-!!! info "Fleets"
-    For greater control over fleet provisioning, it is recommended to create
-    [fleets](fleets.md) explicitly.
+[`idle_duration`](../reference/dstack.yml/fleet.md#idle_duration) in the run configuration (e.g., `0s`, `1m`, or `off` for unlimited).
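
The creation policy and idle duration described in this snippet are both set in the run configuration. A minimal sketch, assuming a dev environment; the name and `ide` value are illustrative and not part of this change:

```yaml
type: dev-environment
# Hypothetical run name
name: vscode
ide: vscode

# Uncomment to only reuse existing idle instances (same effect as `-R`)
#creation_policy: reuse

# Terminate an instance provisioned by this run after 1 minute of being idle
idle_duration: 1m
```

Setting `idle_duration` to `off` instead would keep such an instance idle without a time limit.
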
@@ -170,12 +170,8 @@ Use `DSTACK_MASTER_NODE_IP`, `DSTACK_NODES_IPS`, `DSTACK_NODE_RANK`, and other
     For convenience, `~/.ssh/config` is preconfigured with these options, so a simple `ssh <node_ip>` is enough.
     For a list of nodes IPs check the `DSTACK_NODES_IPS` environment variable.
 
-!!! info "Fleets"
-    Distributed tasks can only run on fleets with
-    [cluster placement](fleets.md#cloud-placement).
-    While `dstack` can provision such fleets automatically, it is
-    recommended to create them via a fleet configuration
-    to ensure the highest level of inter-node connectivity.
+!!! info "Cluster fleets"
+    To run distributed tasks, you need to create a fleet with [`placement: cluster`](fleets.md#backend-placement).
 
 > See the [Clusters](../guides/clusters.md) guide for more details on how to use `dstack` on clusters.
 
 
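The "Cluster fleets" note above requires a fleet with `placement: cluster` for distributed tasks. A minimal sketch of such a fleet configuration, reusing the properties shown in the fleets.md example; the fleet name is hypothetical:

```yaml
type: fleet
# Hypothetical fleet name
name: cluster-fleet

# Fixed number of instances
nodes: 2
# Ensure instances are inter-connected
placement: cluster

resources:
  gpu: 24GB
```

A distributed task running on this fleet can then use `DSTACK_NODE_RANK`, `DSTACK_MASTER_NODE_IP`, and `DSTACK_NODES_IPS` to coordinate its nodes.
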
@@ -190,11 +190,9 @@ See more Docker examples [here](https://github.com/dstackai/dstack/tree/master/e
 ### Creation policy
 
 By default, when you run `dstack apply` with a dev environment, task, or service,
-`dstack` reuses `idle` instances from an existing [fleet](../concepts/fleets.md).
-If no `idle` instances match the requirements, `dstack` automatically creates a new fleet 
-using configured backends.
+if no `idle` instances from the available fleets meet the requirements, `dstack` provisions a new instance using configured backends.
 
-To ensure `dstack apply` doesn't create a new fleet but reuses an existing one,
+To ensure `dstack apply` doesn't provision a new instance but reuses an existing one,
 pass `-R` (or `--reuse`) to `dstack apply`.
 
 <div class="termy">
@@ -205,16 +203,14 @@ $ dstack apply -R -f examples/.dstack.yml
 
 </div>
 
+Or, set [`creation_policy`](../reference/dstack.yml/dev-environment.md#creation_policy) to `reuse` in the run configuration.
+
 ### Idle duration
 
-If a fleet is created automatically, it stays `idle` for 5 minutes by default and can be reused within that time.
-If the fleet is not reused within this period, it is automatically terminated.
+If a run provisions a new instance, the instance stays `idle` for 5 minutes by default and can be reused within that time.
+If the instance is not reused within this period, it is automatically terminated.
 To change the default idle duration, set
-[`idle_duration`](../reference/dstack.yml/fleet.md#idle_duration) in the run configuration (e.g., `0s`, `1m`, or `off` for
-unlimited).
-
-> For greater control over fleet provisioning, configuration, and lifecycle management, it is recommended to use
-> [fleets](../concepts/fleets.md) directly.
+[`idle_duration`](../reference/dstack.yml/fleet.md#idle_duration) in the run configuration (e.g., `0s`, `1m`, or `off` for unlimited).
 
 ## Volumes