
Commit 2454666

Author: Andrey Cheptsov

Update blog post on agentic orchestration and enhance CSS for heading links

- Removed 'Changelog' category from the blog post metadata.
- Improved formatting in the blog post for better readability, including consistent use of backticks for code references.
- Added CSS rule to hide permalink anchors on top-level headings.

1 parent 7af1b70 commit 2454666

2 files changed: 76 additions & 46 deletions

File tree

docs/assets/stylesheets/extra.css

Lines changed: 5 additions & 0 deletions
````diff
@@ -603,6 +603,11 @@ code .md-code__nav:hover .md-code__button {
   font-size: 33px;
 }
 
+/* Hide permalink anchor on top-level headings only */
+.md-typeset h1 > .headerlink {
+  display: none;
+}
+
 .md-typeset h2 {
   margin: 1.4em 0 0.64em;
   padding-top: 0.2em;
````
docs/blog/posts/agentic-orchestration.md

Lines changed: 71 additions & 46 deletions
````diff
@@ -4,8 +4,6 @@ date: 2026-03-10
 description: "Agentic engineering pulls compute discovery, provisioning, scheduling, and observability into the execution loop. Infrastructure orchestration is becoming an agent skill."
 slug: agentic-orchestration
 image: https://dstack.ai/static-assets/static-assets/images/agentic-orchestration.png
-categories:
-  - Changelog
 ---
 
 # Infrastructure orchestration is an agent skill
````
````diff
@@ -52,23 +50,21 @@ Providers that fit that layer become much easier to integrate into agent-driven
 
 ## What this looks like with dstack
 
-dstack is an open-source control plane for provisioning GPU compute and orchestrating GPU workloads across a range of environments, including clouds, Kubernetes, and on-prem clusters. It exposes that infrastructure surface to agents and human operators through the CLI and configuration files.
+`dstack` is an open-source control plane for provisioning GPU compute and orchestrating GPU workloads across a range of environments, including clouds, Kubernetes, and on-prem clusters. It exposes that infrastructure surface to agents and human operators through the CLI and configuration files.
 
 **Step 1: treat available compute as queryable state**
 
 `dstack offer` turns available compute into something the workflow can query directly. It returns offers from configured backends and managed capacity, including region, resources, spot availability, and price.
 
 ```shell
-dstack offer --gpu H100:1.. --max-offers 3
-```
-
-```shell
- #  BACKEND  REGION     INSTANCE TYPE     RESOURCES                                      SPOT  PRICE
- 1  verda    FIN-01     1H100.80S.30V     30xCPU, 120GB, 1xH100 (80GB), 100.0GB (disk)  no    $2.19
- 2  runpod   US-KS-2    NVIDIA H100 PCIe  16xCPU, 251GB, 1xH100 (80GB), 100.0GB (disk)  no    $2.39
- 3  nebius   eu-north1  gpu-h100-sxm      16xCPU, 200GB, 1xH100 (80GB), 100.0GB (disk)  no    $2.95
-...
-Shown 3 of 99 offers
+$ dstack offer --gpu H100:1.. --max-offers 3
+
+ #  BACKEND  REGION     INSTANCE TYPE     RESOURCES                                      SPOT  PRICE
+ 1  verda    FIN-01     1H100.80S.30V     30xCPU, 120GB, 1xH100 (80GB), 100.0GB (disk)  no    $2.19
+ 2  runpod   US-KS-2    NVIDIA H100 PCIe  16xCPU, 251GB, 1xH100 (80GB), 100.0GB (disk)  no    $2.39
+ 3  nebius   eu-north1  gpu-h100-sxm      16xCPU, 200GB, 1xH100 (80GB), 100.0GB (disk)  no    $2.95
+...
+Shown 3 of 99 offers
 ```
 
 In an agentic workflow, compute selection becomes part of execution. The workflow can inspect available capacity before deciding what to run.
````
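An agent in such a loop would typically parse and rank the offer table before provisioning. A minimal Python sketch of that step, with hypothetical helper names; the sample text mirrors the output above, but the column layout is an assumption about this version's formatting, not a stable interface:

```python
# Hypothetical parser for the tabular output of `dstack offer`.
# The sample string mirrors the output shown above (RESOURCES column
# omitted for brevity); it is illustrative, not a stable format.

SAMPLE = """\
 #  BACKEND  REGION     INSTANCE TYPE     SPOT  PRICE
 1  verda    FIN-01     1H100.80S.30V     no    $2.19
 2  runpod   US-KS-2    NVIDIA H100 PCIe  no    $2.39
 3  nebius   eu-north1  gpu-h100-sxm      no    $2.95
"""

def parse_offers(text: str) -> list[dict]:
    """Turn the offer table into dicts with backend, region, and price."""
    offers = []
    for line in text.splitlines()[1:]:      # skip the header row
        parts = line.split()
        if not parts or not parts[0].isdigit():
            continue                        # skip "..." and footer lines
        offers.append({
            "backend": parts[1],
            "region": parts[2],
            "price": float(parts[-1].lstrip("$")),
        })
    return offers

# Pick the cheapest offer before deciding where to run.
cheapest = min(parse_offers(SAMPLE), key=lambda o: o["price"])
print(cheapest["backend"], cheapest["price"])   # verda 2.19
```

The point is only that the output is regular enough to treat as queryable state; a real agent would apply the same selection logic to live `dstack offer` output.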
````diff
@@ -89,32 +85,40 @@ resources:
   blocks: 4
 ```
 
-A fleet is dstack's unit of provisioning control. It can represent an elastic template over cloud or Kubernetes backends, a pre-provisioned pool, or a set of SSH-managed on-prem hosts. This is how dstack keeps provisioning explicit and bounded: the agent operates within declared capacity instead of interacting with provider infrastructure directly.
+A fleet is `dstack`'s unit of provisioning control. It can represent an elastic template over cloud or Kubernetes backends, a pre-provisioned pool, or a set of SSH-managed on-prem hosts. This is how `dstack` keeps provisioning explicit and bounded: the agent operates within declared capacity instead of interacting with provider infrastructure directly.
+
+<div class="termy">
 
 ```shell
-dstack apply -f fleet.dstack.yml
+$ dstack apply -f fleet.dstack.yml
 ```
 
+</div>
+
 In this context, `dstack apply` creates or updates the fleet resource. If the fleet is only a template, later runs can draw instances from it on demand. If it is pre-provisioned, the capacity is already present.
 
+<div class="termy">
+
 ```shell
-dstack fleet
+$ dstack fleet
 
 NAME         NODES  GPU          SPOT       BACKEND       PRICE    STATUS  CREATED
 gpu-cluster  2..4   A100:80GB:8  auto       aws           $0..$32  active  2 hours ago
   instance=0        A100:80GB:8  spot       aws (us-ea…)  $28.50   busy    2 hours ago
   instance=1        A100:80GB:8  spot       gcp (us-ce…)  $26.80   busy    1 hour ago
-on-prem      4      -            -          ssh           -        active  3 days ago
+on-prem      2      -            -          ssh           -        active  3 days ago
   instance=0        A100:40GB:4  -          ssh           -        busy    3 days ago
   instance=1        A100:40GB:4  -          ssh           -        idle    3 days ago
 test-fleet   0..1   gpu:16GB     on-demand  *             -        active  10 min ago
 ```
 
+</div>
+
 In an agentic workflow, this gives the agent a visible provisioning surface: it can see which fleets exist, what capacity they expose, and whether that capacity is active, busy, or idle before deciding what to run next.
 
 **Step 3: run evaluation or training loops as tasks**
 
-Tasks are dstack's workload type for evaluation, fine-tuning, training, and other job-oriented workflows. They can also be distributed, in which case dstack handles cluster selection and job coordination across nodes.
+Tasks are `dstack`'s workload type for evaluation, fine-tuning, training, and other job-oriented workflows. They can also be distributed, in which case `dstack` handles cluster selection and job coordination across nodes.
 
 ```yaml
 # train.dstack.yml
````
````diff
@@ -124,7 +128,7 @@ name: train-qwen
 image: huggingface/trl-latest-gpu
 working_dir: /workspace
 
-repos:
+files:
   - .:/workspace
 
 commands:
````
````diff
@@ -137,19 +141,27 @@ resources:
   shm_size: 16GB
 ```
 
-Once a task is running, the agent may need to re-attach to the session, open a shell inside the container, or inspect runtime state before deciding what to do next. dstack exposes each of those actions directly.
+Once a task is running, the agent may need to re-attach to the session, open a shell inside the container, or inspect runtime state before deciding what to do next. `dstack` exposes each of those actions directly.
+
+<div class="termy">
 
 ```shell
-dstack attach train-qwen --logs
+$ dstack attach train-qwen --logs
 ```
 
+</div>
+
+<div class="termy">
+
 ```shell
-ssh train-qwen
+$ ssh train-qwen
 ```
 
+</div>
+
 **Step 4: run model inference as services**
 
-Services are dstack's workload type for long-lived inference endpoints. The same control plane that runs training and evaluation jobs can also deploy model-serving endpoints with stable URLs, autoscaling rules, and health checks.
+Services are `dstack`'s workload type for long-lived inference endpoints. The same control plane that runs training and evaluation jobs can also deploy model-serving endpoints with stable URLs, autoscaling rules, and health checks.
 
 ```yaml
 # serve.dstack.yml
````
````diff
@@ -169,7 +181,6 @@ commands:
     --trust-remote-code
 
 port: 8000
-gateway: true
 model: Qwen/Qwen2.5-32B-Instruct
 replicas: 1..4
 scaling:
````
````diff
@@ -183,69 +194,83 @@ resources:
 
 The endpoint can then be accessed directly, including from another agent step:
 
+<div class="termy">
+
 ```shell
-curl https://qwen25-instruct.example.com/v1/chat/completions \
-  -H 'Content-Type: application/json' \
-  -H 'Authorization: Bearer <dstack token>' \
-  -d '{
-    "model": "Qwen/Qwen2.5-32B-Instruct",
-    "messages": [{"role": "user", "content": "Hello"}]
-  }'
+$ curl https://qwen25-instruct.example.com/v1/chat/completions \
+    -H 'Content-Type: application/json' \
+    -H 'Authorization: Bearer <dstack token>' \
+    -d '{
+      "model": "Qwen/Qwen2.5-32B-Instruct",
+      "messages": [{"role": "user", "content": "Hello"}]
+    }'
 ```
 
+</div>
+
 The agent can launch the service, call the endpoint, and scale it through the same orchestration layer.
 
 **Step 5: observe through events and metrics**
 
 `dstack` exposes structured lifecycle data through events and metrics, so the loop can inspect state transitions and resource usage directly instead of inferring everything from logs.
 
-```shell
-dstack event --within-run train-qwen
-```
+<div class="termy">
 
 ```shell
-[2026-01-21 13:09:37] [run train-qwen] Run submitted. Status: SUBMITTED
-[2026-01-21 13:09:57] [job train-qwen-0-0] Job status changed SUBMITTED -> PROVISIONING
-[2026-01-21 13:11:49] [job train-qwen-0-0] Job status changed PULLING -> RUNNING
-```
+$ dstack event --within-run train-qwen
 
-```shell
-dstack metrics train-qwen
+[2026-01-21 13:09:37] [run train-qwen] Run submitted. Status: SUBMITTED
+[2026-01-21 13:09:57] [job train-qwen-0-0] Job status changed SUBMITTED -> PROVISIONING
+[2026-01-21 13:11:49] [job train-qwen-0-0] Job status changed PULLING -> RUNNING
 ```
 
+</div>
+
+<div class="termy">
+
 ```shell
+$ dstack metrics train-qwen
+
 NAME        STATUS   CPU  MEMORY       GPU
 train-qwen  running  92%  118GB/200GB  gpu=0 mem=71GB/80GB util=97%
 ```
 
+</div>
+
 Taken together, these are the fine-grained primitives a fully autonomous agent needs: discover capacity, provision it, run the right workload type, inspect state, and decide what to do next without handing orchestration back to a human operator.
 
 ## Skills
 
-Those primitives become much more useful when they are paired with operational knowledge. dstack already ships an installable [agent skill](https://github.com/dstackai/dstack/blob/master/.agents/skills/dstack/SKILL.md) and documents how to install it:
+Those primitives become much more useful when they are paired with operational knowledge. `dstack` already ships an installable [agent skill](https://github.com/dstackai/dstack/blob/master/.agents/skills/dstack/SKILL.md) and documents how to install it:
+
+<div class="termy">
 
 ```shell
-npx skills add dstackai/dstack
+$ npx skills add dstackai/dstack
 ```
 
-> Skills are where operational know-how can live: how to run training, fine-tuning, inference, evals, and other specialized workflows against the orchestration layer. This should not stop at one built-in skill. The ecosystem needs specialized skills that encode the operational patterns agents actually use for these workloads.
+</div>
+
+Skills are where operational know-how can live: how to run training, fine-tuning, inference, evals, and other specialized workflows against the orchestration layer.
+
+> One built-in skill is only a start. The ecosystem needs specialized skills that encode the operational patterns agents actually use for these workloads.
 
 ## Governance and permissions
 
 As infrastructure management is delegated to agents, governance and observability become part of the orchestration model itself, not something added later around it.
 
-dstack already exposes part of that model through projects and permissions. Projects isolate teams and resources, define access boundaries, and control which backends and infrastructure surfaces an agent or user can operate against.
+`dstack` already exposes part of that model through projects and permissions. Projects isolate teams and resources, define access boundaries, and control which backends and infrastructure surfaces an agent or user can operate against.
 
 ## Why open source and the ecosystem matter here
 
 If agents are going to provision compute and orchestrate workloads directly, the control plane cannot be a black box.
 
 Teams need to see which backends it supports, how scheduling decisions are made, how permissions are enforced, and how lifecycle state is exposed. They also need to extend it: add new providers, refine operational policies, and encode better training, fine-tuning, inference, and evaluation workflows as reusable skills and recipes.
 
-dstack is MPL-2.0 licensed and designed around backends, fleets, projects, events, and metrics that can span different capacity sources. That matters because agentic orchestration will not be built once inside a single vendor boundary; it will be assembled across clouds, Kubernetes, on-prem infrastructure, and a growing ecosystem of specialized operational patterns.
+`dstack` is MPL-2.0 licensed and designed around backends, fleets, projects, events, and metrics that can span different capacity sources. That matters because agentic orchestration will not be built once inside a single vendor boundary; it will be assembled across clouds, Kubernetes, on-prem infrastructure, and a growing ecosystem of specialized operational patterns.
 
 ## What's next
 
 If you are already running agent-driven loops, feedback on the hard parts is especially useful: what still forces a human back into the path, which signals are missing, where provider integration still feels manual, and which specialized skills or recipes would be most valuable.
 
-If you want to use dstack for these workflows or contribute to the surrounding ecosystem, issues and feedback are welcome in the [GitHub repo](https://github.com/dstackai/dstack).
+If you want to use `dstack` for these workflows or contribute to the surrounding ecosystem, issues and feedback are welcome in the [GitHub repo](https://github.com/dstackai/dstack).
````
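The structured event stream in this post is what lets a loop branch on lifecycle state instead of scraping free-form logs. A small, hypothetical Python sketch of that idea; the sample lines mirror the `dstack event` output shown earlier, and the parser is an illustration, not part of dstack:

```python
import re

# Sample lines mirroring the `dstack event` output shown in the post.
EVENTS = """\
[2026-01-21 13:09:37] [run train-qwen] Run submitted. Status: SUBMITTED
[2026-01-21 13:09:57] [job train-qwen-0-0] Job status changed SUBMITTED -> PROVISIONING
[2026-01-21 13:11:49] [job train-qwen-0-0] Job status changed PULLING -> RUNNING
"""

# Hypothetical helper: extract the most recent job status so an agent
# can decide what to do next (wait, attach, retry) from state, not logs.
STATUS_RE = re.compile(r"status changed \w+ -> (\w+)", re.IGNORECASE)

def latest_status(events: str):
    """Return the last status transition target, or None if none found."""
    status = None
    for line in events.splitlines():
        match = STATUS_RE.search(line)
        if match:
            status = match.group(1)
    return status

print(latest_status(EVENTS))  # RUNNING
```

Under this assumption, the agent's decision step reduces to a comparison on the returned status string, which is exactly the "inspect state, then act" loop the post describes.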
