Commit 5b1233c — Add case study on Graphsignal's use of dstack for inference benchmarking (#3751)
Authored by peterschmidt85 and Andrey Cheptsov
Co-authored-by: Andrey Cheptsov <andrey.cheptsov@github.com>
Parent: 2831c81

* Add case study on Graphsignal's use of dstack for inference benchmarking
* Update Graphsignal case study: enlarge image and add next steps section
* Fix formatting of 'autodebug' term in Graphsignal case study

1 file changed: docs/blog/posts/graphsignal.md (+127, -0)
---
title: "How Graphsignal uses dstack for inference benchmarking"
date: 2026-04-08
description: "How Graphsignal uses dstack as a unified layer for GPU development, inference deployment, and benchmarking across on-prem systems and GPU clouds."
slug: graphsignal
image: https://dstack.ai/static-assets/static-assets/images/dstack-graphsignal.png
categories:
  - Case studies
links:
  - Graphsignal's autodebug blog: https://graphsignal.com/blog/autodebug-telemetry-driven-inference-optimization-loop/
---

# How Graphsignal uses dstack for inference benchmarking

In a recent engineering [blog post](https://graphsignal.com/blog/autodebug-telemetry-driven-inference-optimization-loop/), Graphsignal shared `autodebug`, an autonomous loop that deploys an inference service, benchmarks it, updates the deployment config, and redeploys it. This case study looks at the team workflow behind that setup and at how `dstack` gives Graphsignal a common layer for GPU development, inference deployment, and benchmarking.

<img src="https://dstack.ai/static-assets/static-assets/images/dstack-graphsignal.png" width="630" />

<!-- more -->

[Graphsignal](https://graphsignal.com/) builds inference observability and AI debugging tooling for teams running production inference across models, engines, and GPUs. That puts the team close to the systems they measure and tune: inference servers, GPU infrastructure, deployment workflows, and benchmark loops.

To benchmark and optimize inference efficiently, the Graphsignal team combines:

- on-prem GPU systems, including [NVIDIA DGX Spark](https://www.nvidia.com/en-us/products/workstations/dgx-spark/) devices managed through `dstack`
- cloud GPU capacity, including [Verda](https://verda.com/) as a supported `dstack` backend
- `dstack` as the common orchestration layer for GPU development and inference deployment

For Graphsignal, the same operational model applies across on-prem systems and GPU clouds. The team can develop on GPU-backed environments, deploy inference services, and rerun benchmarks without switching orchestration models between environments.

Many teams running inference need a workflow that:

- works across different GPU environments
- supports both development and production
- does not require building and maintaining custom orchestration for every provider

`dstack` gives the Graphsignal team a declarative way to provision GPU resources, deploy inference services, and iterate on deployment configs across environments without introducing a separate control plane for each provider.
> *`dstack` gives us a unified layer for GPU development and inference across on-prem systems and GPU clouds. It is fine-grained enough for serious inference engineering, but simple enough that we do not have to build and maintain custom orchestration around every GPU environment we use.*
>
> *— **Dmitry Melikyan**, Founder at Graphsignal*

The Graphsignal team primarily uses these `dstack` components:

- [Dev environments](../../docs/concepts/dev-environments.md) — for GPU-backed development and experimentation
- [Services](../../docs/concepts/services.md) — for deploying inference endpoints and running benchmarkable workloads
- [Fleets](../../docs/concepts/fleets.md) — for spanning on-prem systems and cloud backends through one interface
- the `dstack` CLI — with `dstack apply` used directly in the deployment and benchmarking loop

In practice, this gives the Graphsignal team a way to:

- move from GPU development to production inference without changing orchestration layers
- turn a serving change into a fresh, versioned deployment
- run benchmarks on real hardware across on-prem and cloud environments
- keep the same workflow for development, deployment, and repeated optimization
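
Since `dstack apply` drives that loop, each redeploy is a single CLI call. A representative invocation might look like the following (the config filename is illustrative, not from Graphsignal's setup):

<div class="termy">

```shell
$ dstack apply -f service.dstack.yml
```

</div>

Rerunning the same command after editing the config produces a fresh deployment of the service, which is what makes the redeploy step scriptable.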

The examples below are representative `dstack` configurations that illustrate the workflow described above. They are included to show how the same control plane can span on-prem hosts and cloud backends, not as Graphsignal's production configs.

For on-prem systems such as DGX Spark devices, `dstack` can manage multiple hosts through a single SSH fleet definition.

<div editor-title="spark.dstack.yml">

```yaml
type: fleet
name: graphsignal-onprem

ssh_config:
  user: ubuntu
  identity_file: ~/.ssh/id_rsa
  hosts:
    - dgx-spark-1
    - dgx-spark-2
    - dgx-spark-3
```

</div>

For cloud GPUs, `dstack` supports Verda as a native backend.

<div editor-title="~/.dstack/server/config.yml">

```yaml
projects:
  - name: main
    backends:
      - type: verda
        creds:
          type: api_key
          client_id: YOUR_CLIENT_ID
          client_secret: YOUR_CLIENT_SECRET
```

</div>
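
For the inference endpoint itself, a service definition ties these pieces together. The sketch below is hypothetical, not a Graphsignal config: the vLLM image, model name, port, and GPU size are all illustrative placeholders.

<div editor-title="service.dstack.yml">

```yaml
type: service
name: llm-service

# Illustrative serving stack; swap in any inference engine
image: vllm/vllm-openai:latest
env:
  - MODEL=meta-llama/Llama-3.1-8B-Instruct
commands:
  - vllm serve $MODEL --port 8000
port: 8000

resources:
  gpu: 24GB
```

</div>

Editing a field such as the model, engine flags, or GPU size and re-applying the config is what turns a serving change into a fresh, versioned deployment.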

For Graphsignal, `dstack` acts as a unified orchestration layer for GPU development and inference across on-prem systems and GPU clouds. It gives both developers and agents a fine-grained interface for editing configs, deploying services, and iterating on infrastructure without switching tools or rebuilding workflows around each environment.

For agentic workflows, [`dstack` skills](https://skills.sh/dstackai/dstack/dstack) extend that same interface to tools such as Claude Code, Codex, and Cursor.

<div class="termy">

```shell
$ npx skills add dstackai/dstack
```

</div>

Once installed, the skills let an agent work directly with `dstack` configs and CLI commands: create or edit a `*.dstack.yml`, apply the configuration, check run status, and manage fleets.
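
As an illustration of the status-checking side, an agent (or a developer) can inspect state with the `dstack` CLI; exact output columns vary by version, so check `dstack --help` for your installation:

<div class="termy">

```shell
$ dstack ps      # show runs and their status
$ dstack fleet   # list fleets and their instances
```

</div>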

Claude Code can use Graphsignal telemetry to decide what to change next, then use `dstack` to generate the updated service config and invoke the CLI on the team’s behalf.

<img src="https://dstack.ai/static-assets/static-assets/images/graphsignal-debug-chat.png" width="750" />

The point is not a single benchmark run, but a repeatable workflow in which deployment, measurement, and optimization stay inside the same system.

> *Agentic engineering is changing not only how code gets written, but how compute gets orchestrated and how inference gets optimized. Once the deployment layer is programmable, agents can participate directly in benchmarking, redeployment, and performance tuning.*
>
> *— **Dmitry Melikyan**, Founder at Graphsignal*

Instead of treating performance testing as a separate script, the team can run it as a loop: benchmark a live endpoint, inspect logs and telemetry for the same time window, identify bottlenecks, update the `dstack` service config, redeploy, and run the next iteration.
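
One iteration of such a loop can be sketched in shell form. Everything here is a placeholder except the `dstack` commands themselves: `benchmark.py` and `SERVICE_URL` stand in for whatever load generator and endpoint the team actually uses.

<div class="termy">

```shell
$ # redeploy the service from the (possibly edited) config, skipping confirmation
$ dstack apply -f service.dstack.yml -y
$ # benchmark the live endpoint with a placeholder load-test script
$ python benchmark.py --endpoint "$SERVICE_URL"
$ # inspect logs and telemetry, edit service.dstack.yml, then repeat
```

</div>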

*Huge thanks to Dmitry Melikyan and Bogdan Sulima at Graphsignal for feedback and collaboration. For more details, see Graphsignal’s engineering post on [autodebug](https://graphsignal.com/blog/autodebug-telemetry-driven-inference-optimization-loop/).*

!!! info "What's next?"
    1. Follow the [`Installation`](../../docs/installation.md) and [`Quickstart`](../../docs/quickstart.md) guides
    2. Explore [`dev environments`](../../docs/concepts/dev-environments.md), [`tasks`](../../docs/concepts/tasks.md), [`services`](../../docs/concepts/services.md), and [`fleets`](../../docs/concepts/fleets.md)
    3. Use Graphsignal’s [`dstack` integration guide](https://graphsignal.com/docs/integrations/dstack/) to add profiling, tracing, and monitoring to a `dstack` inference service