Commit ead8c8a

Bihan Rana authored
Add SgLang Example (#2461)
Co-authored-by: Bihan Rana <bihan@Bihans-MacBook-Pro.local>
1 parent ec8a12a commit ead8c8a

File tree

4 files changed: +143 -0 lines changed

docs/examples.md

Lines changed: 9 additions & 0 deletions

@@ -41,6 +41,15 @@ hide:
         Deploy Llama 3.1 with NIM
       </p>
     </a>
+    <a href="/examples/deployment/sglang"
+       class="feature-cell">
+      <h3>
+        SGLang
+      </h3>
+      <p>
+        Deploy DeepSeek-R1-Distill-Llama 8B & 70B with SGLang
+      </p>
+    </a>
   </div>

 ## Fine-tuning

docs/examples/deployment/sglang/index.md

Whitespace-only changes.
Lines changed: 133 additions & 0 deletions
@@ -0,0 +1,133 @@

# SGLang

This example shows how to deploy DeepSeek-R1-Distill-Llama 8B and 70B using [SGLang :material-arrow-top-right-thin:{ .external }](https://github.com/sgl-project/sglang){:target="_blank"} and `dstack`.

??? info "Prerequisites"
    Once `dstack` is [installed](https://dstack.ai/docs/installation), go ahead and clone the repo, and run `dstack init`.

    <div class="termy">

    ```shell
    $ git clone https://github.com/dstackai/dstack
    $ cd dstack
    $ dstack init
    ```

    </div>

## Deployment
Here's an example of a service that deploys DeepSeek-R1-Distill-Llama 8B and 70B using SGLang.
21+
=== "AMD"
22+
23+
<div editor-title="examples/deployment/sglang/amd/.dstack.yml">
24+
25+
```yaml
26+
type: service
27+
name: deepseek-r1-amd
28+
29+
image: lmsysorg/sglang:v0.4.1.post4-rocm620
30+
env:
31+
- MODEL_ID=deepseek-ai/DeepSeek-R1-Distill-Llama-70B
32+
33+
commands:
34+
- python3 -m sglang.launch_server
35+
--model-path $MODEL_ID
36+
--port 8000
37+
--trust-remote-code
38+
39+
port: 8000
40+
model: deepseek-ai/DeepSeek-R1-Distill-Llama-70B
41+
42+
resources:
43+
gpu: MI300x
44+
disk: 300GB
45+
```
46+
</div>
47+
48+
=== "NVIDIA"
49+
50+
<div editor-title="examples/deployment/sglang/nvidia/.dstack.yml">
51+
52+
```yaml
53+
type: service
54+
name: deepseek-r1-nvidia
55+
56+
image: lmsysorg/sglang:latest
57+
env:
58+
- MODEL_ID=deepseek-ai/DeepSeek-R1-Distill-Llama-8B
59+
60+
commands:
61+
- python3 -m sglang.launch_server
62+
--model-path $MODEL_ID
63+
--port 8000
64+
--trust-remote-code
65+
66+
port: 8000
67+
model: deepseek-ai/DeepSeek-R1-Distill-Llama-8B
68+
69+
resources:
70+
gpu: 24GB
71+
```
72+
</div>
73+
74+
75+
### Applying the configuration

To run a configuration, use the [`dstack apply`](https://dstack.ai/docs/reference/cli/dstack/apply.md) command.

<div class="termy">

```shell
$ dstack apply -f examples/llms/deepseek/sglang/amd/.dstack.yml

 #  BACKEND  REGION   RESOURCES                        SPOT  PRICE
 1  runpod   EU-RO-1  24xCPU, 283GB, 1xMI300X (192GB)  no    $2.49

Submit the run deepseek-r1-amd? [y/n]: y

Provisioning...
---> 100%
```

</div>
Once the service is up, the model will be available via the OpenAI-compatible endpoint
at `<dstack server URL>/proxy/models/<project name>/`.

<div class="termy">

```shell
curl http://127.0.0.1:3000/proxy/models/main/chat/completions \
    -X POST \
    -H 'Authorization: Bearer &lt;dstack token&gt;' \
    -H 'Content-Type: application/json' \
    -d '{
      "model": "deepseek-ai/DeepSeek-R1-Distill-Llama-70B",
      "messages": [
        {
          "role": "system",
          "content": "You are a helpful assistant."
        },
        {
          "role": "user",
          "content": "What is Deep Learning?"
        }
      ],
      "stream": true,
      "max_tokens": 512
    }'
```

</div>
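The same request can also be issued from Python. Below is a minimal sketch using only the standard library; the server URL (`http://127.0.0.1:3000`), project name (`main`), and token are placeholders mirroring the curl example, and `build_chat_request` is a helper introduced here for illustration, not part of `dstack` or SGLang.

```python
import json
import urllib.request


def build_chat_request(server_url, project, token, model, prompt):
    """Build an OpenAI-compatible chat completion request for the dstack proxy.

    The URL follows the pattern shown above:
    <dstack server URL>/proxy/models/<project name>/chat/completions.
    """
    url = f"{server_url}/proxy/models/{project}/chat/completions"
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    body = {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
        # Set "stream": True for incremental output, as in the curl example;
        # a streaming response would then need to be read as server-sent events.
        "stream": False,
        "max_tokens": 512,
    }
    return urllib.request.Request(
        url, data=json.dumps(body).encode(), headers=headers, method="POST"
    )


# Sending the request requires a running service and a valid dstack token:
# req = build_chat_request("http://127.0.0.1:3000", "main", "<dstack token>",
#                          "deepseek-ai/DeepSeek-R1-Distill-Llama-70B",
#                          "What is Deep Learning?")
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

Any OpenAI-compatible client (for example, the official `openai` Python SDK pointed at this base URL) would work the same way.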
When a [gateway](https://dstack.ai/docs/concepts/gateways.md) is configured, the OpenAI-compatible endpoint
is available at `https://gateway.<gateway domain>/`.
## Source code

The source code of this example can be found in
[`examples/llms/deepseek/sglang` :material-arrow-top-right-thin:{ .external }](https://github.com/dstackai/dstack/blob/master/examples/llms/deepseek/sglang){:target="_blank"}.

## What's next?

1. Check [services](https://dstack.ai/docs/services).
2. Browse the [SGLang DeepSeek Usage](https://docs.sglang.ai/references/deepseek.html) guide and [Supercharge DeepSeek-R1 Inference on AMD Instinct MI300X](https://rocm.blogs.amd.com/artificial-intelligence/DeepSeekR1-Part2/README.html).

mkdocs.yml

Lines changed: 1 addition & 0 deletions

@@ -268,6 +268,7 @@ nav:
     - vLLM: examples/deployment/vllm/index.md
     - TGI: examples/deployment/tgi/index.md
     - NIM: examples/deployment/nim/index.md
+    - SGLang: examples/deployment/sglang/index.md
   - Fine-tuning:
     - Axolotl: examples/fine-tuning/axolotl/index.md
     - TRL: examples/fine-tuning/trl/index.md
