Commit 7da989d
Upgrade vLLM to v0.11.2
Updated configs:
- Use FP8 kv-cache for GPT-OSS B200.
- Remove "custom_ops" from compilation-config for GPT-OSS.
- Remove "cudagraph_mode" from compilation-config for GPT-OSS.
- Remove VLLM_FLASHINFER_ALLREDUCE_FUSION_THRESHOLDS_MB env var for
GPT-OSS.
- Remove deprecated "--disable-log-requests" flag.
- Rename "cuda-graph-sizes" flag.
Signed-off-by: Po-Han Huang <pohanh@nvidia.com>1 parent 343d193 commit 7da989d
6 files changed
Lines changed: 16 additions & 19 deletions
File tree
- .github
- configs
- workflows
- benchmarks
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
202 | 202 | | |
203 | 203 | | |
204 | 204 | | |
205 | | - | |
| 205 | + | |
206 | 206 | | |
207 | 207 | | |
208 | 208 | | |
| |||
232 | 232 | | |
233 | 233 | | |
234 | 234 | | |
235 | | - | |
| 235 | + | |
236 | 236 | | |
237 | 237 | | |
238 | 238 | | |
| |||
290 | 290 | | |
291 | 291 | | |
292 | 292 | | |
293 | | - | |
| 293 | + | |
294 | 294 | | |
295 | 295 | | |
296 | 296 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
115 | 115 | | |
116 | 116 | | |
117 | 117 | | |
118 | | - | |
| 118 | + | |
119 | 119 | | |
120 | 120 | | |
121 | 121 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
29 | 29 | | |
30 | 30 | | |
31 | 31 | | |
32 | | - | |
| 32 | + | |
| 33 | + | |
33 | 34 | | |
34 | 35 | | |
35 | | - | |
| 36 | + | |
36 | 37 | | |
37 | 38 | | |
38 | 39 | | |
39 | 40 | | |
40 | 41 | | |
41 | | - | |
42 | 42 | | |
43 | 43 | | |
44 | 44 | | |
| |||
47 | 47 | | |
48 | 48 | | |
49 | 49 | | |
50 | | - | |
| 50 | + | |
51 | 51 | | |
52 | 52 | | |
53 | 53 | | |
| |||
69 | 69 | | |
70 | 70 | | |
71 | 71 | | |
72 | | - | |
| 72 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
12 | 12 | | |
13 | 13 | | |
14 | 14 | | |
15 | | - | |
16 | 15 | | |
17 | 16 | | |
18 | | - | |
| 17 | + | |
19 | 18 | | |
20 | 19 | | |
21 | 20 | | |
| |||
29 | 28 | | |
30 | 29 | | |
31 | 30 | | |
32 | | - | |
| 31 | + | |
33 | 32 | | |
34 | 33 | | |
35 | 34 | | |
| |||
51 | 50 | | |
52 | 51 | | |
53 | 52 | | |
54 | | - | |
| 53 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
13 | 13 | | |
14 | 14 | | |
15 | 15 | | |
16 | | - | |
17 | 16 | | |
18 | 17 | | |
19 | | - | |
| 18 | + | |
20 | 19 | | |
21 | 20 | | |
22 | 21 | | |
| |||
30 | 29 | | |
31 | 30 | | |
32 | 31 | | |
33 | | - | |
| 32 | + | |
34 | 33 | | |
35 | 34 | | |
36 | 35 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
27 | 27 | | |
28 | 28 | | |
29 | 29 | | |
30 | | - | |
31 | 30 | | |
32 | 31 | | |
33 | | - | |
| 32 | + | |
34 | 33 | | |
35 | 34 | | |
36 | 35 | | |
| |||
42 | 41 | | |
43 | 42 | | |
44 | 43 | | |
45 | | - | |
| 44 | + | |
46 | 45 | | |
47 | 46 | | |
48 | 47 | | |
| |||
0 commit comments