Commit 25506e8
Upgrade vLLM to v0.11.2 (#273)
* Upgrade vLLM to v0.11.2
Updated configs:
- Use FP8 kv-cache for GPT-OSS B200.
- Remove "custom_ops" from compilation-config for GPT-OSS.
- Remove "cudagraph_mode" from compilation-config for GPT-OSS.
- Remove VLLM_FLASHINFER_ALLREDUCE_FUSION_THRESHOLDS_MB env var for GPT-OSS.
- Remove deprecated "--disable-log-requests" flag.
- Rename "cuda-graph-sizes" flag.
Signed-off-by: Po-Han Huang <pohanh@nvidia.com>
* make cw runners container writable
* undo make cw runners container writable
* coreweave cleanup
* coreweave cleanup pt 2
---------
Signed-off-by: Po-Han Huang <pohanh@nvidia.com>
Co-authored-by: Po-Han Huang <pohanh@nvidia.com>
Co-authored-by: Cameron Quilici <cjquilici@gmail.com>
1 parent 93e1b3c commit 25506e8
6 files changed
Lines changed: 16 additions & 19 deletions
File tree
- .github/configs
- benchmarks
- runners
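The two flag-level changes in the commit message (dropping the deprecated "--disable-log-requests" flag and renaming "cuda-graph-sizes") can be illustrated with a small argument-list migration. This is only a sketch, not the code from this commit: the helper `migrate_args` is hypothetical, and the replacement flag name is passed in by the caller because the commit message does not say what "cuda-graph-sizes" was renamed to.

```python
from typing import Dict, List, Set

# Flag named in the commit message as deprecated and removed.
# It is a boolean switch, so it carries no value that must be dropped with it.
REMOVED_FLAGS: Set[str] = {"--disable-log-requests"}


def migrate_args(args: List[str], renamed: Dict[str, str]) -> List[str]:
    """Return a copy of `args` with removed flags dropped and renamed flags rewritten.

    `renamed` maps an old flag name to its new name. The new names are an
    assumption supplied by the caller; check them against the documentation
    of the vLLM version you are targeting.
    """
    migrated: List[str] = []
    for arg in args:
        # Handle both "--flag value" and "--flag=value" spellings.
        name, sep, value = arg.partition("=")
        if name in REMOVED_FLAGS:
            continue
        if name in renamed:
            migrated.append(renamed[name] + sep + value)
        else:
            migrated.append(arg)
    return migrated


if __name__ == "__main__":
    old_args = ["--disable-log-requests", "--cuda-graph-sizes", "1", "2", "4"]
    # "--cudagraph-capture-sizes" is an assumed replacement name, not taken from this commit.
    print(migrate_args(old_args, {"--cuda-graph-sizes": "--cudagraph-capture-sizes"}))
```

Values that follow a renamed flag pass through untouched, so only the flag token itself changes.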