Commit 53df78a
fix: Kimi B300 image + H100/H200 MAX_MODEL_LEN caps
Round 1 of the marathon left 3 of 4 targets fully broken. Fixes:
- B300 (kimik2.5-fp4-b300-vllm): v0.20.2 (cu129) lacks flashinfer FP4
MoE kernels for B300's reported SM (the trtllm_fp4_block_scale_moe
asserts "Only SM 10.x and 11.x are supported"; B300 reports sm_12x).
Revert to v0.20.0-cu130, the Blackwell-targeted build that already
works for the INT4 B300 sister.
- H100 (kimik2.5_int4_h100.sh): 80 GB HBM is too tight for Kimi K2.5
INT4's native 1M-token context; vLLM bails with "No available memory
for the cache blocks" at engine init. Add MAX_MODEL_LEN=32768 default.
- H200 (kimik2.5_int4_h200.sh): same failure mode, but 141 GB HBM
allows a higher cap. Add MAX_MODEL_LEN=131072 default.
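The MAX_MODEL_LEN defaults described above might look roughly like the sketch below. The diff content is not visible here, so this is an assumption about the pattern rather than the actual change: `build_serve_cmd` and the model argument are hypothetical names, and the only parts taken from the commit message are the variable name and the 32768 value. `--max-model-len` is vLLM's flag for limiting context length.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; the real script contents are not shown in this commit view.
set -euo pipefail

# Per-GPU default context cap. An exported MAX_MODEL_LEN overrides it.
# 32768 is the H100 value from the commit message; the H200 script uses 131072.
MAX_MODEL_LEN="${MAX_MODEL_LEN:-32768}"

# Print the server command line with the cap applied.
# build_serve_cmd and its model argument are illustrative names only.
build_serve_cmd() {
  local model="$1"
  echo "vllm serve ${model} --max-model-len ${MAX_MODEL_LEN}"
}
```

Using `${VAR:-default}` keeps the cap overridable per run, so a sweep that wants a longer context on a larger GPU can export MAX_MODEL_LEN without editing the script.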
MI355X had 17/18 pass; just retrying the 1 transient failure (170s
short — looked like a vLLM worker startup race, not a real bench
failure).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
1 parent ec66535 commit 53df78a
3 files changed
Lines changed: 22 additions & 1 deletion
File tree
- .github/configs
- benchmarks/single_node/agentic
Diff hunks (line content not preserved; line ranges only):

| File location | Original lines | New lines | Change |
|---|---|---|---|
| .github/configs | 2615 | 2615-2620 | 1 line replaced by 6 |
| benchmarks/single_node/agentic (file 1 of 2) | after 18 | 19-25 | 7 lines added |
| | after 57 | 65 | 1 line added |
| benchmarks/single_node/agentic (file 2 of 2) | after 18 | 19-25 | 7 lines added |
| | after 57 | 65 | 1 line added |