Commit f601cfe

Add GB200 FP8 8k prefill batch limit
1 parent 7d861f6 commit f601cfe

1 file changed

Lines changed: 1 addition & 0 deletions

benchmarks/multi_node/srt-slurm-recipes/vllm/minimax-m2.5-gb200-fp8/8k1k/disagg-gb200-1p1d-tp4ep.yaml

@@ -44,6 +44,7 @@ backend:
 tensor-parallel-size: 1
 pipeline-parallel-size: 1
 data-parallel-size: 2
+max-num-batched-tokens: 16384
 data-parallel-rpc-port: 13346
 enable-expert-parallel: true
 safetensors-load-strategy: "prefetch"
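
Reviewer note: a rough sanity check on the new value, not part of the commit. It assumes "8k" in the recipe name means 8192 input tokens per request, and it treats max-num-batched-tokens as the per-step token budget of the vLLM scheduler. The helper name full_prompts_per_batch is hypothetical, and the calculation ignores chunked prefill, which can split a prompt across steps.

```python
def full_prompts_per_batch(max_num_batched_tokens: int, prompt_len: int) -> int:
    """Count how many whole prompts of prompt_len tokens fit in one token budget."""
    if prompt_len <= 0:
        raise ValueError("prompt_len must be positive")
    return max_num_batched_tokens // prompt_len


MAX_NUM_BATCHED_TOKENS = 16384   # value added by this commit
ASSUMED_PROMPT_LEN = 8 * 1024    # assumption: "8k" means 8192 input tokens

if __name__ == "__main__":
    n = full_prompts_per_batch(MAX_NUM_BATCHED_TOKENS, ASSUMED_PROMPT_LEN)
    print(f"whole 8k prompts per prefill batch: {n}")
```

Under these assumptions the limit admits two full 8k prompts per prefill step on each engine; a prompt longer than 8192 tokens would leave room for only one.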
