Commit 9c98d8e

[NV] llm-d: move --moe-backend out of server.sh into per-recipe extra-args
Signed-off-by: Ezra Silvera <ezra@il.ibm.com>
1 parent 27d50fc commit 9c98d8e

3 files changed

Lines changed: 7 additions & 1 deletion

benchmarks/multi_node/llm-d-recipes/dsr1-fp8-h200-1p1d-simple.yaml

Lines changed: 2 additions & 0 deletions
@@ -99,6 +99,7 @@ prefill:
  --block-size 256
  --no-enable-prefix-caching
  --enforce-eager
+ --moe-backend deep_gemm
  env: {}
 
  decode:
@@ -110,6 +111,7 @@ decode:
  --max-model-len 16384
  --block-size 256
  --no-enable-prefix-caching
+ --moe-backend deep_gemm
  env: {}
 
  # ---- SLURM resource directives ----

benchmarks/multi_node/llm-d-recipes/gptoss-fp4-h200-1p1d-simple.yaml

Lines changed: 2 additions & 0 deletions
@@ -83,6 +83,7 @@ prefill:
  --max-model-len 16384
  --block-size 256
  --no-enable-prefix-caching
+ --moe-backend triton
  env: {}
 
  decode:
@@ -93,6 +94,7 @@ decode:
  --max-model-len 16384
  --block-size 256
  --no-enable-prefix-caching
+ --moe-backend triton
  env: {}
 
  # ---- SLURM resource directives ----

benchmarks/multi_node/llm-d/server.sh

Lines changed: 3 additions & 1 deletion
@@ -155,8 +155,10 @@ COMMON_ARGS=(
  --tensor-parallel-size "$TP_SIZE"
  --data-parallel-size "$DP_SIZE"
  --kv_transfer_config "$KV_TRANSFER_CONFIG"
- --moe-backend deep_gemm
  )
+ # --moe-backend is model-specific (DSR1-FP8 wants deep_gemm, gpt-oss-MXFP4
+ # rejects it - see vllm/.../oracle/mxfp4.py:163), so each recipe sets its
+ # own value via prefill/decode extra-args instead of inheriting one here.
 
  if [[ "$LWS_GROUP_SIZE" -gt 1 ]]; then
  COMMON_ARGS+=(
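
The comment added to server.sh says each recipe now supplies its own --moe-backend through the prefill/decode extra-args rather than inheriting one from COMMON_ARGS. The sketch below illustrates that pattern only. It is not the actual server.sh code: the function name build_args, the ARGS array, and the way the extra-args string reaches the script are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: build_args and ARGS are illustrative names, not taken
# from server.sh. Shared flags carry no --moe-backend; the recipe's
# extra-args string supplies it.

# build_args EXTRA_ARGS_STRING
# Fills the global array ARGS with the shared flags followed by the
# whitespace-separated words of the recipe's extra-args string.
build_args() {
  local extra="$1"
  local -a extra_words=()

  ARGS=(
    --tensor-parallel-size "${TP_SIZE:-1}"
    --data-parallel-size "${DP_SIZE:-1}"
  )

  # Split on spaces, tabs and newlines, so a multi-line YAML block works too.
  # read returns non-zero at end of input with -d '', hence the "|| true".
  read -r -d '' -a extra_words <<< "$extra" || true
  ARGS+=("${extra_words[@]}")
}

# Example: a gpt-oss style recipe passes triton, a DSR1-FP8 recipe would
# pass deep_gemm in the same position.
build_args "--block-size 256 --moe-backend triton"
printf '%s\n' "${ARGS[@]}"
```

Keeping the backend out of the shared array means a recipe for a model that rejects deep_gemm never has to override or strip an inherited flag.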
