[GB300][SGLang] Enable W4A4 megamoe and bump SGLang image for dsv4-fp4-gb300-dynamo-sglang (#1382)
* [GB300][SGLang] Enable W4A4 megamoe and bump SGLang image for dsv4-fp4-gb300-dynamo-sglang
- Append SGLANG_OPT_DEEPGEMM_MEGA_MOE_USE_FP4_ACTS=1 and
SGLANG_OPT_DEEPGEMM_MEGA_MOE_USE_MXF4_KIND=1 wherever
SGLANG_OPT_USE_DEEPGEMM_MEGA_MOE is set in the gb300 non-mtp recipes.
- Update SGLang container image from
lmsysorg/sglang-staging:deepseek-v4-grace-blackwell-dev to
lmsysorg/sglang:nightly-dev-cu13-20260514-f7efff32.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Update perf-changelog.yaml
* Update perf-changelog.yaml
* Add custom_tokenizer to dsv4 non-MTP recipes for nightly image compatibility
The transformers version in the new nightly image does not recognize the
deepseek_v4 model type, so benchmark_serving.py crashes while loading the
tokenizer.
* Update perf-changelog.yaml
* Enable W4A4 megamoe FP4-acts/MXF4-kind opts on GB300 disagg recipes
Adds SGLANG_OPT_DEEPGEMM_MEGA_MOE_USE_FP4_ACTS=1 and
SGLANG_OPT_DEEPGEMM_MEGA_MOE_USE_MXF4_KIND=1 to the prefill/decode env
blocks of the 5 GB300 disagg recipes that run with
moe-a2a-backend: megamoe.
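
The rule described above (add the two W4A4 flags wherever the megamoe flag is already set) can be sketched in Python. This is only an illustration: the three environment variable names come from this commit message, but the `env_blocks` dict layout, the `enable_w4a4` helper, and the `other` block are hypothetical and do not reflect the real recipe schema.

```python
from copy import deepcopy

# Flag names are taken from the commit message.
MEGA_MOE_FLAG = "SGLANG_OPT_USE_DEEPGEMM_MEGA_MOE"
W4A4_FLAGS = {
    "SGLANG_OPT_DEEPGEMM_MEGA_MOE_USE_FP4_ACTS": "1",
    "SGLANG_OPT_DEEPGEMM_MEGA_MOE_USE_MXF4_KIND": "1",
}


def enable_w4a4(env_blocks):
    """Return a copy of env_blocks with the W4A4 flags added to every
    block that already sets the megamoe flag (hypothetical helper)."""
    updated = deepcopy(env_blocks)
    for env in updated.values():
        if MEGA_MOE_FLAG in env:
            env.update(W4A4_FLAGS)
    return updated


# Hypothetical env blocks; the real recipes are YAML files with their own layout.
example = {
    "prefill": {MEGA_MOE_FLAG: "1"},
    "decode": {MEGA_MOE_FLAG: "1"},
    "other": {"SOME_UNRELATED_VAR": "1"},
}
result = enable_w4a4(example)
```

Blocks that do not set the megamoe flag are left unchanged, and the input is not mutated, which mirrors the "wherever it is set" wording of the change.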
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: yhyang201 <yhyang201@gmail.com>
benchmarks/multi_node/srt-slurm-recipes/sglang/deepseek-v4/8k1k/disagg-gb300-10p1d-dep4-dep16-14-c8192.yaml
benchmarks/multi_node/srt-slurm-recipes/sglang/deepseek-v4/8k1k/disagg-gb300-12p1d-dep4-dep12-15-c21504.yaml
benchmarks/multi_node/srt-slurm-recipes/sglang/deepseek-v4/8k1k/disagg-gb300-8p1d-dep4-dep16-12-c4096.yaml