Merged
Changes from 2 commits
4 changes: 2 additions & 2 deletions .github/configs/nvidia-master.yaml
@@ -2146,7 +2146,7 @@ qwen3.5-fp8-b200-sglang-agentic:
- { tp: 8, ep: 1, offloading: none, conc-list: [1, 2, 4, 8, 16, 32] }

qwen3.5-fp4-b200-sglang:
-  image: lmsysorg/sglang:nightly-dev-20260422-de962f32
+  image: lmsysorg/sglang:v0.5.12-cu130
model: nvidia/Qwen3.5-397B-A17B-NVFP4
model-prefix: qwen3.5
runner: b200
@@ -2167,7 +2167,7 @@ qwen3.5-fp4-b200-sglang:
- { tp: 2, ep: 1, conc-start: 4, conc-end: 128 }

qwen3.5-fp4-b200-sglang-mtp:
-  image: lmsysorg/sglang:nightly-dev-20260422-de962f32
+  image: lmsysorg/sglang:v0.5.12-cu130
model: nvidia/Qwen3.5-397B-A17B-NVFP4
model-prefix: qwen3.5
runner: b200
7 changes: 7 additions & 0 deletions perf-changelog.yaml
@@ -2653,3 +2653,10 @@
description:
- "Update SGLang image from v0.5.9-cu129-amd64 (74d old) to v0.5.12-cu130"
pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1458

- config-keys:
- qwen3.5-fp4-b200-sglang
- qwen3.5-fp4-b200-sglang-mtp
description:
- "Update SGLang image from nightly-dev-20260422-de962f32 (17d/13d old) to v0.5.12-cu130"
pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1474