KLAUD_DEBUG.md: 4 additions & 4 deletions
@@ -66,7 +66,7 @@ Seen on: #1460 (dsv4-fp8-h200-sglang+mtp).
## 4. Upstream sglang v0.5.12 B300 regressions
-
Three distinct upstream regressions on NVIDIA B300 (Blackwell Ultra, `sm_103` — compute capability 10.3) shipped in `lmsysorg/sglang:v0.5.12-cu130`. (sm_120 is for *consumer* Blackwell / RTX 50 series, not B300 — don't propagate that.)
+
Two distinct upstream regressions on NVIDIA B300 (Blackwell, `sm_120`) shipped in `lmsysorg/sglang:v0.5.12-cu130`:
### 4a. DeepGemm TMA-descriptor crash (GLM-5-FP8)
**Symptom:** CUDA graph capture aborts with `CUDA_ERROR_ILLEGAL_ADDRESS (700)` at `/deepgemm/csrc/.../runtime_utils.hpp:143` on the **first batch size** for **every TP rank**. Server never serves a prompt.
@@ -86,17 +86,17 @@ Filed upstream: sgl-project/sglang#25551. Seen on #1421.
2. Comment out the MTP/EAGLE scenarios on B300 in the recipe.
3. Pin to v0.5.11-cu130.
-
Filed upstream: sgl-project/sglang#25563. Seen on #1420.
B300 is `sm_103` (compute capability 10.3, Blackwell Ultra), which is *nominally inside* the asserted `sm_100..sm_110f` range, yet the assertion still fires. Best guess is that the cute kernel's `Arch.sm_110f` set only matches the architecture-specific feature-flag variants it was compiled for (e.g. `sm_100`, `sm_100f`, `sm_110`, `sm_110f`) and that `sm_103` / `sm_103a` is not in that explicit list. Server never becomes healthy; warmup times out at 600s.
+
B300 is `sm_120`, outside the asserted range. Server never becomes healthy; warmup times out at 600s.
-
**Fix:** Needs an sglang image with `flash_attn` that recognises `sm_103` / `sm_103a`; no local workaround. Pin to `v0.5.11-cu130` in the meantime.
+
**Fix:** Needs sglang image with flash_attn supporting `sm_120`; no local workaround. Pin to v0.5.11-cu130 in the meantime.
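
The `sm_103` wording in this hunk guesses that the architecture check matches an explicit list of compiled targets rather than a numeric range. The Python sketch below only illustrates that distinction. `COMPILED_TARGETS`, `sm_name`, `in_numeric_range`, and `in_compiled_set` are hypothetical names invented for this example, not identifiers from sglang, flash_attn, or CuTe, and the real check may work differently.

```python
# Hypothetical illustration only: none of these names come from sglang,
# flash_attn, or CuTe. The target list mirrors the examples in the note.
COMPILED_TARGETS = frozenset({"sm_100", "sm_100f", "sm_110", "sm_110f"})


def sm_name(major: int, minor: int) -> str:
    """Build a target name from a compute capability, e.g. (10, 3) -> 'sm_103'."""
    return f"sm_{major}{minor}"


def arch_number(target: str) -> int:
    """Extract the numeric part of a target name, e.g. 'sm_103a' -> 103."""
    return int("".join(ch for ch in target if ch.isdigit()))


def in_numeric_range(target: str, low: int = 100, high: int = 110) -> bool:
    """Range-style check: true for any target numerically within [low, high]."""
    return low <= arch_number(target) <= high


def in_compiled_set(target: str) -> bool:
    """Membership-style check: true only for explicitly listed targets."""
    return target in COMPILED_TARGETS


def describe(target: str) -> str:
    """Summarise both checks for one target."""
    return (
        f"{target}: range={in_numeric_range(target)}, "
        f"set={in_compiled_set(target)}"
    )


if __name__ == "__main__":
    for target in (sm_name(10, 0), sm_name(10, 3), "sm_103a", sm_name(12, 0)):
        print(describe(target))
```

Under these assumptions, `sm_103` passes the range-style check but fails the membership-style check, which would be consistent with an assertion firing for an architecture that looks in range. `sm_120` fails both.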