Commit 139f646

revert clamp for Run B (without fix) benchmark+eval at conc-64
Run B of the A/B test: benchmark + eval WITHOUT the dispatch token clamp. MORI_MAX_DISPATCH_TOKENS_DECODE will be 32 (below the 256 threshold). Expected: corrupted output, inflated AL, ~0% gsm8k.
1 parent 9983cc0 commit 139f646

2 files changed

Lines changed: 1 addition & 10 deletions

benchmarks/multi_node/amd_utils/server_sglang.sh

Lines changed: 0 additions & 9 deletions
@@ -248,15 +248,6 @@ if [[ "$DECODE_MTP_SIZE" -gt 0 ]]; then
     MORI_MOE_MAX_INPUT_TOKENS_DECODE=$((MORI_MOE_MAX_INPUT_TOKENS_DECODE * (DECODE_MTP_SIZE + 1)))
 fi
 
-# Clamp dispatch tokens to >= 256 to avoid the low-latency All2All kernel
-# variant in MoRI which silently corrupts outputs at small buffer sizes.
-if [[ "$DECODE_ENABLE_DP" == "true" ]] && [[ "$DECODE_ENABLE_EP" == "true" ]]; then
-    if [[ $MORI_MAX_DISPATCH_TOKENS_DECODE -lt 256 ]]; then
-        echo "[WARN] Clamping MORI_MAX_DISPATCH_TOKENS_DECODE from $MORI_MAX_DISPATCH_TOKENS_DECODE to 256 (All2All kernel threshold)"
-        MORI_MAX_DISPATCH_TOKENS_DECODE=256
-    fi
-fi
-
 # =============================================================================
 # Cluster Topology Configuration
 # =============================================================================
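
For reference, the behaviour removed by this hunk can be sketched as a standalone function, which makes the difference between Run A (clamp applied) and Run B (this commit, no clamp) easy to check in isolation. This is only an illustrative sketch: `clamp_dispatch_tokens` and its argument order are hypothetical and do not exist in `server_sglang.sh`; the 256 threshold, the DP/EP condition, and the value 32 are taken from the diff and the commit message.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of server_sglang.sh): reproduces the clamp
# that this commit removes, so its effect can be inspected on its own.

# Usage: clamp_dispatch_tokens TOKENS ENABLE_DP ENABLE_EP
# Prints the effective dispatch token count on stdout; the warning goes to stderr.
clamp_dispatch_tokens() {
    local tokens="$1"
    local enable_dp="$2"
    local enable_ep="$3"
    local threshold=256  # threshold used by the removed code

    # The clamp only applies when both DP and EP are enabled for decode.
    if [ "$enable_dp" = "true" ] && [ "$enable_ep" = "true" ] \
        && [ "$tokens" -lt "$threshold" ]; then
        echo "[WARN] Clamping MORI_MAX_DISPATCH_TOKENS_DECODE from $tokens to $threshold" >&2
        tokens="$threshold"
    fi
    echo "$tokens"
}

# Run A (with clamp): 32 is raised to the threshold.
run_a=$(clamp_dispatch_tokens 32 true true)
# Run B (this commit): the clamp is gone, so the value is used unchanged.
run_b=32
echo "Run A: $run_a, Run B: $run_b"
```

With the clamp removed, the script passes the unclamped value through, which is the condition Run B is meant to exercise.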

perf-changelog.yaml

Lines changed: 1 addition & 1 deletion
@@ -3448,5 +3448,5 @@
 - config-keys:
     - dsr1-fp4-mi355x-sglang-disagg-8k1k-mtp
   description:
-    - "Throwaway: conc-64 DEP8+MTP3 benchmark+eval WITH dispatch token clamp (MORI_MAX_DISPATCH_TOKENS_DECODE >= 256). A/B test for All2All kernel corruption fix."
+    - "Throwaway: conc-64 DEP8+MTP3 benchmark+eval WITHOUT dispatch token clamp (Run B of A/B test). Expect corrupted output / 0pct gsm8k."
   pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1659
