add dispatch token clamp (>=256) and run benchmark+eval at conc-64
Clamp MORI_MAX_DISPATCH_TOKENS_DECODE to a minimum of 256 when DP and EP
are both enabled, preventing SGLang's low-latency All2All kernel from
being selected. That kernel silently corrupts outputs at small buffer
sizes.
Run A of A/B test: benchmark + eval WITH clamp on conc-64 DEP8+MTP3.
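
For illustration only, the clamp described above could be sketched as follows. This is not the code from this commit: the helper name clamp_dispatch_tokens and the dp_enabled / ep_enabled flags are hypothetical, and leaving an unset variable untouched is an assumption. Only the variable name and the floor of 256 come from the message.

```python
from typing import MutableMapping, Optional

ENV_VAR = "MORI_MAX_DISPATCH_TOKENS_DECODE"  # name taken from the commit message
MIN_DISPATCH_TOKENS = 256  # floor stated in the commit message


def clamp_dispatch_tokens(
    env: MutableMapping[str, str], dp_enabled: bool, ep_enabled: bool
) -> Optional[int]:
    """Raise the configured value to the floor when DP and EP are both enabled.

    Hypothetical helper. Returns the effective value, or None if the
    variable is unset (assumption: an unset variable is left alone).
    """
    raw = env.get(ENV_VAR)
    if raw is None:
        return None
    value = int(raw)
    if dp_enabled and ep_enabled:
        # Only the DP+EP combination triggers the clamp.
        value = max(value, MIN_DISPATCH_TOKENS)
        env[ENV_VAR] = str(value)
    return value
```

Passing os.environ as env would apply the clamp to the current process environment before the server is launched; values already at or above 256 are left unchanged.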