revert: drop MAX_MODEL_LEN cap from Kimi H100/H200 launchers
Per the agentic benchmark design, context must not be capped. Reverts
the H100 cap (MAX=16K + gpu-mem 0.85) and the H200 cap (MAX=131K); runs
go back to passing no --max-model-len flag at all (vLLM uses the model's
native context).
Any OOM / KV-init failures will be diagnosed separately rather than
sidestepped via a cap.
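
For illustration, a minimal sketch of what an uncapped launcher invocation could look like. The MODEL value and the build_vllm_args helper are hypothetical placeholders, not taken from the actual launchers; the only point shown is that no --max-model-len argument is passed, so vLLM derives the context length from the model config.

```shell
#!/usr/bin/env bash
# Sketch only: MODEL and build_vllm_args are hypothetical names,
# not the real launcher contents.
MODEL="${MODEL:-example-org/kimi-model}"

build_vllm_args() {
  # Deliberately no --max-model-len here: with the flag omitted,
  # vLLM falls back to the context length in the model config.
  local args=(serve "$MODEL")
  printf '%s\n' "${args[@]}"
}

# Print the argument list one per line; a real launcher would
# instead run: vllm "${args[@]}"
build_vllm_args
```

Printing the arguments rather than executing them keeps the sketch safe to run on a machine without GPUs or vLLM installed.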
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>