Commit f8a1f5e

cquil11 and claude committed
test: DSv4-Pro B300 vLLM agentic - drop --max-model-len
Drop the explicit --max-model-len so vLLM picks up DSv4-Pro's full native context window. The previous value of 1,000,000 was already above the trace dataset's max (937K), so behavior is unchanged for this particular dataset; the cleanup just makes the launcher consistent with the Kimi B200/B300 sister launchers and removes the synthetic cap from the server config entirely.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
1 parent 231af83 commit f8a1f5e

1 file changed

Lines changed: 0 additions & 4 deletions

benchmarks/single_node/agentic/dsv4_fp4_b300_vllm.sh

@@ -31,9 +31,6 @@ ADVANCE_MIN=${ADVANCE_MIN:-0.0}
 ADVANCE_MAX=${ADVANCE_MAX:-0.7}
 EP_SIZE=${EP_SIZE:-1}
 DP_ATTENTION=${DP_ATTENTION:-false}
-if [ -z "${MAX_MODEL_LEN:-}" ] || [ "$MAX_MODEL_LEN" = "0" ]; then
-MAX_MODEL_LEN=1000000
-fi
 
 if [[ -n "${SLURM_JOB_ID:-}" ]]; then
 echo "JOB $SLURM_JOB_ID running on ${SLURMD_NODENAME:-unknown}"
@@ -136,7 +133,6 @@ vllm serve "$MODEL" \
 --reasoning-parser deepseek_v4 \
 --enable-prefix-caching \
 --no-disable-hybrid-kv-cache-manager \
---max-model-len "$MAX_MODEL_LEN" \
 --max-num-seqs "$PER_ENGINE_MAX_NUM_SEQS" \
 $OFFLOAD_ARGS > "$SERVER_LOG" 2>&1 &
 SERVER_PID=$!
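
For comparison, the removed block treated an unset or "0" MAX_MODEL_LEN as "use 1,000,000". The sketch below is a hypothetical alternative, not something this commit adds: it keeps an opt-in override but passes no flag by default, so vLLM would derive the context length from the model's own config. The helper name max_len_args is invented for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of this commit): print "--max-model-len N"
# only when a non-empty, non-zero cap is given. Print nothing otherwise,
# so that no cap is passed to the server.
max_len_args() {
    if [ -n "${1:-}" ] && [ "${1}" != "0" ]; then
        printf '%s %s' '--max-model-len' "$1"
    fi
}

# Dry run: show the flags that would be appended to `vllm serve`.
echo "extra flags: $(max_len_args "${MAX_MODEL_LEN:-}")"
```

In a launcher, the unquoted expansion $(max_len_args "${MAX_MODEL_LEN:-}") would split into two words on the `vllm serve` command line, the same way the script already passes $OFFLOAD_ARGS.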