1 parent 44aec22 commit 1f98965
1 file changed
benchmarks/dsr1_fp4_mi355x_docker.sh
@@ -30,7 +30,9 @@ python3 -m sglang.launch_server --model-path=$MODEL --trust-remote-code \
 --disable-radix-cache \
 --num-continuous-decode-steps=4 \
 --max-prefill-tokens=$PREFILL_SIZE \
---cuda-graph-max-bs=128 > $SERVER_LOG 2>&1 &
+--cuda-graph-max-bs=128 \
+--attention-backend aiter \
+--kv-cache-dtype fp8_e4m3 > $SERVER_LOG 2>&1 &

 SERVER_PID=$!
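
For readers who want to see the net effect of this hunk, the following is a minimal dry-run sketch of the launch command as it reads after the change. It only prints the command and does not start a server. The values given to `MODEL`, `PREFILL_SIZE`, and `SERVER_LOG` are placeholders, the `launch_args` helper is hypothetical, and any flags on lines of the script that this hunk does not show are left out rather than guessed.

```shell
#!/bin/sh
# Dry-run sketch: prints the post-commit launch command, does not run it.
# Placeholder values (assumptions, not the benchmark's real settings).
MODEL="${MODEL:-example/model}"
PREFILL_SIZE="${PREFILL_SIZE:-8192}"
SERVER_LOG="${SERVER_LOG:-server.log}"

# Hypothetical helper: emits one argument per line.
# Only flags visible in the diff hunk are listed here.
launch_args() {
  printf '%s\n' \
    "--model-path=$MODEL" \
    "--trust-remote-code" \
    "--disable-radix-cache" \
    "--num-continuous-decode-steps=4" \
    "--max-prefill-tokens=$PREFILL_SIZE" \
    "--cuda-graph-max-bs=128" \
    "--attention-backend" "aiter" \
    "--kv-cache-dtype" "fp8_e4m3"
}

# Join the arguments with spaces and show where the redirection now sits:
# after the last flag, since the two new flags were appended to the command.
printf 'python3 -m sglang.launch_server %s> %s 2>&1 &\n' \
  "$(launch_args | tr '\n' ' ')" "$SERVER_LOG"
```

The point the sketch illustrates is that the output redirection and the trailing `&` had to move from the `--cuda-graph-max-bs` line to the new final line, `--kv-cache-dtype fp8_e4m3`, so that the two added flags remain part of the same command.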