Commit 94fdfa4

Update server args in bench speedup (#246)
* update sglang args

* Update benchmarks/bench_model_speedup.py

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

---------

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
1 parent 3f31f5e commit 94fdfa4

1 file changed

Lines changed: 5 additions & 2 deletions

benchmarks/bench_model_speedup.py

@@ -198,8 +198,11 @@ def launch_sglang_server(
     if server_args.trust_remote_code:
         sglang_args.extend(["--trust-remote-code"])
 
-    if server_args.enable_ep_moe:
-        sglang_args.extend(["--enable-ep-moe"])
+    if server_args.disable_radix_cache:
+        sglang_args.extend(["--disable-radix-cache"])
+
+    if server_args.ep_size:
+        sglang_args.extend(["--ep-size", str(server_args.ep_size)])
 
     if server_args.attention_backend:
         sglang_args.extend(["--attention-backend", server_args.attention_backend])
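
A minimal, self-contained sketch of the flag-building pattern this hunk touches, for readers who want to try it outside the benchmark. The `ServerArgs` dataclass and the `build_sglang_flags` helper are hypothetical stand-ins, not names from `bench_model_speedup.py`; only the four attribute names and flag strings are taken from the diff. Note that the truthiness check on `ep_size` means a value of `0` or `None` emits no flag at all.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ServerArgs:
    """Hypothetical stand-in for the benchmark's server argument object."""
    trust_remote_code: bool = False
    disable_radix_cache: bool = False
    ep_size: Optional[int] = None
    attention_backend: Optional[str] = None


def build_sglang_flags(server_args: ServerArgs) -> List[str]:
    """Translate set fields into CLI flags, mirroring the checks in the diff."""
    sglang_args: List[str] = []

    # Boolean switches: emitted only when the field is true.
    if server_args.trust_remote_code:
        sglang_args.extend(["--trust-remote-code"])
    if server_args.disable_radix_cache:
        sglang_args.extend(["--disable-radix-cache"])

    # Valued options: emitted only when the field is truthy, so an
    # ep_size of 0 or None is skipped. The value must be a string
    # because the list is later passed to a subprocess.
    if server_args.ep_size:
        sglang_args.extend(["--ep-size", str(server_args.ep_size)])
    if server_args.attention_backend:
        sglang_args.extend(["--attention-backend", server_args.attention_backend])

    return sglang_args


if __name__ == "__main__":
    print(build_sglang_flags(ServerArgs(disable_radix_cache=True, ep_size=8)))
```

Converting `ep_size` with `str()` matters because every element of an argument list handed to `subprocess` must be a string or path-like object; an `int` raises `TypeError` at launch time.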
