Commit d4948f9

[Klaud Cold] qwen3.5-fp8-mi355x-atom-mtp: enable --use-chat-template (#1555)
* [Klaud Cold] qwen3.5-fp8-mi355x-atom-mtp: enable --use-chat-template

  Aligns with other Qwen MTP recipes which apply the chat template during benchmark serving.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* perf-changelog: qwen3.5-fp8-mi355x-atom-mtp chat-template entry

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

1 parent 298d8f9 commit d4948f9

2 files changed

Lines changed: 8 additions & 1 deletion

benchmarks/single_node/qwen3.5_fp8_mi355x_atom_mtp.sh

Lines changed: 2 additions & 1 deletion
@@ -71,7 +71,8 @@ run_benchmark_serving \
     --max-concurrency "$CONC" \
     --result-filename "$RESULT_FILENAME" \
     --result-dir /workspace/ \
-    --trust-remote-code
+    --trust-remote-code \
+    --use-chat-template
 
 # After throughput, run evaluation only if RUN_EVAL is true
 if [ "${RUN_EVAL}" = "true" ]; then
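
The hunk above had to touch two lines to add one flag, because the previous last argument needed a trailing backslash. As a hypothetical sketch only (not the recipe's actual code), flags can instead be collected in a bash array, so appending one never requires editing its neighbour. The names `USE_CHAT_TEMPLATE` and `bench_args` are illustrative assumptions; only `--trust-remote-code` and `--use-chat-template` come from the diff.

```shell
#!/usr/bin/env bash
# Hypothetical toggle (not present in the recipe); defaults to enabled.
USE_CHAT_TEMPLATE="${USE_CHAT_TEMPLATE:-true}"

# Collect trailing flags in an array: no line-continuation backslashes needed.
bench_args=(
    --trust-remote-code
)

# Append the chat-template flag only when the toggle is on.
if [ "$USE_CHAT_TEMPLATE" = "true" ]; then
    bench_args+=(--use-chat-template)
fi

# Print one flag per line; in a real script the array would be expanded
# at the end of the command, e.g. some_command "${bench_args[@]}".
printf '%s\n' "${bench_args[@]}"
```

Quoting the expansion as `"${bench_args[@]}"` keeps each flag a separate word, even if a later flag takes a value containing spaces.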

perf-changelog.yaml

Lines changed: 6 additions & 0 deletions
@@ -3123,3 +3123,9 @@
     - "Search-space: tp:8 ep:8 (TEP=8), conc-end 128 chosen at saturation per local sweep"
     - "Local bench: TEP=8 peaks at C=128 with 26923 tot tps (+178% vs TEP=4 peak at C=32 in May 6 j11600242 sweep)"
   pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1516
+
+- config-keys:
+    - qwen3.5-fp8-mi355x-atom-mtp
+  description:
+    - "Add --use-chat-template to run_benchmark_serving so prompts are formatted with the Qwen chat template (matching the other Qwen MTP recipes)"
+  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1555
