Commit dfaf45e

update vLLM benchmark results for Qwen3 series Eagle3 model (#172)
1 parent 7fa71bc commit dfaf45e

5 files changed

Lines changed: 773 additions & 139 deletions

README.md

Lines changed: 130 additions & 52 deletions
@@ -594,65 +594,143 @@ Other models such as GLM-4.6, Qwen2.5, and Seed-OSS have been evaluated on bench
594594
595595
#### 2.1 Qwen3 Series Models
596596
597-
Benchmark results for Qwen3 series models with `Eagle3` speculative decoding algorithm on datasets including `MT-bench`, `HunmanEval`, `GSM8K`, and `Alpaca`:
597+
**vLLM v0.11.2 Benchmark Results**
598598
599-
<table>
600-
<thead>
601-
<tr>
602-
<th>&nbsp</th><th>&nbsp</th>
603-
<th colspan="2" style="text-align: center; vertical-align: middle;">MT-bench</th>
604-
<th colspan="2" style="text-align: center; vertical-align: middle;">HumanEval</th>
605-
<th colspan="2" style="text-align: center; vertical-align: middle;">GSM8K</th>
606-
<th colspan="2" style="text-align: center; vertical-align: middle;">Alpaca</th>
607-
<th colspan="2" style="text-align: center; vertical-align: middle;">Mean</th></tr>
608-
<tr><th>Temperature</th><th>Model</th><th>Speedup</th><th>τ</th><th>Speedup</th><th>τ</th><th>Speedup</th><th>τ</th><th>Speedup</th><th>τ</th><th>Speedup</th><th>τ</th></tr>
609-
</thead>
610-
<tbody>
611-
<!-- <tr><td colspan="12" style="text-align: center; vertical-align: middle;"><strong>Temperature=0</strong></td></tr> -->
612-
<tr><td rowspan="6"><strong>T=0</strong></td>
613-
<td>Qwen3-1.7B</td><td>2.05x</td><td>2.81</td><td>2.07x</td><td>2.93</td><td>2.11x</td><td>2.98</td><td>1.93x</td><td>2.69</td><td>2.04x</td><td>2.85</td></tr>
614-
<tr> <td>Qwen3-4B</td><td>2.21x</td><td>3.01</td><td>2.36x</td><td>3.24</td><td>2.42x</td><td>3.13</td><td>2.32x</td><td>2.75</td><td>2.33x</td><td>3.03</td></tr>
615-
<tr><td>Qwen3-8B</td><td>2.63x</td><td>3.65</td><td>2.76x</td><td>3.85</td><td>2.82x</td><td>3.90</td><td>2.62x</td><td>3.48</td><td>2.70x</td><td>3.72</td></tr>
616-
<tr><td>Qwen3-14B</td><td>2.23x</td><td>3.30</td><td>2.53x</td><td>3.74</td><td>2.56x</td><td>3.79</td><td>2.16x</td><td>3.13</td><td>2.37x</td><td>3.49</td></tr>
617-
<tr><td>Qwen3-32B</td><td>2.39x</td><td>2.78</td><td>2.37x</td><td>2.81</td><td>2.47x</td><td>2.92</td><td>2.42x</td><td>2.53</td><td>2.41x</td><td>2.76</td></tr>
618-
<tr><td>Qwen3-30B-A3B</td><td>2.84x</td><td>3.63</td><td>2.27x</td><td>3.09</td><td>2.64x</td><td>3.42</td><td>2.83x</td><td>3.56</td><td>2.64x</td><td>3.42</td></tr>
619-
<!-- <tr><td colspan="12" style="text-align: center; vertical-align: middle;"><strong>Temperature=1</strong></td></tr> -->
620-
<tr><td rowspan="6"><strong>T=1</strong></td>
621-
<td>Qwen3-1.7B</td><td>1.74x</td><td>2.53</td><td>1.86x</td><td>2.70</td><td>1.82x</td><td>2.69</td><td>1.72x</td><td>2.46</td><td>1.93x</td><td>2.60</td></tr>
622-
<tr><td>Qwen3-4B</td><td>1.93x</td><td>2.60</td><td>2.00x</td><td>2.84</td><td>2.11x</td><td>2.82</td><td>2.34x</td><td>2.50</td><td>1.75x</td><td>2.69</td></tr>
623-
<tr><td>Qwen3-8B</td><td>1.98x</td><td>2.75</td><td>2.25x</td><td>3.11</td><td>2.31x</td><td>3.15</td><td>2.10x</td><td>2.76</td><td>2.90x</td><td>2.94</td></tr>
624-
<tr><td>Qwen3-14B</td><td>1.71x</td><td>2.61</td><td>1.95x</td><td>2.87</td><td>2.04x</td><td>3.08</td><td>1.68x</td><td>2.55</td><td>2.90x</td><td>2.78</td></tr>
625-
<tr><td>Qwen3-32B</td><td>1.62x</td><td>1.91</td><td>1.71x</td><td>2.05</td><td>1.78x</td><td>2.10</td><td>1.80x</td><td>1.95</td><td>1.62x</td><td>2.00</td></tr>
626-
<tr><td>Qwen3-30B-A3B</td><td>1.91x</td><td>2.46</td><td>2.00x</td><td>2.64</td><td>1.90x</td><td>2.53</td><td>1.80x</td><td>2.32</td><td>1.90x</td><td>2.48</td></tr>
627-
</tbody>
628-
</table>
629-
630-
#### 2.2 Hunyuan Series Models
631-
632-
Benchmark results for Hunyuan series models with `Eagle3` speculative decoding algorithm on datasets including `MT-bench`, `HunmanEval`, `GSM8K`, and `Alpaca`:
599+
We report benchmark results of the Qwen3 series models using the Eagle3 speculative decoding algorithm across multiple evaluation suites, including **MT-bench**, **HumanEval**, **GSM8K**, and **Alpaca**.
600+
All experiments were conducted on a single NVIDIA H20 GPU with the configuration:
601+
**tp=1, ep=1, num_speculative_tokens=2, batch_size=1, output_len=1024**.
633602
634603
<table>
635604
<thead>
636605
<tr>
637-
<th>&nbsp</th><th>&nbsp</th>
638-
<th colspan="2" style="text-align: center; vertical-align: middle;">MT-bench</th>
639-
<th colspan="2" style="text-align: center; vertical-align: middle;">HumanEval</th>
640-
<th colspan="2" style="text-align: center; vertical-align: middle;">GSM8K</th>
641-
<th colspan="2" style="text-align: center; vertical-align: middle;">Alpaca</th>
642-
<th colspan="2" style="text-align: center; vertical-align: middle;">Mean</th></tr>
643-
<tr><th>Temperature</th><th>Model</th><th>Speedup</th><th>τ</th><th>Speedup</th><th>τ</th><th>Speedup</th><th>τ</th><th>Speedup</th><th>τ</th><th>Speedup</th><th>τ</th></tr>
606+
<th>Model</th>
607+
<th>Method</th>
608+
<th colspan="2" style="text-align:center;">GSM8K</th>
609+
<th colspan="2" style="text-align:center;">Alpaca</th>
610+
<th colspan="2" style="text-align:center;">HumanEval</th>
611+
<th colspan="2" style="text-align:center;">MT-bench</th>
612+
<th colspan="2" style="text-align:center;">Mean</th>
613+
</tr>
614+
<tr>
615+
<th></th><th></th>
616+
<th>throughput (tokens/s)</th><th>accept length</th>
617+
<th>throughput (tokens/s)</th><th>accept length</th>
618+
<th>throughput (tokens/s)</th><th>accept length</th>
619+
<th>throughput (tokens/s)</th><th>accept length</th>
620+
<th>throughput (tokens/s)</th><th>accept length</th>
621+
</tr>
644622
</thead>
623+
645624
<tbody>
646-
<!-- <tr><td colspan="12" style="text-align: center; vertical-align: middle;"><strong>Temperature=0</strong></td></tr> -->
647-
<tr><td rowspan="3"><strong>T=0</strong></td>
648-
<td>Hunyuan-1.8B-Instruct</td><td>1.97x</td><td>2.90</td><td>2.58x</td><td>3.73</td><td>2.61x</td><td>3.71</td><td>1.71x</td><td>2.43</td><td>2.22x</td><td>3.19</td></tr>
649-
<tr> <td>Hunyuan-4B-Instruct</td><td>1.77x</td><td>2.60</td><td>2.64x</td><td>3.35</td><td>2.14x</td><td>3.17</td><td>1.72x</td><td>2.57</td><td>2.07x</td><td>2.92</td></tr>
650-
<tr><td>Hunyuan-7B-Instruct</td><td>2.22x</td><td>3.58</td><td>3.59x</td><td>5.47</td><td>2.96x</td><td>4.68</td><td>1.64x</td><td>2.56</td><td>2.60x</td><td>4.07</td></tr>
651-
<!-- <tr><td colspan="12" style="text-align: center; vertical-align: middle;"><strong>Temperature=1</strong></td></tr> -->
652-
<tr><td rowspan="3"><strong>T=1</strong></td>
653-
<td>Hunyuan-1.8B-Instruct</td><td>1.58x</td><td>2.36</td><td>2.35x</td><td>3.56</td><td>2.23x</td><td>3.38</td><td>1.26x</td><td>1.87</td><td>1.86x</td><td>2.79</td></tr>
654-
<tr><td>Hunyuan-4B-Instruct</td><td>1.36x</td><td>2.05</td><td>1.97x</td><td>2.86</td><td>1.72x</td><td>2.68</td><td>1.14x</td><td>1.76</td><td>1.55x</td><td>2.34</td></tr>
655-
<tr><td>Hunyuan-7B-Instruct</td><td>1.90x</td><td>3.11</td><td>3.12x</td><td>5.09</td><td>2.74x</td><td>4.34</td><td>1.47x</td><td>2.39</td><td>2.31x</td><td>3.73</td></tr>
625+
<!-- Qwen3-1.7B -->
626+
<tr>
627+
<td rowspan="2">Qwen3-1.7B</td>
628+
<td>Vanilla</td>
629+
<td>376.42</td><td>1</td>
630+
<td>378.86</td><td>1</td>
631+
<td>378.38</td><td>1</td>
632+
<td>390.53</td><td>1</td>
633+
<td>318.05</td><td>1</td>
634+
</tr>
635+
<tr>
636+
<td>Eagle3</td>
637+
<td>616.9</td><td>2.13</td>
638+
<td>653.29</td><td>2.19</td>
639+
<td>680.1</td><td>2.2</td>
640+
<td>621.44</td><td>2.17</td>
641+
<td>642.93</td><td>2.18</td>
642+
</tr>
643+
<!-- Qwen3-4B -->
644+
<tr>
645+
<td rowspan="2">Qwen3-4B</td>
646+
<td>Vanilla</td>
647+
<td>229.05</td><td>1</td>
648+
<td>235.29</td><td>1</td>
649+
<td>234.66</td><td>1</td>
650+
<td>234.04</td><td>1</td>
651+
<td>233.26</td><td>1</td>
652+
</tr>
653+
<tr>
654+
<td>Eagle3</td>
655+
<td>389.35</td><td>2.07</td>
656+
<td>395.97</td><td>2.1</td>
657+
<td>377.84</td><td>2.08</td>
658+
<td>384.6</td><td>2.07</td>
659+
<td>386.94</td><td>2.08</td>
660+
</tr>
661+
<!-- Qwen3-8B -->
662+
<tr>
663+
<td rowspan="2">Qwen3-8B</td>
664+
<td>Vanilla</td>
665+
<td>149.63</td><td>1</td>
666+
<td>149.93</td><td>1</td>
667+
<td>153.85</td><td>1</td>
668+
<td>153.81</td><td>1</td>
669+
<td>151.81</td><td>1</td>
670+
</tr>
671+
<tr>
672+
<td>Eagle3</td>
673+
<td>257.32</td><td>2</td>
674+
<td>266.69</td><td>2.02</td>
675+
<td>244.89</td><td>1.97</td>
676+
<td>258.2</td><td>1.97</td>
677+
<td>257.52</td><td>1.99</td>
678+
</tr>
679+
<!-- Qwen3-14B -->
680+
<tr>
681+
<td rowspan="2">Qwen3-14B</td>
682+
<td>Vanilla</td>
683+
<td>92.97</td><td>1</td>
684+
<td>92.66</td><td>1</td>
685+
<td>92.94</td><td>1</td>
686+
<td>94.46</td><td>1</td>
687+
<td>93.26</td><td>1</td>
688+
</tr>
689+
<tr>
690+
<td>Eagle3</td>
691+
<td>153.72</td><td>1.87</td>
692+
<td>140.46</td><td>1.78</td>
693+
<td>144.68</td><td>1.76</td>
694+
<td>142.45</td><td>1.74</td>
695+
<td>145.33</td><td>1.79</td>
696+
</tr>
697+
<!-- Qwen3-32B -->
698+
<tr>
699+
<td rowspan="2">Qwen3-32B</td>
700+
<td>Vanilla</td>
701+
<td>43.49</td><td>1</td>
702+
<td>43.38</td><td>1</td>
703+
<td>43.19</td><td>1</td>
704+
<td>43.3</td><td>1</td>
705+
<td>43.32</td><td>1</td>
706+
</tr>
707+
<tr>
708+
<td>Eagle3</td>
709+
<td>80.43</td><td>2.01</td>
710+
<td>72.49</td><td>1.9</td>
711+
<td>71.57</td><td>1.86</td>
712+
<td>74.1</td><td>1.86</td>
713+
<td>74.1</td><td>1.91</td>
714+
</tr>
715+
<!-- Qwen3-30B-A3B -->
716+
<tr>
717+
<td rowspan="2">Qwen3-30B-A3B</td>
718+
<td>Vanilla</td>
719+
<td>311.84</td><td>1</td>
720+
<td>320.43</td><td>1</td>
721+
<td>325.77</td><td>1</td>
722+
<td>325.42</td><td>1</td>
723+
<td>320.87</td><td>1</td>
724+
</tr>
725+
<tr>
726+
<td>Eagle3</td>
727+
<td>453.97</td><td>2.1</td>
728+
<td>432.45</td><td>2.04</td>
729+
<td>428.81</td><td>2.02</td>
730+
<td>437.06</td><td>2.01</td>
731+
<td>438.07</td><td>2.04</td>
732+
</tr>
733+
656734
</tbody>
657735
</table>
658736
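The previous version of this table reported speedup ratios, while the new one reports raw throughput for `Vanilla` and `Eagle3`. The sketch below shows one way to recover a speedup figure from the new numbers, assuming speedup means Eagle3 throughput divided by Vanilla throughput under the same configuration (the README does not state this definition). The GSM8K values are copied from the table; the names `GSM8K_THROUGHPUT`, `speedup`, and `speedup_table` are hypothetical helpers introduced only for illustration.

```python
# GSM8K throughput in tokens/s, copied from the table above:
# model -> (Vanilla, Eagle3)
GSM8K_THROUGHPUT = {
    "Qwen3-1.7B": (376.42, 616.9),
    "Qwen3-4B": (229.05, 389.35),
    "Qwen3-8B": (149.63, 257.32),
    "Qwen3-14B": (92.97, 153.72),
    "Qwen3-32B": (43.49, 80.43),
    "Qwen3-30B-A3B": (311.84, 453.97),
}


def speedup(vanilla: float, eagle3: float) -> float:
    """Assumed definition: ratio of Eagle3 throughput to Vanilla throughput."""
    if vanilla <= 0:
        raise ValueError("vanilla throughput must be positive")
    return eagle3 / vanilla


def speedup_table(rows: dict[str, tuple[float, float]]) -> dict[str, float]:
    """Return the speedup per model, rounded to two decimals."""
    return {
        model: round(speedup(vanilla, eagle3), 2)
        for model, (vanilla, eagle3) in rows.items()
    }


if __name__ == "__main__":
    for model, ratio in speedup_table(GSM8K_THROUGHPUT).items():
        print(f"{model}: {ratio:.2f}x")
```

The same helper can be applied to the other dataset columns by swapping in their throughput pairs.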

README_cn.md

Lines changed: 126 additions & 52 deletions
@@ -601,65 +601,139 @@ `BF16`, `FP8-Static`, and `FP8-Dynamic` variants of the Qwen3-Omni series models on `aime25`, `gpq
601601
602602
#### 2.1 Qwen3 Series Models
603603
604-
Speedup results for the Qwen3 series Eagle3 models on MT-bench/HunmanEval/GSM8K/Alpaca are as follows:
604+
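We used vLLM (v0.11.2) to evaluate the accept length and throughput of the Qwen3 series Eagle3 models on datasets including **MT-bench**, **HumanEval**, **GSM8K**, and **Alpaca**. All results were measured on a single H20 GPU with the following settings: **tp=1, ep=1, num_speculative_tokens=2, batch_size=1, output_len=1024**.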
605605
606606
<table>
607607
<thead>
608608
<tr>
609-
<th>&nbsp</th><th>&nbsp</th>
610-
<th colspan="2" style="text-align: center; vertical-align: middle;">MT-bench</th>
611-
<th colspan="2" style="text-align: center; vertical-align: middle;">HumanEval</th>
612-
<th colspan="2" style="text-align: center; vertical-align: middle;">GSM8K</th>
613-
<th colspan="2" style="text-align: center; vertical-align: middle;">Alpaca</th>
614-
<th colspan="2" style="text-align: center; vertical-align: middle;">Mean</th></tr>
615-
<tr><th>Temperature</th><th>Model</th><th>Speedup</th><th>τ</th><th>Speedup</th><th>τ</th><th>Speedup</th><th>τ</th><th>Speedup</th><th>τ</th><th>Speedup</th><th>τ</th></tr>
616-
</thead>
617-
<tbody>
618-
<!-- <tr><td colspan="12" style="text-align: center; vertical-align: middle;"><strong>Temperature=0</strong></td></tr> -->
619-
<tr><td rowspan="6"><strong>T=0</strong></td>
620-
<td>Qwen3-1.7B</td><td>2.05x</td><td>2.81</td><td>2.07x</td><td>2.93</td><td>2.11x</td><td>2.98</td><td>1.93x</td><td>2.69</td><td>2.04x</td><td>2.85</td></tr>
621-
<tr> <td>Qwen3-4B</td><td>2.21x</td><td>3.01</td><td>2.36x</td><td>3.24</td><td>2.42x</td><td>3.13</td><td>2.32x</td><td>2.75</td><td>2.33x</td><td>3.03</td></tr>
622-
<tr><td>Qwen3-8B</td><td>2.63x</td><td>3.65</td><td>2.76x</td><td>3.85</td><td>2.82x</td><td>3.90</td><td>2.62x</td><td>3.48</td><td>2.70x</td><td>3.72</td></tr>
623-
<tr><td>Qwen3-14B</td><td>2.23x</td><td>3.30</td><td>2.53x</td><td>3.74</td><td>2.56x</td><td>3.79</td><td>2.16x</td><td>3.13</td><td>2.37x</td><td>3.49</td></tr>
624-
<tr><td>Qwen3-32B</td><td>2.39x</td><td>2.78</td><td>2.37x</td><td>2.81</td><td>2.47x</td><td>2.92</td><td>2.42x</td><td>2.53</td><td>2.41x</td><td>2.76</td></tr>
625-
<tr><td>Qwen3-30B-A3B</td><td>2.84x</td><td>3.63</td><td>2.27x</td><td>3.09</td><td>2.64x</td><td>3.42</td><td>2.83x</td><td>3.56</td><td>2.64x</td><td>3.42</td></tr>
626-
<!-- <tr><td colspan="12" style="text-align: center; vertical-align: middle;"><strong>Temperature=1</strong></td></tr> -->
627-
<tr><td rowspan="6"><strong>T=1</strong></td>
628-
<td>Qwen3-1.7B</td><td>1.74x</td><td>2.53</td><td>1.86x</td><td>2.70</td><td>1.82x</td><td>2.69</td><td>1.72x</td><td>2.46</td><td>1.93x</td><td>2.60</td></tr>
629-
<tr><td>Qwen3-4B</td><td>1.93x</td><td>2.60</td><td>2.00x</td><td>2.84</td><td>2.11x</td><td>2.82</td><td>2.34x</td><td>2.50</td><td>1.75x</td><td>2.69</td></tr>
630-
<tr><td>Qwen3-8B</td><td>1.98x</td><td>2.75</td><td>2.25x</td><td>3.11</td><td>2.31x</td><td>3.15</td><td>2.10x</td><td>2.76</td><td>2.90x</td><td>2.94</td></tr>
631-
<tr><td>Qwen3-14B</td><td>1.71x</td><td>2.61</td><td>1.95x</td><td>2.87</td><td>2.04x</td><td>3.08</td><td>1.68x</td><td>2.55</td><td>2.90x</td><td>2.78</td></tr>
632-
<tr><td>Qwen3-32B</td><td>1.62x</td><td>1.91</td><td>1.71x</td><td>2.05</td><td>1.78x</td><td>2.10</td><td>1.80x</td><td>1.95</td><td>1.62x</td><td>2.00</td></tr>
633-
<tr><td>Qwen3-30B-A3B</td><td>1.91x</td><td>2.46</td><td>2.00x</td><td>2.64</td><td>1.90x</td><td>2.53</td><td>1.80x</td><td>2.32</td><td>1.90x</td><td>2.48</td></tr>
634-
</tbody>
635-
</table>
636-
637-
#### 2.2 Hunyuan Series Models
638-
639-
Speedup results for the Hunyuan series Eagle3 models on MT-bench/HunmanEval/GSM8K/Alpaca are as follows:
640-
641-
<table>
642-
<thead>
609+
<th>Model</th>
610+
<th>Method</th>
611+
<th colspan="2" style="text-align:center;">GSM8K</th>
612+
<th colspan="2" style="text-align:center;">Alpaca</th>
613+
<th colspan="2" style="text-align:center;">HumanEval</th>
614+
<th colspan="2" style="text-align:center;">MT-bench</th>
615+
<th colspan="2" style="text-align:center;">Mean</th>
616+
</tr>
643617
<tr>
644-
<th>&nbsp</th><th>&nbsp</th>
645-
<th colspan="2" style="text-align: center; vertical-align: middle;">MT-bench</th>
646-
<th colspan="2" style="text-align: center; vertical-align: middle;">HumanEval</th>
647-
<th colspan="2" style="text-align: center; vertical-align: middle;">GSM8K</th>
648-
<th colspan="2" style="text-align: center; vertical-align: middle;">Alpaca</th>
649-
<th colspan="2" style="text-align: center; vertical-align: middle;">Mean</th></tr>
650-
<tr><th>Temperature</th><th>Model</th><th>Speedup</th><th>τ</th><th>Speedup</th><th>τ</th><th>Speedup</th><th>τ</th><th>Speedup</th><th>τ</th><th>Speedup</th><th>τ</th></tr>
618+
<th></th><th></th>
619+
<th>throughput (tokens/s)</th><th>accept length</th>
620+
<th>throughput (tokens/s)</th><th>accept length</th>
621+
<th>throughput (tokens/s)</th><th>accept length</th>
622+
<th>throughput (tokens/s)</th><th>accept length</th>
623+
<th>throughput (tokens/s)</th><th>accept length</th>
624+
</tr>
651625
</thead>
626+
652627
<tbody>
653-
<!-- <tr><td colspan="12" style="text-align: center; vertical-align: middle;"><strong>Temperature=0</strong></td></tr> -->
654-
<tr><td rowspan="3"><strong>T=0</strong></td>
655-
<td>Hunyuan-1.8B-Instruct</td><td>1.97x</td><td>2.90</td><td>2.58x</td><td>3.73</td><td>2.61x</td><td>3.71</td><td>1.71x</td><td>2.43</td><td>2.22x</td><td>3.19</td></tr>
656-
<tr> <td>Hunyuan-4B-Instruct</td><td>1.77x</td><td>2.60</td><td>2.64x</td><td>3.35</td><td>2.14x</td><td>3.17</td><td>1.72x</td><td>2.57</td><td>2.07x</td><td>2.92</td></tr>
657-
<tr><td>Hunyuan-7B-Instruct</td><td>2.22x</td><td>3.58</td><td>3.59x</td><td>5.47</td><td>2.96x</td><td>4.68</td><td>1.64x</td><td>2.56</td><td>2.60x</td><td>4.07</td></tr>
658-
<!-- <tr><td colspan="12" style="text-align: center; vertical-align: middle;"><strong>Temperature=1</strong></td></tr> -->
659-
<tr><td rowspan="3"><strong>T=1</strong></td>
660-
<td>Hunyuan-1.8B-Instruct</td><td>1.58x</td><td>2.36</td><td>2.35x</td><td>3.56</td><td>2.23x</td><td>3.38</td><td>1.26x</td><td>1.87</td><td>1.86x</td><td>2.79</td></tr>
661-
<tr><td>Hunyuan-4B-Instruct</td><td>1.36x</td><td>2.05</td><td>1.97x</td><td>2.86</td><td>1.72x</td><td>2.68</td><td>1.14x</td><td>1.76</td><td>1.55x</td><td>2.34</td></tr>
662-
<tr><td>Hunyuan-7B-Instruct</td><td>1.90x</td><td>3.11</td><td>3.12x</td><td>5.09</td><td>2.74x</td><td>4.34</td><td>1.47x</td><td>2.39</td><td>2.31x</td><td>3.73</td></tr>
628+
<!-- Qwen3-1.7B -->
629+
<tr>
630+
<td rowspan="2">Qwen3-1.7B</td>
631+
<td>Vanilla</td>
632+
<td>376.42</td><td>1</td>
633+
<td>378.86</td><td>1</td>
634+
<td>378.38</td><td>1</td>
635+
<td>390.53</td><td>1</td>
636+
<td>318.05</td><td>1</td>
637+
</tr>
638+
<tr>
639+
<td>Eagle3</td>
640+
<td>616.9</td><td>2.13</td>
641+
<td>653.29</td><td>2.19</td>
642+
<td>680.1</td><td>2.2</td>
643+
<td>621.44</td><td>2.17</td>
644+
<td>642.93</td><td>2.18</td>
645+
</tr>
646+
<!-- Qwen3-4B -->
647+
<tr>
648+
<td rowspan="2">Qwen3-4B</td>
649+
<td>Vanilla</td>
650+
<td>229.05</td><td>1</td>
651+
<td>235.29</td><td>1</td>
652+
<td>234.66</td><td>1</td>
653+
<td>234.04</td><td>1</td>
654+
<td>233.26</td><td>1</td>
655+
</tr>
656+
<tr>
657+
<td>Eagle3</td>
658+
<td>389.35</td><td>2.07</td>
659+
<td>395.97</td><td>2.1</td>
660+
<td>377.84</td><td>2.08</td>
661+
<td>384.6</td><td>2.07</td>
662+
<td>386.94</td><td>2.08</td>
663+
</tr>
664+
<!-- Qwen3-8B -->
665+
<tr>
666+
<td rowspan="2">Qwen3-8B</td>
667+
<td>Vanilla</td>
668+
<td>149.63</td><td>1</td>
669+
<td>149.93</td><td>1</td>
670+
<td>153.85</td><td>1</td>
671+
<td>153.81</td><td>1</td>
672+
<td>151.81</td><td>1</td>
673+
</tr>
674+
<tr>
675+
<td>Eagle3</td>
676+
<td>257.32</td><td>2</td>
677+
<td>266.69</td><td>2.02</td>
678+
<td>244.89</td><td>1.97</td>
679+
<td>258.2</td><td>1.97</td>
680+
<td>257.52</td><td>1.99</td>
681+
</tr>
682+
<!-- Qwen3-14B -->
683+
<tr>
684+
<td rowspan="2">Qwen3-14B</td>
685+
<td>Vanilla</td>
686+
<td>92.97</td><td>1</td>
687+
<td>92.66</td><td>1</td>
688+
<td>92.94</td><td>1</td>
689+
<td>94.46</td><td>1</td>
690+
<td>93.26</td><td>1</td>
691+
</tr>
692+
<tr>
693+
<td>Eagle3</td>
694+
<td>153.72</td><td>1.87</td>
695+
<td>140.46</td><td>1.78</td>
696+
<td>144.68</td><td>1.76</td>
697+
<td>142.45</td><td>1.74</td>
698+
<td>145.33</td><td>1.79</td>
699+
</tr>
700+
<!-- Qwen3-32B -->
701+
<tr>
702+
<td rowspan="2">Qwen3-32B</td>
703+
<td>Vanilla</td>
704+
<td>43.49</td><td>1</td>
705+
<td>43.38</td><td>1</td>
706+
<td>43.19</td><td>1</td>
707+
<td>43.3</td><td>1</td>
708+
<td>43.32</td><td>1</td>
709+
</tr>
710+
<tr>
711+
<td>Eagle3</td>
712+
<td>80.43</td><td>2.01</td>
713+
<td>72.49</td><td>1.9</td>
714+
<td>71.57</td><td>1.86</td>
715+
<td>74.1</td><td>1.86</td>
716+
<td>74.1</td><td>1.91</td>
717+
</tr>
718+
<!-- Qwen3-30B-A3B -->
719+
<tr>
720+
<td rowspan="2">Qwen3-30B-A3B</td>
721+
<td>Vanilla</td>
722+
<td>311.84</td><td>1</td>
723+
<td>320.43</td><td>1</td>
724+
<td>325.77</td><td>1</td>
725+
<td>325.42</td><td>1</td>
726+
<td>320.87</td><td>1</td>
727+
</tr>
728+
<tr>
729+
<td>Eagle3</td>
730+
<td>453.97</td><td>2.1</td>
731+
<td>432.45</td><td>2.04</td>
732+
<td>428.81</td><td>2.02</td>
733+
<td>437.06</td><td>2.01</td>
734+
<td>438.07</td><td>2.04</td>
735+
</tr>
736+
663737
</tbody>
664738
</table>
665739
