Speculative Sampling Benchmark

Eagle3

1. Qwen3 Series Models

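| Model | Method | GSM8K throughput (tokens/s) | GSM8K accept length | Alpaca throughput (tokens/s) | Alpaca accept length | HumanEval throughput (tokens/s) | HumanEval accept length | MT-bench throughput (tokens/s) | MT-bench accept length | Mean throughput (tokens/s) | Mean accept length |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Qwen3-1.7B | Vanilla | 376.42 | 1 | 378.86 | 1 | 378.38 | 1 | 390.53 | 1 | 381.05 | 1 |
| Qwen3-1.7B | Eagle3 | 616.9 | 2.13 | 653.29 | 2.19 | 680.1 | 2.2 | 621.44 | 2.17 | 642.93 | 2.17 |
| Qwen3-4B | Vanilla | 229.05 | 1 | 235.29 | 1 | 234.66 | 1 | 234.04 | 1 | 233.26 | 1 |
| Qwen3-4B | Eagle3 | 389.35 | 2.07 | 395.97 | 2.1 | 377.84 | 2.08 | 384.6 | 2.07 | 386.94 | 2.08 |
| Qwen3-8B | Vanilla | 149.63 | 1 | 149.93 | 1 | 153.85 | 1 | 153.81 | 1 | 151.81 | 1 |
| Qwen3-8B | Eagle3 | 257.32 | 2 | 266.69 | 2.02 | 244.89 | 1.97 | 258.2 | 1.97 | 257.52 | 1.99 |
| Qwen3-14B | Vanilla | 92.97 | 1 | 92.66 | 1 | 92.94 | 1 | 94.46 | 1 | 93.26 | 1 |
| Qwen3-14B | Eagle3 | 153.72 | 1.87 | 140.46 | 1.78 | 144.68 | 1.76 | 142.45 | 1.74 | 145.33 | 1.79 |
| Qwen3-32B | Vanilla | 43.39 | 1 | 43.38 | 1 | 43.19 | 1 | 43.3 | 1 | 43.32 | 1 |
| Qwen3-32B | Eagle3 | 80.43 | 2.01 | 72.49 | 1.9 | 71.57 | 1.86 | 74.1 | 1.86 | 74.1 | 1.91 |
| Qwen3-30B-A3B | Vanilla | 311.84 | 1 | 320.43 | 1 | 325.77 | 1 | 325.42 | 1 | 320.87 | 1 |
| Qwen3-30B-A3B | Eagle3 | 453.97 | 2.1 | 432.45 | 2.04 | 428.81 | 2.02 | 437.06 | 2.01 | 438.07 | 2.04 |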
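
The table reports raw throughput rather than speedup. As a rough reading aid, the sketch below derives the Eagle3-over-Vanilla speedup from the Mean throughput column. The dictionary and function names (`MEAN_THROUGHPUT`, `speedup`, `SPEEDUPS`) are illustrative and not part of the benchmark tooling; the numbers are copied from the table as published.

```python
# Mean throughput (tokens/s) copied from the Qwen3 table: (Vanilla, Eagle3).
# The names in this snippet are hypothetical, chosen for illustration only.
MEAN_THROUGHPUT = {
    "Qwen3-1.7B": (381.05, 642.93),
    "Qwen3-4B": (233.26, 386.94),
    "Qwen3-8B": (151.81, 257.52),
    "Qwen3-14B": (93.26, 145.33),
    "Qwen3-32B": (43.32, 74.1),
    "Qwen3-30B-A3B": (320.87, 438.07),
}


def speedup(vanilla: float, eagle3: float) -> float:
    """Return the ratio of Eagle3 throughput to Vanilla throughput."""
    return eagle3 / vanilla


# Speedup per model, rounded to two decimals for display.
SPEEDUPS = {
    model: round(speedup(vanilla, eagle3), 2)
    for model, (vanilla, eagle3) in MEAN_THROUGHPUT.items()
}

if __name__ == "__main__":
    for model, ratio in SPEEDUPS.items():
        print(f"{model}: {ratio:.2f}x")
```

On these figures the dense models land at roughly 1.6x to 1.7x, while the MoE model Qwen3-30B-A3B gains less even though its accept length is comparable.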

2. VLM Models

2.1 Qwen3-VL Series Models

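| Model | Method | GSM8K throughput (tokens/s) | GSM8K accept length | Alpaca throughput (tokens/s) | Alpaca accept length | HumanEval throughput (tokens/s) | HumanEval accept length | MT-bench throughput (tokens/s) | MT-bench accept length | MATH-500 throughput (tokens/s) | MATH-500 accept length | MMMU throughput (tokens/s) | MMMU accept length | MMStar throughput (tokens/s) | MMStar accept length | Mean throughput (tokens/s) | Mean accept length |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Qwen3-VL-2B-Instruct | Vanilla | 348.55 | 1 | 350.9 | 1 | 346.07 | 1 | 346.31 | 1 | 82.96 | 1 | 83.27 | 1 | 81.63 | 1 | 234.24 | 1 |
| Qwen3-VL-2B-Instruct | Eagle3 | 511.52 | 2.11 | 560.55 | 2.26 | 826.01 | 3.39 | 555.22 | 2.29 | 163.09 | 2.57 | 154.18 | 2.55 | 139.73 | 2.31 | 415.76 | 2.5 |
| Qwen3-VL-4B-Instruct | Vanilla | 212.87 | 1 | 213.24 | 1 | 211.69 | 1 | 212.1 | 1 | 67.96 | 1 | 65.88 | 1 | 67.75 | 1 | 150.21 | 1 |
| Qwen3-VL-4B-Instruct | Eagle3 | 415.29 | 2.57 | 372.89 | 2.26 | 459.37 | 2.82 | 382.33 | 2.34 | 141.87 | 2.72 | 104.44 | 2.05 | 107.07 | 2.1 | 283.32 | 2.41 |
| Qwen3-VL-30B-A3B-Instruct | Vanilla | 179.94 | 1 | 184.6 | 1 | 168.68 | 1 | 180.57 | 1 | 31.08 | 1 | 31.51 | 1 | 30.93 | 1 | 115.33 | 1 |
| Qwen3-VL-30B-A3B-Instruct | Eagle3 | 281.93 | 2.82 | 241.42 | 2.13 | 223.05 | 2.57 | 240.47 | 2.19 | 75.31 | 2.79 | 48.47 | 1.78 | 52.57 | 1.94 | 166.17 | 2.32 |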
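
The page does not say how the Mean column is computed. For the Qwen3-VL-2B-Instruct rows, the published values are consistent with an unweighted arithmetic mean over the seven benchmarks, which the sketch below recomputes. That the column is an unweighted mean is an assumption inferred from the numbers, and `column_mean` is a hypothetical helper name.

```python
from statistics import mean

# Per-benchmark throughput (tokens/s) for Qwen3-VL-2B-Instruct, copied from the
# table in the order GSM8K, Alpaca, HumanEval, MT-bench, MATH-500, MMMU, MMStar.
VANILLA = [348.55, 350.9, 346.07, 346.31, 82.96, 83.27, 81.63]
EAGLE3 = [511.52, 560.55, 826.01, 555.22, 163.09, 154.18, 139.73]


def column_mean(values: list[float]) -> float:
    """Unweighted arithmetic mean, rounded to two decimals like the table.

    Assumption: the Mean column weights every benchmark equally.
    """
    return round(mean(values), 2)


if __name__ == "__main__":
    print("Vanilla mean:", column_mean(VANILLA))
    print("Eagle3 mean:", column_mean(EAGLE3))
```

Because the multimodal benchmarks (MMMU, MMStar) and MATH-500 run at much lower absolute throughput than the other text benchmarks, an unweighted mean of this kind is dominated by the faster benchmarks; comparing per-benchmark ratios gives a fairer picture.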

2.2 HunyuanOCR Model

| Model | Method | OmniDocBench throughput (tokens/s) | OmniDocBench accept length |
| --- | --- | --- | --- |
| Hunyuan-OCR | Vanilla | 70.12 | 1 |
| Hunyuan-OCR | Eagle3 | 108.1 | 2.08 |

3. Audio Models

3.1 Qwen2-Audio Model

| Model | Method | LibriSpeech throughput (tokens/s) | LibriSpeech accept length |
| --- | --- | --- | --- |
| Qwen2-Audio | Vanilla | 78.76 | 1 |
| Qwen2-Audio | Eagle3 | 146.66 | 3.51 |

3.2 Fun-CosyVoice3 Model

| Model | Method | LibriTTS throughput (tokens/s) | LibriTTS accept length |
| --- | --- | --- | --- |
| Fun-CosyVoice3 | Vanilla | - | 1 |
| Fun-CosyVoice3 | Eagle3 | - | 1.96 |