README.md: 5 additions & 5 deletions
@@ -355,7 +355,7 @@ We evaluated the Eagle3 model trained by AngelSlim on tasks including code gener
 
 #### 1.1 Qwen3 Series Models
 
-Benchmark results for Qwen3 series models using Eagle3 speculative decoding on vLLM (v0.11.2) across **MT-bench**, **HumanEval**, **GSM8K** and **Alpaca**, using a single NVIDIA H20 GPU (**tp=1, ep=1, num_speculative_tokens=2, batch_size=1, output_len=1024**).
+Benchmark results for Qwen3 series models using Eagle3 speculative decoding on vLLM (v0.11.2) across **MT-bench**, **HumanEval**, **GSM8K** and **Alpaca**, using a single GPU (**tp=1, ep=1, num_speculative_tokens=2, batch_size=1, output_len=1024**).
 
 <table>
 <thead>
@@ -495,7 +495,7 @@ Benchmark results for Qwen3 series models using Eagle3 speculative decoding on v
 
 ##### 1.2.1 Qwen3-VL Series Models
 
-Benchmark results for Qwen3-VL series models using Eagle3 speculative decoding on vLLM (v0.12.0) across language and multimodal tasks, using a single NVIDIA H20 GPU (**tp=1, ep=1, num_speculative_tokens=4, batch_size=1, output_len=1024**).
+Benchmark results for Qwen3-VL series models using Eagle3 speculative decoding on vLLM (v0.12.0) across language and multimodal tasks, using a single GPU (**tp=1, ep=1, num_speculative_tokens=4, batch_size=1, output_len=1024**).
 
 <table><thead>
 <tr>
@@ -652,7 +652,7 @@ Benchmark results for Qwen3-VL series models using Eagle3 speculative decoding o
 
 ##### 1.2.2 HunyuanOCR Model
 
-Benchmark results for HunyuanOCR using Eagle3 speculative decoding on vLLM (v0.13.0) across **[OmniDocBench](https://huggingface.co/datasets/opendatalab/OmniDocBench)** dataset, using a single NVIDIA H20 GPU (**tp=1, ep=1, num_speculative_tokens=4, batch_size=1, output_len=1024**).
+Benchmark results for HunyuanOCR using Eagle3 speculative decoding on vLLM (v0.13.0) across **[OmniDocBench](https://huggingface.co/datasets/opendatalab/OmniDocBench)** dataset, using a single GPU (**tp=1, ep=1, num_speculative_tokens=4, batch_size=1, output_len=1024**).
 
 <table><thead>
 <tr>
@@ -685,7 +685,7 @@ Benchmark results for HunyuanOCR using Eagle3 speculative decoding on vLLM (v0.1
 
 ##### 1.3.1 Qwen2-Audio Model
 
-Benchmark results for Qwen2-Audio using Eagle3 speculative decoding on vLLM (v0.12.0) across **[LibriSpeech](https://www.openslr.org/12)** dataset, using a single NVIDIA H20 GPU (**tp=1, ep=1, num_speculative_tokens=4, batch_size=1, output_len=1024**).
+Benchmark results for Qwen2-Audio using Eagle3 speculative decoding on vLLM (v0.12.0) across **[LibriSpeech](https://www.openslr.org/12)** dataset, using a single GPU (**tp=1, ep=1, num_speculative_tokens=4, batch_size=1, output_len=1024**).
 
 <table><thead>
 <tr>
@@ -716,7 +716,7 @@ Benchmark results for Qwen2-Audio using Eagle3 speculative decoding on vLLM (v0.
 
 ##### 1.3.2 Fun-CosyVoice3 Model
 
-Benchmark results for Fun-CosyVoice3 using Eagle3 speculative decoding across **[LibriTTS](https://www.openslr.org/60/)** dataset, using a single NVIDIA H20 GPU (**tp=1, ep=1, num_speculative_tokens=4, batch_size=1, output_len=1024**).
+Benchmark results for Fun-CosyVoice3 using Eagle3 speculative decoding across **[LibriTTS](https://www.openslr.org/60/)** dataset, using a single GPU (**tp=1, ep=1, num_speculative_tokens=4, batch_size=1, output_len=1024**).