
Commit a62fc6c

irisliu10 and root authored
fix typo & add eagle3 model link (#204)
Co-authored-by: root <root@TENCENT64.site>
1 parent 502da46 commit a62fc6c

2 files changed

Lines changed: 28 additions & 32 deletions


README.md

Lines changed: 13 additions & 15 deletions
@@ -17,7 +17,7 @@ A more accessible, comprehensive, and efficient toolkit for large model compress
 </p>

 ## 📣Latest News
-- [26/01/13] We have released v0.3. We support the training and deployment of [Eagle3 for LLM/VLM/Audio](https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/eagle/index.html) Multimodal models. And We released **Sherry**, the hardware-efficient 1.25 bit quantization algorithm [Paper Comming soon] | [[Code]](https://github.com/Tencent/AngelSlim/tree/sherry/Sherry)🔥🔥🔥
+- [26/01/13] We have released v0.3. We support the training and deployment of Eagle3 for all-scale LLMs/VLMs/Audio models, as detailed in the [guidance documentation](https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/eagle/index.html). And We released **Sherry**, the hardware-efficient 1.25 bit quantization algorithm [Paper Comming soon] | [[Code]](https://github.com/Tencent/AngelSlim/tree/sherry/Sherry)🔥🔥🔥
 - [25/11/05] We have released v0.2. Quantization support for new models, such as `GLM-4.6`, `Qwen3-VL` and `Qwen3-Omni`, open-sources the Eagle3 speculative decoding training framework, and updates the Diffusion model quantization tools.
 - [25/09/30] We have released **SpecExit**, the reasoning early-exit algorithm: [[Paper]](http://arxiv.org/abs/2509.24248) | [[Docs]](https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/spec_exit.html) | [[vLLM Code]](https://github.com/vllm-project/vllm/pull/27192)
 - [25/09/26] We have released **TEQUILA**, the ternary quantization algorithm [[Paper]](https://arxiv.org/abs/2509.23809) | [[Code]](https://github.com/Tencent/AngelSlim/tree/tequila/TernaryQuant)
@@ -232,8 +232,6 @@ bash scripts/speculative/generate_data_for_target_model.sh
 bash scripts/speculative/train_eagle3_online.sh
 ```

-For detailed training configurations and vLLM performance benchmarks of Eagle3, please refer to the [Quick Start Guide for Speculative Sampling](https://angelslim.readthedocs.io/zh-cn/latest/getting_started/quickstrat.html#id5).
-
 Training and Deployment Guide for Multimodal Model Eagle3—Supporting LLM, VLM, and Audio (ASR & TTS) Models: [LLM](https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/eagle/eagle.html) | [VLM](https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/eagle/vlm_eagle.html) | [Audio(ASR)](https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/eagle/audio_asr_eagle.html) | [Audio(TTS)](https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/eagle/audio_tts_eagle.html).

 #### 2.2 LLM/VLM Model Quantization
@@ -392,7 +390,7 @@ Benchmark results for Qwen3 series models using Eagle3 speculative decoding on v
 <td>381.05</td><td>1</td>
 </tr>
 <tr>
-<td>Eagle3</td>
+<td><a href="https://huggingface.co/AngelSlim/Qwen3-1.7B_eagle3">Eagle3</a></td>
 <td>616.9</td><td>2.13</td>
 <td>653.29</td><td>2.19</td>
 <td>680.1</td><td>2.2</td>
@@ -410,7 +408,7 @@ Benchmark results for Qwen3 series models using Eagle3 speculative decoding on v
 <td>233.26</td><td>1</td>
 </tr>
 <tr>
-<td>Eagle3</td>
+<td><a href="https://huggingface.co/AngelSlim/Qwen3-4B_eagle3">Eagle3</a></td>
 <td>389.35</td><td>2.07</td>
 <td>395.97</td><td>2.1</td>
 <td>377.84</td><td>2.08</td>
@@ -428,7 +426,7 @@ Benchmark results for Qwen3 series models using Eagle3 speculative decoding on v
 <td>151.81</td><td>1</td>
 </tr>
 <tr>
-<td>Eagle3</td>
+<td><a href="https://huggingface.co/AngelSlim/Qwen3-8B_eagle3">Eagle3</a></td>
 <td>257.32</td><td>2</td>
 <td>266.69</td><td>2.02</td>
 <td>244.89</td><td>1.97</td>
@@ -446,7 +444,7 @@ Benchmark results for Qwen3 series models using Eagle3 speculative decoding on v
 <td>93.26</td><td>1</td>
 </tr>
 <tr>
-<td>Eagle3</td>
+<td><a href="https://huggingface.co/AngelSlim/Qwen3-14B_eagle3">Eagle3</a></td>
 <td>153.72</td><td>1.87</td>
 <td>140.46</td><td>1.78</td>
 <td>144.68</td><td>1.76</td>
@@ -464,7 +462,7 @@ Benchmark results for Qwen3 series models using Eagle3 speculative decoding on v
 <td>43.32</td><td>1</td>
 </tr>
 <tr>
-<td>Eagle3</td>
+<td><a href="https://huggingface.co/AngelSlim/Qwen3-32B_eagle3">Eagle3</a></td>
 <td>80.43</td><td>2.01</td>
 <td>72.49</td><td>1.9</td>
 <td>71.57</td><td>1.86</td>
@@ -482,7 +480,7 @@ Benchmark results for Qwen3 series models using Eagle3 speculative decoding on v
 <td>320.87</td><td>1</td>
 </tr>
 <tr>
-<td>Eagle3</td>
+<td><a href="https://huggingface.co/AngelSlim/Qwen3-a3B_eagle3">Eagle3</a></td>
 <td>453.97</td><td>2.1</td>
 <td>432.45</td><td>2.04</td>
 <td>428.81</td><td>2.02</td>
@@ -554,7 +552,7 @@ Benchmark results for Qwen3-VL series models using Eagle3 speculative decoding o
 <td>1</td>
 </tr>
 <tr>
-<td>Eagle3</td>
+<td><a href="https://huggingface.co/AngelSlim/Qwen3-VL-2B-Instruct_eagle3">Eagle3</a></td>
 <td>511.52</td>
 <td>2.11</td>
 <td>560.55</td>
@@ -593,7 +591,7 @@ Benchmark results for Qwen3-VL series models using Eagle3 speculative decoding o
 <td>1</td>
 </tr>
 <tr>
-<td>Eagle3</td>
+<td><a href="https://huggingface.co/AngelSlim/Qwen3-VL-4B-Instruct_eagle3">Eagle3</a></td>
 <td>415.29</td>
 <td>2.57</td>
 <td>372.89</td>
@@ -632,7 +630,7 @@ Benchmark results for Qwen3-VL series models using Eagle3 speculative decoding o
 <td>1</td>
 </tr>
 <tr>
-<td>Eagle3</td>
+<td><a href="https://huggingface.co/AngelSlim/Qwen3-VL-30B-A3B-Instruct_eagle3">Eagle3</a></td>
 <td>281.93</td>
 <td>2.82</td>
 <td>241.42</td>
@@ -676,7 +674,7 @@ Benchmark results for HunyuanOCR using Eagle3 speculative decoding on vLLM (v0.1
 <td>1</td>
 </tr>
 <tr>
-<td>Eagle3</td>
+<td><a href="https://huggingface.co/AngelSlim/HunyuanOCR_eagle3">Eagle3</a></td>
 <td>108.1</td>
 <td>2.08</td>
 </tr>
@@ -709,7 +707,7 @@ Benchmark results for Qwen2-Audio using Eagle3 speculative decoding on vLLM (v0.
 <td>1</td>
 </tr>
 <tr>
-<td>Eagle3</td>
+<td><a href="https://huggingface.co/AngelSlim/Qwen2-Audio-7B-Instruct_eagle3">Eagle3</a></td>
 <td>146.66</td>
 <td>3.51</td>
 </tr>
@@ -740,7 +738,7 @@ Benchmark results for Fun-CosyVoice3 using Eagle3 speculative decoding across **
 <td>1</td>
 </tr>
 <tr>
-<td>Eagle3</td>
+<td><a href="https://huggingface.co/AngelSlim/Fun-CosyVoice3-0.5B-2512_eagle3">Eagle3</a></td>
 <td>-</td>
 <td>1.96</td>
 </tr>
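
The rows changed above now link each Eagle3 entry to a draft checkpoint on Hugging Face. As a rough sketch of how such a link might be consumed, the snippet below extracts the repository id from a link and builds a speculative-decoding configuration dictionary. The helper names (`repo_id_from_link`, `eagle3_speculative_config`) and the default of 2 speculative tokens are hypothetical, and the dictionary keys are an assumption modelled on vLLM's `speculative_config` argument; check them against the vLLM version you have installed.

```python
from urllib.parse import urlparse

# A few of the draft-model links added in this commit.
EAGLE3_LINKS = [
    "https://huggingface.co/AngelSlim/Qwen3-1.7B_eagle3",
    "https://huggingface.co/AngelSlim/Qwen3-8B_eagle3",
    "https://huggingface.co/AngelSlim/Qwen3-VL-4B-Instruct_eagle3",
]


def repo_id_from_link(link: str) -> str:
    """Return the 'owner/name' repository id from a huggingface.co model URL."""
    parsed = urlparse(link)
    if parsed.netloc != "huggingface.co":
        raise ValueError(f"not a huggingface.co link: {link}")
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) != 2:
        raise ValueError(f"expected an owner/name path, got: {parsed.path}")
    return "/".join(parts)


def eagle3_speculative_config(link: str, num_speculative_tokens: int = 2) -> dict:
    """Hypothetical helper: build a config dict for an Eagle3 draft model.

    The key names are an assumption based on vLLM's speculative_config;
    the default token count is illustrative, not a tuned value.
    """
    return {
        "method": "eagle3",
        "model": repo_id_from_link(link),
        "num_speculative_tokens": num_speculative_tokens,
    }


if __name__ == "__main__":
    for link in EAGLE3_LINKS:
        print(eagle3_speculative_config(link))
```

The dictionary is built without importing vLLM, so the sketch runs anywhere; passing it to an inference engine is left to the reader and depends on the engine's actual API.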

README_cn.md

Lines changed: 15 additions & 17 deletions
@@ -17,10 +17,10 @@
 </p>

 ## 📣Latest News
-- [26/01/13] We have released V0.2, which supports speculative-sampling training and deployment across all modalities; docs: [Eagle3 for LLM/VLM/Audio](https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/eagle/index.html). We have also released **Sherry**, a new hardware-efficient 1.25-bit ternary quantization algorithm [Paper coming soon] | [[Code]](https://github.com/Tencent/AngelSlim/tree/sherry/Sherry)🔥🔥🔥
+- [26/01/13] We have released V0.3, which supports speculative-sampling training and deployment across all modalities; docs: [Eagle3 for LLM/VLM/Audio](https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/eagle/index.html). We have also released **Sherry**, a new hardware-efficient 1.25-bit ternary quantization algorithm [Paper coming soon] | [[Code]](https://github.com/Tencent/AngelSlim/tree/sherry/Sherry)🔥🔥🔥
 - [25/11/05] We have released V0.2, which adds quantization support for more models including GLM-4.6/Qwen3-VL/Qwen3-Omni, open-sources the Eagle3 speculative-sampling training framework, and updates the Diffusion model quantization tools.
-- [25/09/30] We have open-sourced **SpecExit**, a new reasoning early-exit algorithm [[Paper]](http://arxiv.org/abs/2509.24248) | [[Docs]](https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/spec_exit.html) | [[vLLM Code]](https://github.com/vllm-project/vllm/pull/27192)🔥🔥🔥
-- [25/09/30] We have released **Tequila**, a new ternary quantization algorithm [[Paper]](https://arxiv.org/abs/2509.23809) | [[Code]](https://github.com/Tencent/AngelSlim/tree/tequila/TernaryQuant).🔥🔥🔥
+- [25/09/30] We have open-sourced **SpecExit**, a new reasoning early-exit algorithm [[Paper]](http://arxiv.org/abs/2509.24248) | [[Docs]](https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/spec_exit.html) | [[vLLM Code]](https://github.com/vllm-project/vllm/pull/27192)
+- [25/09/30] We have released **Tequila**, a new ternary quantization algorithm [[Paper]](https://arxiv.org/abs/2509.23809) | [[Code]](https://github.com/Tencent/AngelSlim/tree/tequila/TernaryQuant)
 - [25/09/24] We now support NVFP4 PTQ quantization for the Qwen3 series models, and have also open-sourced the [Qwen3-32B-NVFP4](https://huggingface.co/AngelSlim/Qwen3-32B_nvfp4) and [Qwen3-235B-A22B-NVFP4](https://huggingface.co/AngelSlim/Qwen3-235B-A22B_nvfp4) weights.

 <details>
@@ -233,8 +233,6 @@ bash scripts/speculative/generate_data_for_target_model.sh
 bash scripts/speculative/train_eagle3_online.sh
 ```

-For detailed training configurations and vLLM performance benchmarks of `Eagle3`, please refer to the speculative sampling [Quick Start documentation](https://angelslim.readthedocs.io/zh-cn/latest/getting_started/quickstrat.html#id5)
-
 Training and deployment guide for multimodal Eagle3 models, supporting LLM / VLM / Audio (ASR & TTS) models: [LLM](https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/eagle/eagle.html) | [VLM](https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/eagle/vlm_eagle.html) | [Audio(ASR)](https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/eagle/audio_asr_eagle.html) | [Audio(TTS)](https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/eagle/audio_tts_eagle.html).
 #### 2.2 LLM/VLM Model Quantization
 After installing `AngelSlim`, you can get started quickly with the following script, which performs static `FP8` quantization of the `Qwen3-1.7B` model:
@@ -395,7 +393,7 @@ bash scripts/deploy/lm_eval.sh -d 0,1 -t 2 -g 0.8 -r $RESULT_PATH -b "auto" --ta
 <td>381.05</td><td>1</td>
 </tr>
 <tr>
-<td>Eagle3</td>
+<td><a href="https://huggingface.co/AngelSlim/Qwen3-1.7B_eagle3">Eagle3</a></td>
 <td>616.9</td><td>2.13</td>
 <td>653.29</td><td>2.19</td>
 <td>680.1</td><td>2.2</td>
@@ -413,7 +411,7 @@ bash scripts/deploy/lm_eval.sh -d 0,1 -t 2 -g 0.8 -r $RESULT_PATH -b "auto" --ta
 <td>233.26</td><td>1</td>
 </tr>
 <tr>
-<td>Eagle3</td>
+<td><a href="https://huggingface.co/AngelSlim/Qwen3-4B_eagle3">Eagle3</a></td>
 <td>389.35</td><td>2.07</td>
 <td>395.97</td><td>2.1</td>
 <td>377.84</td><td>2.08</td>
@@ -431,7 +429,7 @@ bash scripts/deploy/lm_eval.sh -d 0,1 -t 2 -g 0.8 -r $RESULT_PATH -b "auto" --ta
 <td>151.81</td><td>1</td>
 </tr>
 <tr>
-<td>Eagle3</td>
+<td><a href="https://huggingface.co/AngelSlim/Qwen3-8B_eagle3">Eagle3</a></td>
 <td>257.32</td><td>2</td>
 <td>266.69</td><td>2.02</td>
 <td>244.89</td><td>1.97</td>
@@ -449,7 +447,7 @@ bash scripts/deploy/lm_eval.sh -d 0,1 -t 2 -g 0.8 -r $RESULT_PATH -b "auto" --ta
 <td>93.26</td><td>1</td>
 </tr>
 <tr>
-<td>Eagle3</td>
+<td><a href="https://huggingface.co/AngelSlim/Qwen3-14B_eagle3">Eagle3</a></td>
 <td>153.72</td><td>1.87</td>
 <td>140.46</td><td>1.78</td>
 <td>144.68</td><td>1.76</td>
@@ -467,7 +465,7 @@ bash scripts/deploy/lm_eval.sh -d 0,1 -t 2 -g 0.8 -r $RESULT_PATH -b "auto" --ta
 <td>43.32</td><td>1</td>
 </tr>
 <tr>
-<td>Eagle3</td>
+<td><a href="https://huggingface.co/AngelSlim/Qwen3-32B_eagle3">Eagle3</a></td>
 <td>80.43</td><td>2.01</td>
 <td>72.49</td><td>1.9</td>
 <td>71.57</td><td>1.86</td>
@@ -485,7 +483,7 @@ bash scripts/deploy/lm_eval.sh -d 0,1 -t 2 -g 0.8 -r $RESULT_PATH -b "auto" --ta
 <td>320.87</td><td>1</td>
 </tr>
 <tr>
-<td>Eagle3</td>
+<td><a href="https://huggingface.co/AngelSlim/Qwen3-a3B_eagle3">Eagle3</a></td>
 <td>453.97</td><td>2.1</td>
 <td>432.45</td><td>2.04</td>
 <td>428.81</td><td>2.02</td>
@@ -557,7 +555,7 @@ bash scripts/deploy/lm_eval.sh -d 0,1 -t 2 -g 0.8 -r $RESULT_PATH -b "auto" --ta
 <td>1</td>
 </tr>
 <tr>
-<td>Eagle3</td>
+<td><a href="https://huggingface.co/AngelSlim/Qwen3-VL-2B-Instruct_eagle3">Eagle3</a></td>
 <td>511.52</td>
 <td>2.11</td>
 <td>560.55</td>
@@ -596,7 +594,7 @@ bash scripts/deploy/lm_eval.sh -d 0,1 -t 2 -g 0.8 -r $RESULT_PATH -b "auto" --ta
 <td>1</td>
 </tr>
 <tr>
-<td>Eagle3</td>
+<td><a href="https://huggingface.co/AngelSlim/Qwen3-VL-4B-Instruct_eagle3">Eagle3</a></td>
 <td>415.29</td>
 <td>2.57</td>
 <td>372.89</td>
@@ -635,7 +633,7 @@ bash scripts/deploy/lm_eval.sh -d 0,1 -t 2 -g 0.8 -r $RESULT_PATH -b "auto" --ta
 <td>1</td>
 </tr>
 <tr>
-<td>Eagle3</td>
+<td><a href="https://huggingface.co/AngelSlim/Qwen3-VL-30B-A3B-Instruct_eagle3">Eagle3</a></td>
 <td>281.93</td>
 <td>2.82</td>
 <td>241.42</td>
@@ -679,7 +677,7 @@ bash scripts/deploy/lm_eval.sh -d 0,1 -t 2 -g 0.8 -r $RESULT_PATH -b "auto" --ta
 <td>1</td>
 </tr>
 <tr>
-<td>Eagle3</td>
+<td><a href="https://huggingface.co/AngelSlim/HunyuanOCR_eagle3">Eagle3</a></td>
 <td>108.1</td>
 <td>2.08</td>
 </tr>
@@ -712,7 +710,7 @@ bash scripts/deploy/lm_eval.sh -d 0,1 -t 2 -g 0.8 -r $RESULT_PATH -b "auto" --ta
 <td>1</td>
 </tr>
 <tr>
-<td>Eagle3</td>
+<td><a href="https://huggingface.co/AngelSlim/Qwen2-Audio-7B-Instruct_eagle3">Eagle3</a></td>
 <td>146.66</td>
 <td>3.51</td>
 </tr>
@@ -742,7 +740,7 @@ bash scripts/deploy/lm_eval.sh -d 0,1 -t 2 -g 0.8 -r $RESULT_PATH -b "auto" --ta
 <td>1</td>
 </tr>
 <tr>
-<td>Eagle3</td>
+<td><a href="https://huggingface.co/AngelSlim/Fun-CosyVoice3-0.5B-2512_eagle3">Eagle3</a></td>
 <td>-</td>
 <td>1.96</td>
 </tr>
