Commit 823bdfb

irisliu10 and root authored
fix bug in readme & update index.md of do (#202)
Co-authored-by: root <root@TENCENT64.site>
1 parent 6966570 commit 823bdfb

4 files changed

Lines changed: 94 additions & 94 deletions

README.md

Lines changed: 46 additions & 46 deletions
@@ -77,7 +77,7 @@ A more accessible, comprehensive, and efficient toolkit for large model compress
 </td>
 <td>
 <ul style="padding-left: 0; list-style-position: inside;">
-<li><a href="https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/eagle.html">Eagle3</a></li>
+<li><a href="https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/eagle/index.html">Eagle3</a></li>
 <li><a href="https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/spec_exit.html">SpecExit</a></li>
 </ul>
 </td>
@@ -111,7 +111,7 @@ A more accessible, comprehensive, and efficient toolkit for large model compress
 </td>
 <td>
 <ul style="padding-left: 0; list-style-position: inside;">
-<li><a href="https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/eagle.html">Eagle3</a></li>
+<li><a href="https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/eagle/index.html">Eagle3</a></li>
 </ul>
 </td>
 <td>
@@ -181,7 +181,7 @@ A more accessible, comprehensive, and efficient toolkit for large model compress
 </td>
 <td>
 <ul style="padding-left: 0; list-style-position: inside;">
-<li><a href="https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/eagle.html">Eagle3</a></li>
+<li><a href="https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/eagle/index.html">Eagle3</a></li>
 </ul>
 </td>
 <td>
@@ -233,7 +233,7 @@ bash scripts/speculative/train_eagle3_online.sh

 For detailed training configurations and vLLM performance benchmarks of Eagle3, please refer to the [Quick Start Guide for Speculative Sampling](https://angelslim.readthedocs.io/zh-cn/latest/getting_started/quickstrat.html#id5).

-Training and Deployment Guide for Multimodal Model Eagle3—Supporting LLM, VLM, and Audio (ASR & TTS) Models: [LLM](https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/eagle/eagle.html) | [VLM](https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/eagle/vlm_eagle.html) | [Audio(ASR)](https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/eagle/audio_eagle.html) | [Audio(TTS)](https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/eagle/audio_tts_eagle.html).
+Training and Deployment Guide for Multimodal Model Eagle3—Supporting LLM, VLM, and Audio (ASR & TTS) Models: [LLM](https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/eagle/eagle.html) | [VLM](https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/eagle/vlm_eagle.html) | [Audio(ASR)](https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/eagle/audio_asr_eagle.html) | [Audio(TTS)](https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/eagle/audio_tts_eagle.html).

 #### 2.2 LLM/VLM Model Quantization

@@ -502,37 +502,36 @@ Benchmark results for Qwen3-VL series models using Eagle3 speculative decoding o
 <tr>
 <th>Model</th>
 <th>Method</th>
-<th colspan="2">GSM8K</th>
-<th colspan="2">Alpaca</th>
-<th colspan="2">HumanEval</th>
-<th colspan="2">MT-bench</th>
-<th colspan="2">MATH-500</th>
-<th colspan="2">MMMU</th>
-<th colspan="2">MMStar</th>
-<th>Mean</th>
-<th></th>
-</tr></thead>
-<tbody>
+<th colspan="2" style="text-align:center;">GSM8K</th>
+<th colspan="2" style="text-align:center;">Alpaca</th>
+<th colspan="2" style="text-align:center;">HumanEval</th>
+<th colspan="2" style="text-align:center;">MT-bench</th>
+<th colspan="2" style="text-align:center;">MATH-500</th>
+<th colspan="2" style="text-align:center;">MMMU</th>
+<th colspan="2" style="text-align:center;">MMStar</th>
+<th colspan="2" style="text-align:center;">Mean</th>
 <tr>
 <td></td>
 <td></td>
-<td>throughput (tokens/s)</td>
-<td>accept length</td>
-<td>throughput (tokens/s)</td>
-<td>accept length</td>
-<td>throughput (tokens/s)</td>
-<td>accept length</td>
-<td>throughput (tokens/s)</td>
-<td>accept length</td>
-<td>throughput (tokens/s)</td>
-<td>accept length</td>
-<td>throughput (tokens/s)</td>
-<td>accept length</td>
-<td>throughput (tokens/s)</td>
-<td>accept length</td>
-<td>throughput (tokens/s)</td>
-<td>accept length</td>
+<th>throughput (tokens/s)</th>
+<th>accept length</th>
+<th>throughput (tokens/s)</th>
+<th>accept length</th>
+<th>throughput (tokens/s)</th>
+<th>accept length</th>
+<th>throughput (tokens/s)</th>
+<th>accept length</th>
+<th>throughput (tokens/s)</th>
+<th>accept length</th>
+<th>throughput (tokens/s)</th>
+<th>accept length</th>
+<th>throughput (tokens/s)</th>
+<th>accept length</th>
+<th>throughput (tokens/s)</th>
+<th>accept length</th>
 </tr>
+</tr></thead>
+<tbody>
 <tr>
 <td rowspan="2">Qwen3-VL-2B-Instruct</td>
 <td>Vanilla</td>
@@ -655,19 +654,20 @@ Benchmark results for Qwen3-VL series models using Eagle3 speculative decoding o
 ##### 1.2.2 HunyuanOCR Model

 Benchmark results for HunyuanOCR using Eagle3 speculative decoding on vLLM (v0.13.0) across **[OmniDocBench](https://huggingface.co/datasets/opendatalab/OmniDocBench)** dataset, using a single NVIDIA H20 GPU (**tp=1, ep=1, num_speculative_tokens=4, batch_size=1, output_len=1024**).
+
 <table><thead>
 <tr>
 <th>Model</th>
 <th>Method</th>
-<th colspan="2">OmniDocBench</th>
-</tr></thead>
-<tbody>
+<th colspan="2" style="text-align:center;">OmniDocBench</th>
 <tr>
 <td></td>
 <td></td>
-<td>throughput (tokens/s)</td>
-<td>accept length</td>
+<th>throughput (tokens/s)</th>
+<th>accept length</th>
 </tr>
+</tr></thead>
+<tbody>
 <tr>
 <td rowspan="2">Hunyuan-OCR</td>
 <td>Vanilla</td>
@@ -692,15 +692,15 @@ Benchmark results for Qwen2-Audio using Eagle3 speculative decoding on vLLM (v0.
 <tr>
 <th>Model</th>
 <th>Method</th>
-<th colspan="2">LibriSpeech</th>
-</tr></thead>
-<tbody>
+<th colspan="2" style="text-align:center;">LibriSpeech</th>
 <tr>
 <td></td>
 <td></td>
-<td>throughput (tokens/s)</td>
-<td>accept length</td>
+<th>throughput (tokens/s)</th>
+<th>accept length</th>
 </tr>
+</tr></thead>
+<tbody>
 <tr>
 <td rowspan="2">Qwen2-Audio</td>
 <td>Vanilla</td>
@@ -723,15 +723,15 @@ Benchmark results for Fun-CosyVoice3 using Eagle3 speculative decoding across **
 <tr>
 <th>Model</th>
 <th>Method</th>
-<th colspan="2">LibriTTS</th>
-</tr></thead>
-<tbody>
+<th colspan="2" style="text-align:center;">LibriTTS</th>
 <tr>
 <td></td>
 <td></td>
-<td>throughput (tokens/s)</td>
-<td>accept length</td>
+<th>throughput (tokens/s)</th>
+<th>accept length</th>
 </tr>
+</tr></thead>
+<tbody>
 <tr>
 <td rowspan="2">Fun-CosyVoice3</td>
 <td>Vanilla</td>

README_cn.md

Lines changed: 45 additions & 46 deletions
@@ -78,7 +78,7 @@
 </td>
 <td>
 <ul style="padding-left: 0; list-style-position: inside;">
-<li><a href="https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/eagle.html">Eagle3</a></li>
+<li><a href="https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/eagle/index.html">Eagle3</a></li>
 <li><a href="https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/spec_exit.html">SpecExit</a></li>
 </ul>
 </td>
@@ -112,7 +112,7 @@
 </td>
 <td>
 <ul style="padding-left: 0; list-style-position: inside;">
-<li><a href="https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/eagle.html">Eagle3</a></li>
+<li><a href="https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/eagle/index.html">Eagle3</a></li>
 </ul>
 </td>
 <td>
@@ -182,7 +182,7 @@
 </td>
 <td>
 <ul style="padding-left: 0; list-style-position: inside;">
-<li><a href="https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/eagle.html">Eagle3</a></li>
+<li><a href="https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/eagle/index.html">Eagle3</a></li>
 </ul>
 </td>
 <td>
@@ -234,7 +234,7 @@ bash scripts/speculative/train_eagle3_online.sh

 详细训练配置,以及`Eagle3`的vLLM性能测试,详情请参考投机采样[快速开始文档](https://angelslim.readthedocs.io/zh-cn/latest/getting_started/quickstrat.html#id5)

-多模态模型 Eagle3 训练与部署指南,支持LLM / VLM / Audio (ASR & TTS) 模型:[LLM](https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/eagle/eagle.html) | [VLM](https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/eagle/vlm_eagle.html) | [Audio(ASR)](https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/eagle/audio_eagle.html) | [Audio(TTS)](https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/eagle/audio_tts_eagle.html).
+多模态模型 Eagle3 训练与部署指南,支持LLM / VLM / Audio (ASR & TTS) 模型:[LLM](https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/eagle/eagle.html) | [VLM](https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/eagle/vlm_eagle.html) | [Audio(ASR)](https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/eagle/audio_asr_eagle.html) | [Audio(TTS)](https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/eagle/audio_tts_eagle.html).
 #### 2.2 LLM/VLM模型量化
 完成安装`AngelSlim`后,您可以通过以下脚本快速开始,完成`Qwen3-1.7B`模型的静态`FP8`量化:

@@ -505,37 +505,36 @@ bash scripts/deploy/lm_eval.sh -d 0,1 -t 2 -g 0.8 -r $RESULT_PATH -b "auto" --ta
 <tr>
 <th>Model</th>
 <th>Method</th>
-<th colspan="2">GSM8K</th>
-<th colspan="2">Alpaca</th>
-<th colspan="2">HumanEval</th>
-<th colspan="2">MT-bench</th>
-<th colspan="2">MATH-500</th>
-<th colspan="2">MMMU</th>
-<th colspan="2">MMStar</th>
-<th>Mean</th>
-<th></th>
-</tr></thead>
-<tbody>
+<th colspan="2" style="text-align:center;">GSM8K</th>
+<th colspan="2" style="text-align:center;">Alpaca</th>
+<th colspan="2" style="text-align:center;">HumanEval</th>
+<th colspan="2" style="text-align:center;">MT-bench</th>
+<th colspan="2" style="text-align:center;">MATH-500</th>
+<th colspan="2" style="text-align:center;">MMMU</th>
+<th colspan="2" style="text-align:center;">MMStar</th>
+<th colspan="2" style="text-align:center;">Mean</th>
 <tr>
 <td></td>
 <td></td>
-<td>throughput (tokens/s)</td>
-<td>accept length</td>
-<td>throughput (tokens/s)</td>
-<td>accept length</td>
-<td>throughput (tokens/s)</td>
-<td>accept length</td>
-<td>throughput (tokens/s)</td>
-<td>accept length</td>
-<td>throughput (tokens/s)</td>
-<td>accept length</td>
-<td>throughput (tokens/s)</td>
-<td>accept length</td>
-<td>throughput (tokens/s)</td>
-<td>accept length</td>
-<td>throughput (tokens/s)</td>
-<td>accept length</td>
+<th>throughput (tokens/s)</th>
+<th>accept length</th>
+<th>throughput (tokens/s)</th>
+<th>accept length</th>
+<th>throughput (tokens/s)</th>
+<th>accept length</th>
+<th>throughput (tokens/s)</th>
+<th>accept length</th>
+<th>throughput (tokens/s)</th>
+<th>accept length</th>
+<th>throughput (tokens/s)</th>
+<th>accept length</th>
+<th>throughput (tokens/s)</th>
+<th>accept length</th>
+<th>throughput (tokens/s)</th>
+<th>accept length</th>
 </tr>
+</tr></thead>
+<tbody>
 <tr>
 <td rowspan="2">Qwen3-VL-2B-Instruct</td>
 <td>Vanilla</td>
@@ -663,15 +662,15 @@ bash scripts/deploy/lm_eval.sh -d 0,1 -t 2 -g 0.8 -r $RESULT_PATH -b "auto" --ta
 <tr>
 <th>Model</th>
 <th>Method</th>
-<th colspan="2">OmniDocBench</th>
-</tr></thead>
-<tbody>
+<th colspan="2" style="text-align:center;">OmniDocBench</th>
 <tr>
 <td></td>
 <td></td>
-<td>throughput (tokens/s)</td>
-<td>accept length</td>
+<th>throughput (tokens/s)</th>
+<th>accept length</th>
 </tr>
+</tr></thead>
+<tbody>
 <tr>
 <td rowspan="2">Hunyuan-OCR</td>
 <td>Vanilla</td>
@@ -696,15 +695,15 @@ bash scripts/deploy/lm_eval.sh -d 0,1 -t 2 -g 0.8 -r $RESULT_PATH -b "auto" --ta
 <tr>
 <th>Model</th>
 <th>Method</th>
-<th colspan="2">LibriSpeech</th>
-</tr></thead>
-<tbody>
+<th colspan="2" style="text-align:center;">LibriSpeech</th>
 <tr>
 <td></td>
 <td></td>
-<td>throughput (tokens/s)</td>
-<td>accept length</td>
+<th>throughput (tokens/s)</th>
+<th>accept length</th>
 </tr>
+</tr></thead>
+<tbody>
 <tr>
 <td rowspan="2">Qwen2-Audio</td>
 <td>Vanilla</td>
@@ -726,15 +725,15 @@ bash scripts/deploy/lm_eval.sh -d 0,1 -t 2 -g 0.8 -r $RESULT_PATH -b "auto" --ta
 <tr>
 <th>Model</th>
 <th>Method</th>
-<th colspan="2">LibriTTS</th>
-</tr></thead>
-<tbody>
+<th colspan="2" style="text-align:center;">LibriTTS</th>
 <tr>
 <td></td>
 <td></td>
-<td>throughput (tokens/s)</td>
-<td>accept length</td>
+<th>throughput (tokens/s)</th>
+<th>accept length</th>
 </tr>
+</tr></thead>
+<tbody>
 <tr>
 <td rowspan="2">Fun-CosyVoice3</td>
 <td>Vanilla</td>

docs/source/index.md

Lines changed: 2 additions & 1 deletion
@@ -99,9 +99,10 @@ AngelSlim是腾讯自研的,致力于打造更易用、更全面和更高效
 * - **语音(TTS/ASR)**
 - - Qwen3-Omni
 - Qwen2-Audio
+- Fun-CosyVoice3
 - - FP8-Static/Dynamic
 - INT8-Dynamic
-- - 建设中
+- - Eagle3
 - - **Token剪枝**

 - 建设中

docs/source/performance/speculative_decoding/benchmarks.md

Lines changed: 1 addition & 1 deletion
@@ -23,7 +23,7 @@
 ### 2. VLM Models
 #### 2.1 Qwen3-VL Series Models

-| Model | Method | **GSM8K** | | **Alpaca** | | **HumanEval** | | **MT-bench** | | **MATH-500** | | **MMMU** | | **MMStar** | | **Mean** | |
+| Model | Method | GSM8K | | Alpaca | | HumanEval | | MT-bench | | MATH-500 | | MMMU | | MMStar | | Mean | |
 |-------------------------------|---------|---------------------------|-------------------|---------------------------|-------------------|---------------------------|-------------------|---------------------------|-------------------|---------------------------|-------------------|---------------------------|-------------------|---------------------------|-------------------|---------------------------|-------------------|
 | | | **throughput (tokens/s)** | **accept length** | **throughput (tokens/s)** | **accept length** | **throughput (tokens/s)** | **accept length** | **throughput (tokens/s)** | **accept length** | **throughput (tokens/s)** | **accept length** | **throughput (tokens/s)** | **accept length** | **throughput (tokens/s)** | **accept length** | **throughput (tokens/s)** | **accept length** |
 | **Qwen3-VL-2B-Instruct** | Vanilla | 348.55 | 1 | 350.9 | 1 | 346.07 | 1 | 346.31 | 1 | 82.96 | 1 | 83.27 | 1 | 81.63 | 1 | 234.24 | 1 |
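
Reviewer note: the `Mean` column in the context row above looks like the plain arithmetic mean of the seven per-benchmark throughputs. That is an assumption, since the diff does not say how it was computed. A minimal Python sketch that checks it against the Vanilla row for Qwen3-VL-2B-Instruct follows; the helper name `mean_throughput` is made up for illustration and is not part of AngelSlim.

```python
# Vanilla throughput (tokens/s) for Qwen3-VL-2B-Instruct, copied from the
# context row of the benchmarks.md hunk above.
vanilla_throughput = {
    "GSM8K": 348.55,
    "Alpaca": 350.9,
    "HumanEval": 346.07,
    "MT-bench": 346.31,
    "MATH-500": 82.96,
    "MMMU": 83.27,
    "MMStar": 81.63,
}

# Value shown in the table's Mean throughput column for the same row.
reported_mean = 234.24


def mean_throughput(per_benchmark):
    """Arithmetic mean over benchmarks (assumed definition of the Mean column)."""
    return sum(per_benchmark.values()) / len(per_benchmark)


computed = mean_throughput(vanilla_throughput)
print(f"computed mean: {computed:.2f} tokens/s, reported: {reported_mean}")

# The reported value has two decimals, so allow half a unit in the last place.
assert abs(computed - reported_mean) < 0.005
```

The same check can be repeated on the Eagle3 rows once they are in view; a mismatch there would suggest the column uses a different aggregation.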
