Skip to content

Commit 33682c6

Browse files
Jiang-Jia-JunjiangjiajunCopilot
authored
[Docs] Update docs for release/2.5 (#7267)
* Update docs for release/2.5 * Update English docs for release/2.5 - Update README_EN.md: add v2.5 news entry, reformat v2.4 entry with release link - Update docs/get_started/installation/nvidia_gpu.md: - Docker image: 2.4.0 -> 2.5.0, notice now shows SM80/86/89/90 support - paddlepaddle-gpu: 3.3.0 -> 3.3.1, add CUDA 12.9 alternatives - fastdeploy-gpu: 2.4.0 -> 2.5.0, unified arch install with CUDA 12.9 option - Update docs/zh/get_started/installation/nvidia_gpu.md: - Fix remaining paddlepaddle-gpu==3.3.0 refs in sections 4&5 -> 3.3.1 Agent-Logs-Url: https://github.com/PaddlePaddle/FastDeploy/sessions/fa0be381-324e-4b0d-b7a6-e2c1fa12174f Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com> * Clarify --extra-index-url usage in installation docs Add note explaining that --extra-index-url is only for downloading fastdeploy-gpu dependencies; fastdeploy-gpu itself must be installed from the Paddle source specified by -i. Applied to both Chinese and English nvidia_gpu.md installation guides. Agent-Logs-Url: https://github.com/PaddlePaddle/FastDeploy/sessions/9fa8b3c9-7555-4eae-b9b9-026cddd7e74c Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com> * Update nvidia_gpu.md --------- Co-authored-by: jiang-jia-jun <jiangjiajun@baidu.com> Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
1 parent 85c6773 commit 33682c6

4 files changed

Lines changed: 52 additions & 46 deletions

File tree

README_CN.md

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -27,7 +27,9 @@
2727

2828
## 最新活动
2929

30-
**[2026-01] FastDeploy v2.4 全新发布!** 新增 DeepSeek V3 与 Qwen3-MoE 模型的 PD 分离部署,增强MTP 投机解码能力,全面优化多硬件平台上的 MoE 推理与多模态前缀缓存性能,升级全部内容参阅 [v2.4 ReleaseNote](https://github.com/PaddlePaddle/FastDeploy/releases/tag/v2.4.0)
30+
**[2026-03] FastDeploy v2.5 全新发布!** 新增Qwen3-VL与Qwen3-VL MoE模型部署支持,新增W4AFP8量化方法,增强强化学习训练支持能力,包含170+项Bug修复与性能优化,升级全部内容参阅 [v2.5 ReleaseNote](https://github.com/PaddlePaddle/FastDeploy/releases/tag/v2.5.0)
31+
32+
**[2026-01] FastDeploy v2.4**: 新增 DeepSeek V3 与 Qwen3-MoE 模型的 PD 分离部署,增强MTP 投机解码能力,全面优化多硬件平台上的 MoE 推理与多模态前缀缓存性能,升级全部内容参阅 [v2.4 ReleaseNote](https://github.com/PaddlePaddle/FastDeploy/releases/tag/v2.4.0)
3133

3234
**[2025-11] FastDeploy v2.3**: 新增[ERNIE-4.5-VL-28B-A3B-Thinking](docs/zh/get_started/ernie-4.5-vl-thinking.md)[PaddleOCR-VL-0.9B](docs/zh/best_practices/PaddleOCR-VL-0.9B.md)两大重磅模型在多硬件平台上的部署支持,进一步优化全方位推理性能,以及带来更多部署功能和易用性的提升,升级全部内容参阅[v2.3 ReleaseNote](https://github.com/PaddlePaddle/FastDeploy/releases/tag/v2.3.0)
3335

README_EN.md

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -27,7 +27,9 @@ English | [简体中文](README_CN.md)
2727

2828
## News
2929

30-
[2026-01] FastDeploy v2.4 is released! Featuring PD-separated deployment for DeepSeek V3 and Qwen3-MoE, enhanced MTP speculative decoding, and comprehensive performance boosts for MoE inference and multi-modal Prefix Caching across various hardware backends. See the full v2.4 ReleaseNote for more details.
30+
**[2026-03] FastDeploy v2.5 is released!** It adds deployment support for Qwen3-VL and Qwen3-VL MoE models, introduces the W4AFP8 quantization method, enhances reinforcement learning training capabilities, and includes 170+ bug fixes and performance optimizations. For all the upgrade details, refer to the [v2.5 Release Note](https://github.com/PaddlePaddle/FastDeploy/releases/tag/v2.5.0).
31+
32+
**[2026-01] FastDeploy v2.4**: Featuring PD-separated deployment for DeepSeek V3 and Qwen3-MoE, enhanced MTP speculative decoding, and comprehensive performance boosts for MoE inference and multi-modal Prefix Caching across various hardware backends. For all the upgrade details, refer to the [v2.4 Release Note](https://github.com/PaddlePaddle/FastDeploy/releases/tag/v2.4.0).
3133

3234
**[2025-11] FastDeploy v2.3**: It adds deployment support for two major models, [ERNIE-4.5-VL-28B-A3B-Thinking](docs/get_started/ernie-4.5-vl-thinking.md) and [PaddleOCR-VL-0.9B](docs/best_practices/PaddleOCR-VL-0.9B.md), across multiple hardware platforms. It further optimizes comprehensive inference performance and brings more deployment features and usability enhancements. For all the upgrade details, refer to the [v2.3 Release Note](https://github.com/PaddlePaddle/FastDeploy/releases/tag/v2.3.0).
3335

docs/get_started/installation/nvidia_gpu.md

Lines changed: 23 additions & 22 deletions
Original file line numberDiff line numberDiff line change
@@ -12,41 +12,44 @@ The following installation methods are available when your environment meets the
1212

1313
## 1. Pre-built Docker Installation (Recommended)
1414

15-
**Notice**: The pre-built image only supports SM80/90 GPU(e.g. H800/A800),if you are deploying on SM86/89GPU(L40/4090/L20), please reinstall ```fastdeploy-gpu``` after you create the container.
15+
**Notice**: The pre-built image supports SM 80/86/89/90 architecture GPUs (e.g. A800/H800/L20/L40/4090).
1616

1717
```shell
18-
docker pull ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/fastdeploy-cuda-12.6:2.4.0
18+
docker pull ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/fastdeploy-cuda-12.6:2.5.0
1919
```
2020

2121
## 2. Pre-built Pip Installation
2222

2323
First install paddlepaddle-gpu. For detailed instructions, refer to [PaddlePaddle Installation](https://www.paddlepaddle.org.cn/en/install/quick?docurl=/documentation/docs/en/develop/install/pip/linux-pip_en.html)
2424
```shell
2525
# Install stable release
26-
python -m pip install paddlepaddle-gpu==3.3.0 -i https://www.paddlepaddle.org.cn/packages/stable/cu126/
26+
# CUDA 12.6
27+
python -m pip install paddlepaddle-gpu==3.3.1 -i https://www.paddlepaddle.org.cn/packages/stable/cu126/
28+
# CUDA 12.9
29+
python -m pip install paddlepaddle-gpu==3.3.1 -i https://www.paddlepaddle.org.cn/packages/stable/cu129/
2730

2831
# Install latest Nightly build
32+
# CUDA 12.6
2933
python -m pip install --pre paddlepaddle-gpu -i https://www.paddlepaddle.org.cn/packages/nightly/cu126/
34+
# CUDA 12.9
35+
python -m pip install --pre paddlepaddle-gpu -i https://www.paddlepaddle.org.cn/packages/nightly/cu129/
3036
```
3137

32-
Then install fastdeploy. **Do not install from PyPI**. Use the following methods instead:
38+
Then install fastdeploy. **Do not install from PyPI**. Use the following methods instead (supports SM80/86/89/90 GPU architectures).
3339

34-
For SM80/90 architecture GPUs(e.g A30/A100/H100/):
40+
**Note**: Stable FastDeploy release pairs with stable PaddlePaddle; Nightly Build FastDeploy pairs with Nightly Build PaddlePaddle. The `--extra-index-url` is only used for downloading fastdeploy-gpu's dependencies; fastdeploy-gpu itself must be installed from the Paddle source specified by `-i`.
3541
```
36-
# Install stable release
37-
python -m pip install fastdeploy-gpu==2.4.0 -i https://www.paddlepaddle.org.cn/packages/stable/fastdeploy-gpu-80_90/ --extra-index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
38-
39-
# Install latest Nightly build
40-
python -m pip install fastdeploy-gpu -i https://www.paddlepaddle.org.cn/packages/nightly/cu126/ --extra-index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
41-
```
42-
43-
For SM86/89 architecture GPUs(e.g A10/4090/L20/L40):
44-
```
45-
# Install stable release
46-
python -m pip install fastdeploy-gpu==2.4.0 -i https://www.paddlepaddle.org.cn/packages/stable/fastdeploy-gpu-86_89/ --extra-index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
47-
48-
# Install latest Nightly build
42+
# Install stable release FastDeploy
43+
# CUDA 12.6
44+
python -m pip install fastdeploy-gpu==2.5.0 -i https://www.paddlepaddle.org.cn/packages/stable/cu126/ --extra-index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
45+
# CUDA 12.9
46+
python -m pip install fastdeploy-gpu==2.5.0 -i https://www.paddlepaddle.org.cn/packages/stable/cu129/ --extra-index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
47+
48+
# Install Nightly Build FastDeploy
49+
# CUDA 12.6
4950
python -m pip install fastdeploy-gpu -i https://www.paddlepaddle.org.cn/packages/nightly/cu126/ --extra-index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
51+
# CUDA 12.9
52+
python -m pip install fastdeploy-gpu -i https://www.paddlepaddle.org.cn/packages/nightly/cu129/ --extra-index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
5053
```
5154

5255
## 3. Build from Source Using Docker
@@ -64,10 +67,8 @@ docker build -f dockerfiles/Dockerfile.gpu -t fastdeploy:gpu .
6467

6568
First install paddlepaddle-gpu. For detailed instructions, refer to [PaddlePaddle Installation](https://www.paddlepaddle.org.cn/en/install/quick?docurl=/documentation/docs/en/develop/install/pip/linux-pip_en.html)
6669
```shell
67-
python -m pip install paddlepaddle-gpu==3.3.0 -i https://www.paddlepaddle.org.cn/packages/stable/cu126/
70+
python -m pip install paddlepaddle-gpu==3.3.1 -i https://www.paddlepaddle.org.cn/packages/stable/cu126/
6871
```
69-
70-
Then clone the source code and build:
7172
```shell
7273
git clone https://github.com/PaddlePaddle/FastDeploy
7374
cd FastDeploy
@@ -92,7 +93,7 @@ First, install paddlepaddle-gpu.
9293
For detailed instructions, please refer to the [PaddlePaddle Installation Guide](https://www.paddlepaddle.org.cn/).
9394

9495
```shell
95-
python -m pip install paddlepaddle-gpu==3.3.0 -i https://www.paddlepaddle.org.cn/packages/stable/cu126/
96+
python -m pip install paddlepaddle-gpu==3.3.1 -i https://www.paddlepaddle.org.cn/packages/stable/cu126/
9697
```
9798

9899
Then, clone the FastDeploy repository and build using the precompiled operator wheels:

docs/zh/get_started/installation/nvidia_gpu.md

Lines changed: 23 additions & 22 deletions
Original file line numberDiff line numberDiff line change
@@ -14,10 +14,10 @@
1414

1515
## 1. 预编译Docker安装(推荐)
1616

17-
**注意**如下镜像仅支持SM 80/90架构GPU(A800/H800等),如果你是在L20/L40/4090等SM 86/89架构的GPU上部署,请在创建容器后,卸载```fastdeploy-gpu```再重新安装如下文档指定支持86/89架构的`fastdeploy-gpu`
17+
**注意**预编译镜像支持 80/86/89/90 架构的GPU硬件 (如 A800/H800/L20/L40/4090)
1818

1919
``` shell
20-
docker pull ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/fastdeploy-cuda-12.6:2.4.0
20+
docker pull ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/fastdeploy-cuda-12.6:2.5.0
2121
```
2222

2323
## 2. 预编译Pip安装
@@ -26,32 +26,33 @@ docker pull ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/fastdeploy-cuda-12
2626

2727
``` shell
2828
# Install stable release
29-
python -m pip install paddlepaddle-gpu==3.3.0 -i https://www.paddlepaddle.org.cn/packages/stable/cu126/
29+
# CUDA 12.6
30+
python -m pip install paddlepaddle-gpu==3.3.1 -i https://www.paddlepaddle.org.cn/packages/stable/cu126/
31+
# CUDA 12.9
32+
python -m pip install paddlepaddle-gpu==3.3.1 -i https://www.paddlepaddle.org.cn/packages/stable/cu129/
3033

3134
# Install latest Nightly build
35+
# CUDA 12.6
3236
python -m pip install --pre paddlepaddle-gpu -i https://www.paddlepaddle.org.cn/packages/nightly/cu126/
37+
# CUDA 12.9
38+
python -m pip install --pre paddlepaddle-gpu -i https://www.paddlepaddle.org.cn/packages/nightly/cu129/
3339
```
3440

35-
再安装 fastdeploy,**注意不要通过pypi源安装**,需要通过如下方式安装
41+
再安装 fastdeploy,**注意不要通过pypi源安装**,需要通过如下方式安装(目前支持80/86/89/90四个架构GPU)
3642

37-
如你的 GPU 是 SM80/90 架构(A100/H100等),按如下方式安装
38-
39-
```
40-
# 安装稳定版本fastdeploy
41-
python -m pip install fastdeploy-gpu==2.4.0 -i https://www.paddlepaddle.org.cn/packages/stable/fastdeploy-gpu-80_90/ --extra-index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
42-
43-
# 安装Nightly Build的最新版本fastdeploy
44-
python -m pip install fastdeploy-gpu -i https://www.paddlepaddle.org.cn/packages/nightly/cu126/ --extra-index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
43+
**注意**: 稳定版本的FastDeploy搭配稳定版本的PaddlePaddle; 而Nightly Build的FastDeploy则对应Nightly Build的PaddlePaddle。其中 `--extra-index-url` 仅用于安装 fastdeploy-gpu 所需的依赖包,fastdeploy-gpu 本身必须从 `-i` 指定的 Paddle 源安装。
4544
```
46-
47-
如你的 GPU 是 SM86/89 架构(4090/L20/L40等),按如下方式安装
48-
49-
```
50-
# 安装稳定版本fastdeploy
51-
python -m pip install fastdeploy-gpu==2.4.0 -i https://www.paddlepaddle.org.cn/packages/stable/fastdeploy-gpu-86_89/ --extra-index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
52-
53-
# 安装Nightly Build的最新版本fastdeploy
45+
# 安装稳定版本FastDeploy
46+
# CUDA 12.6
47+
python -m pip install fastdeploy-gpu==2.5.0 -i https://www.paddlepaddle.org.cn/packages/stable/cu126/ --extra-index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
48+
# CUDA 12.9
49+
python -m pip install fastdeploy-gpu==2.5.0 -i https://www.paddlepaddle.org.cn/packages/stable/cu129/ --extra-index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
50+
51+
# 安装Nightly Build版本FastDeploy
52+
# CUDA 12.6
5453
python -m pip install fastdeploy-gpu -i https://www.paddlepaddle.org.cn/packages/nightly/cu126/ --extra-index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
54+
# CUDA 12.9
55+
python -m pip install fastdeploy-gpu -i https://www.paddlepaddle.org.cn/packages/nightly/cu129/ --extra-index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
5556
```
5657

5758
## 3. 镜像自行构建
@@ -70,7 +71,7 @@ docker build -f dockerfiles/Dockerfile.gpu -t fastdeploy:gpu .
7071
首先安装 paddlepaddle-gpu,详细安装方式参考 [PaddlePaddle安装](https://www.paddlepaddle.org.cn/)
7172

7273
``` shell
73-
python -m pip install paddlepaddle-gpu==3.3.0 -i https://www.paddlepaddle.org.cn/packages/stable/cu126/
74+
python -m pip install paddlepaddle-gpu==3.3.1 -i https://www.paddlepaddle.org.cn/packages/stable/cu126/
7475
```
7576

7677
接着克隆源代码,编译安装
@@ -98,7 +99,7 @@ FastDeploy 提供了 GPU 算子预编译版 Wheel 包,可在无需完整源码
9899
首先安装 paddlepaddle-gpu,详细安装方式参考 [PaddlePaddle安装](https://www.paddlepaddle.org.cn/)
99100

100101
``` shell
101-
python -m pip install paddlepaddle-gpu==3.3.0 -i https://www.paddlepaddle.org.cn/packages/stable/cu126/
102+
python -m pip install paddlepaddle-gpu==3.3.1 -i https://www.paddlepaddle.org.cn/packages/stable/cu126/
102103
```
103104

104105
接着克隆源代码,拉取 whl 包并安装

0 commit comments

Comments
 (0)