Commit 9701d03

Merge pull request #454 from Charles2530/feat/wan2.2-t2v
feat: add wan2.2_t2v model and quantization config
2 parents 261b785 + e3e1243 commit 9701d03

File tree

17 files changed: +1331 −43 lines changed

.gitignore

Lines changed: 3 additions & 0 deletions
```diff
@@ -22,5 +22,8 @@ save*
 .log
 *.pid
 *.ipynb*
+model/
+output_*
+datasets/
 .venv/
 *.sh
```
Lines changed: 54 additions & 0 deletions
```yaml
base:
  seed: &seed 42
model:
  type: Wan2T2V
  path: /path/to/wan_t2v
  torch_dtype: auto
  use_cpu_to_save_cuda_mem_for_catcher: True
calib:
  name: t2v
  download: False
  path: ./assets/wan_t2v/calib/
  sample_steps: 20
  bs: 1
  target_height: 480
  target_width: 832
  num_frames: 81
  guidance_scale: 4.0
  guidance_scale_2: 3.0
  seed: *seed
eval:
  eval_pos: []
  type: video_gen
  name: t2v
  download: False
  path: ./assets/wan_t2v/calib/
  bs: 1
  target_height: 480
  target_width: 832
  num_frames: 81
  guidance_scale: 4.0
  guidance_scale_2: 3.0
  output_video_path: ./output_videos_awq/
quant:
  video_gen:
    method: Awq
    weight:
      quant_type: int-quant
      bit: 4
      symmetric: True
      granularity: per_channel
      group_size: -1
    act:
      quant_type: int-quant
      bit: 4
      symmetric: True
      granularity: per_token
    special:
      trans: True
      trans_version: v2
      weight_clip: True
      clip_sym: True
save:
  save_lightx2v: True
  save_path: /path/to/x2v/
```
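In the weight block above, `granularity: per_channel` with `group_size: -1` typically means one quantization scale per output channel, while a positive `group_size` would add one scale per group along the input dimension. A plain-Python sketch of the scale counting (illustrative only; `num_weight_scales` is an invented helper, not llmc API):

```python
def num_weight_scales(out_ch, in_ch, granularity, group_size=-1):
    """Count quantization scales for a weight of shape [out_ch, in_ch].

    Illustrative only -- mirrors common int-quant conventions,
    not llmc's actual implementation.
    """
    if granularity == "per_tensor":
        return 1
    if granularity == "per_channel":
        if group_size == -1:  # one scale per output channel
            return out_ch
        # grouped: each output channel split into groups along the input dim
        return out_ch * (in_ch // group_size)
    raise ValueError(f"unknown granularity: {granularity}")

# The config above (per_channel, group_size: -1) on a 5120x5120 layer:
print(num_weight_scales(5120, 5120, "per_channel", -1))   # 5120
print(num_weight_scales(5120, 5120, "per_channel", 128))  # 5120 * 40 = 204800
```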

configs/quantization/video_gen/wan_i2v/awq_w_a.yaml

Lines changed: 1 addition & 1 deletion
```diff
@@ -46,4 +46,4 @@ quant:
       clip_sym: True
 save:
   save_lightx2v: True
-  save_path: /path/to/x2v/
+  save_path: /path/to/x2v/
```

(The removed and added lines render identically here; presumably a whitespace-only change.)

configs/quantization/video_gen/wan_t2v/awq_w_a.yaml

Lines changed: 1 addition & 1 deletion
```diff
@@ -46,4 +46,4 @@ quant:
       clip_sym: True
 save:
   save_lightx2v: True
-  save_path: /path/to/x2v/
+  save_path: /path/to/x2v/
```

configs/quantization/video_gen/wan_t2v/rtn_w_a.yaml

File mode changed: 100755 → 100644

Lines changed: 1 addition & 1 deletion

```diff
@@ -29,4 +29,4 @@ quant:
       granularity: per_token
 save:
   save_lightx2v: True
-  save_path: /path/to/x2v/
+  save_path: /path/to/x2v/
```

configs/quantization/video_gen/wan_t2v/smoothquant_w_a.yaml

Lines changed: 1 addition & 1 deletion
```diff
@@ -42,4 +42,4 @@ quant:
       alpha: 0.7
 save:
   save_lightx2v: True
-  save_path: /path/to/x2v/
+  save_path: /path/to/x2v/
```

docs/wan2.1_quantization_guide.md

Lines changed: 288 additions & 0 deletions
# Wan2.1 Video Generation Model Quantization Guide

## Overview

The llmc framework now fully supports quantization of the Wan2.1 family of video generation models, with export of truly quantized INT8/FP8 weights compatible with the lightx2v inference framework.

## Supported Model Types

- **WanI2V**: Image-to-Video
- **WanT2V**: Text-to-Video

## Supported Quantization Methods

### FP8 Quantization (Recommended)

**Config file**: `configs/quantization/video_gen/wan_i2v/smoothquant_w_a_fp8.yaml`

**Highlights**:
- Uses the E4M3 FP8 format (8-bit float: 4 exponent bits, 3 mantissa bits)
- SmoothQuant algorithm balances quantization difficulty between weights and activations
- Well suited to GPU inference, with little quality loss

**Quantization config**:
```yaml
quant:
  video_gen:
    method: SmoothQuant
    weight:
      quant_type: float-quant
      bit: e4m3  # FP8 E4M3 format
      symmetric: True
      granularity: per_channel
      use_qtorch: True
    act:
      quant_type: float-quant
      bit: e4m3  # FP8 E4M3 format
      symmetric: True
      granularity: per_token
      use_qtorch: True
    special:
      alpha: 0.75  # SmoothQuant balancing parameter
```
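The `e4m3` entries above select an 8-bit float with 1 sign, 4 exponent, and 3 mantissa bits. In the deep-learning flavor of E4M3 the all-ones exponent still encodes numbers (only the all-ones mantissa pattern is reserved for NaN), which gives the format its 448 maximum. A quick derivation in Python:

```python
# E4M3 dynamic range from the bit layout: exponent bias 7, max exponent field 15.
# The largest finite value uses mantissa 110 (111 is reserved for NaN).
bias = 7
max_finite = (1 + 6 / 8) * 2 ** (15 - bias)  # 1.75 * 2^8
min_normal = 2 ** (1 - bias)                 # 2^-6
min_subnormal = 2 ** (1 - bias) * 2 ** -3    # smallest subnormal step

print(max_finite, min_normal, min_subnormal)  # 448.0 0.015625 0.001953125
```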
### INT8 Quantization

#### 1. RTN (Round-to-Nearest)
**Config file**: `configs/quantization/video_gen/wan_i2v/rtn_w_a.yaml`

**Highlights**:
- The simplest quantization method
- Rounds each value directly to the nearest quantization level
- Fast, at a small cost in accuracy

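The rounding step that gives RTN its name can be sketched in a few lines of plain Python (a per-tensor symmetric int8 sketch for illustration, not llmc's actual code path):

```python
def rtn_int8(values):
    """Symmetric round-to-nearest int8: one scale = max|x| / 127."""
    scale = max(abs(v) for v in values) / 127 or 1.0  # guard against all-zero input
    q = [max(-128, min(127, round(v / scale))) for v in values]
    dequant = [qi * scale for qi in q]
    return q, dequant, scale

q, dq, s = rtn_int8([50.4, -127.0, 63.5])
print(q)  # [50, -127, 64] (63.5 rounds half-to-even)
print(s)  # 1.0
```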
#### 2. AWQ (Activation-aware Weight Quantization)
**Config file**: `configs/quantization/video_gen/wan_i2v/awq_w_a.yaml`

**Highlights**:
- Optimizes weight quantization based on the activation distribution
- Protects the most important channels, reducing accuracy loss
- Requires calibration data

#### 3. SmoothQuant
**Config file**: `configs/quantization/video_gen/wan_i2v/smoothquant_w_a.yaml`

**Highlights**:
- Balances quantization difficulty between weights and activations
- A mathematically equivalent rescaling that smooths out activation outliers
- Usually delivers the best accuracy

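The equivalence mentioned above can be sketched as follows (simplified plain-Python illustration, not llmc's implementation): each activation channel is divided by a factor `s_j = max|X_j|**alpha / max|W_j|**(1-alpha)` while the matching weight entries are multiplied by `s_j`, so the layer output is unchanged but activation outliers are flattened.

```python
def smooth_scales(act_absmax, wgt_absmax, alpha=0.75):
    """Per-channel SmoothQuant factors: s_j = a_j**alpha / w_j**(1 - alpha)."""
    return [a ** alpha / w ** (1 - alpha) for a, w in zip(act_absmax, wgt_absmax)]

act = [16.0, 0.25, 4.0]   # channel 0 is an outlier
wgt = [1.0, 1.0, 1.0]
s = smooth_scales(act, wgt, alpha=0.5)   # with w=1 and alpha=0.5, s_j = sqrt(a_j)
smoothed = [a / sj for a, sj in zip(act, s)]
print(smoothed)  # the channel range shrinks from 64x (16.0/0.25) to 8x
```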
### LoRA Model Quantization

Quantizing LoRA adapter models is also supported:
- `smoothquant_w_a_int8_lora.yaml`
- `rtn_w_a_lora.yaml`

## How to Run

### 1. Prepare the Environment

```bash
# Set the llmc path
export llmc=/path/to/llmc
export PYTHONPATH=$llmc:$PYTHONPATH

# Select a GPU
export CUDA_VISIBLE_DEVICES=0
```

### 2. Prepare Calibration Data

Calibration data for the I2V model:
```
assets/wan_i2v/calib/
├── image_1.jpg
├── image_2.jpg
└── ...
```

Calibration data for the T2V model:
```
assets/wan_t2v/calib/
├── prompt_1.txt
├── prompt_2.txt
└── ...
```
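Each `.txt` file above holds one calibration prompt. A throwaway helper like this can generate the layout (hypothetical script, not part of llmc; the one-prompt-per-file layout is assumed from the tree above):

```python
import os

def write_t2v_calib(prompts, calib_dir="assets/wan_t2v/calib"):
    """Write one prompt per .txt file, matching the calib layout shown above."""
    os.makedirs(calib_dir, exist_ok=True)
    for i, prompt in enumerate(prompts, start=1):
        path = os.path.join(calib_dir, f"prompt_{i}.txt")
        with open(path, "w", encoding="utf-8") as f:
            f.write(prompt.strip() + "\n")
    return sorted(os.listdir(calib_dir))

print(write_t2v_calib([
    "A corgi running on a beach at sunset",
    "Timelapse of clouds over a mountain lake",
]))
```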

### 3. Edit the Config File

Edit the matching YAML config file and set:
- `model.path`: path to the Wan2.1 model
- `calib.path`: path to the calibration data
- `save.save_path`: where to save the quantized model

**Example (FP8 quantization)**:
```yaml
base:
  seed: 42
model:
  type: WanI2V
  path: /path/to/wan2.1-i2v-model  # change to your model path
  torch_dtype: auto
calib:
  name: i2v
  download: False
  path: /path/to/calibration/data  # change to your calibration data path
  sample_steps: 40
  bs: 1
  target_height: 480
  target_width: 832
  num_frames: 81
  guidance_scale: 5.0
save:
  save_lightx2v: True
  save_path: /path/to/save/quantized/model  # change to your save path
```

### 4. Run Quantization

#### Via script (recommended)

```bash
# FP8 quantization (I2V)
./run_llmc.sh wan_i2v_fp8

# INT8 RTN quantization (I2V)
./run_llmc.sh wan_i2v_int8_rtn

# INT8 AWQ quantization (I2V)
./run_llmc.sh wan_i2v_int8_awq

# INT8 SmoothQuant quantization (I2V)
./run_llmc.sh wan_i2v_int8_smoothquant

# T2V model quantization
./run_llmc.sh wan_t2v_int8_rtn
./run_llmc.sh wan_t2v_int8_awq
./run_llmc.sh wan_t2v_int8_smoothquant
```

#### Direct invocation

```bash
torchrun \
  --nnodes 1 \
  --nproc_per_node 1 \
  --rdzv_id $RANDOM \
  --rdzv_backend c10d \
  --rdzv_endpoint 127.0.0.1:29500 \
  ${llmc}/llmc/__main__.py \
  --config configs/quantization/video_gen/wan_i2v/smoothquant_w_a_fp8.yaml \
  --task_id my_quant_task
```

### 5. Monitor Progress

```bash
# Follow the log
tail -f wan_i2v_fp8.log

# Check the process
ps aux | grep __main__.py
```

### 6. Stop a Task

```bash
# Kill via the saved PID file
xargs kill -9 < wan_i2v_fp8.pid
```

## Configuration Reference

### Model
- `type`: model type (`WanI2V` or `WanT2V`)
- `path`: path to the model weights
- `torch_dtype`: data type (`auto`, `bfloat16`, `float32`)

### Calibration
- `sample_steps`: number of sampling steps (typically 20-40)
- `bs`: batch size (typically 1; video generation is memory-intensive)
- `target_height`: target video height (default 480)
- `target_width`: target video width (default 832)
- `num_frames`: number of video frames (default 81)
- `guidance_scale`: CFG guidance strength (default 5.0)

### Quantization
- `method`: quantization method (`RTN`, `Awq`, `SmoothQuant`)
- `weight.bit`: weight bit width (`8`, `e4m3`)
- `act.bit`: activation bit width (`8`, `e4m3`)
- `granularity`: quantization granularity (`per_channel`, `per_token`)
- `special.alpha`: SmoothQuant balancing parameter (0.5-1.0)
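
To make the `granularity` options concrete: for an activation matrix of shape `[tokens, channels]`, `per_token` computes one scale per row while `per_channel` computes one per column. A plain-Python illustration (`absmax_scales` is an invented helper, not llmc API):

```python
def absmax_scales(x, granularity, qmax=127):
    """Symmetric int8 scales for a matrix x (list of rows), per token or channel."""
    if granularity == "per_token":
        return [max(abs(v) for v in row) / qmax for row in x]        # one per row
    if granularity == "per_channel":
        return [max(abs(v) for v in col) / qmax for col in zip(*x)]  # one per column
    raise ValueError(f"unknown granularity: {granularity}")

x = [[1.0, -254.0],
     [127.0, 2.0]]                      # 2 tokens x 2 channels
print(absmax_scales(x, "per_token"))    # [2.0, 1.0]
print(absmax_scales(x, "per_channel"))  # [1.0, 2.0]
```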

## Using the Quantized Model in lightx2v

### 1. Configure lightx2v

Edit `lightx2v/configs/quantization/wan_i2v.json`:
```json
{
    "infer_steps": 40,
    "target_video_length": 81,
    "target_height": 480,
    "target_width": 832,
    "dit_quantized_ckpt": "/path/to/quantized/model",
    "dit_quantized": true,
    "dit_quant_scheme": "int8-vllm"
}
```

For FP8 models, set `"dit_quant_scheme": "fp8"` instead.

### 2. Run Inference

```bash
python -m lightx2v.infer \
  --model_cls wan2.1 \
  --task i2v \
  --model_path /path/to/original/model \
  --config_json configs/quantization/wan_i2v.json \
  --prompt "Your prompt here" \
  --image_path /path/to/input/image.jpg \
  --save_result_path output.mp4
```

## Performance Tips

1. **FP8 vs INT8**:
   - FP8: higher fidelity; choose it when output quality matters most
   - INT8: higher compression; choose it when speed matters most

2. **Choosing a method**:
   - Quick prototyping: RTN
   - Balanced accuracy and speed: SmoothQuant
   - Highest accuracy: AWQ

3. **Calibration data**:
   - Use 10-50 samples
   - Cover typical usage scenarios
   - I2V: use a diverse set of images
   - T2V: use a diverse set of text prompts

4. **Resource requirements**:
   - GPU: 24 GB+ of VRAM recommended
   - Calibration time: 30 minutes to 2 hours, depending on the amount of data
   - Storage: the quantized model is roughly 25-50% the size of the original
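
The 25-50% storage figure is simple byte arithmetic, assuming bf16 originals (the parameter count below is a placeholder, not Wan2.1's actual size):

```python
params = 14e9                # placeholder parameter count, not Wan2.1's real size
bf16_bytes = params * 2      # 2 bytes per parameter in bf16
int8_bytes = params * 1      # 1 byte per parameter
int4_bytes = params * 0.5    # 4-bit weights

print(int8_bytes / bf16_bytes)  # 0.5 -> ~50% of the original
print(int4_bytes / bf16_bytes)  # 0.25 -> ~25%, plus a small overhead for scales
```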

## Troubleshooting

### Out of GPU memory
- Reduce `bs` to 1
- Reduce `num_frames`
- Reduce `target_height` / `target_width`

### Excessive accuracy loss
- Try the SmoothQuant method
- Increase the amount of calibration data
- Tune the `alpha` parameter (0.5-1.0)

### lightx2v compatibility issues
- Make sure `save_lightx2v: True` was set
- Check the `dit_quant_scheme` setting
- Verify the quantized model path is correct

## References

- lightx2v docs: [lightx2v project URL]
- llmc framework: [llmc project URL]
- Wan2.1 model: [model URL]
