Commit 5a51ded

docs/wan2.2 + refactor(wan2.2): move native save helpers and update quant/run config
Made-with: Cursor
1 parent 5aee498 commit 5a51ded

6 files changed

Lines changed: 220 additions & 249 deletions

File tree

configs/quantization/video_gen/wan2_2_t2v/awq_w_a_skip_first.yaml

Lines changed: 0 additions & 73 deletions
This file was deleted.

docs/wan2.2_quantization_guide.md

Lines changed: 139 additions & 0 deletions
# Wan2.2 Video Generation Model Quantization Guide

## Overview

The ready-made example this repository provides for **Wan2.2-T2V** is **4-bit AWQ simulated (fake) quantization** (`configs/quantization/video_gen/wan2_2_t2v/awq_w_a.yaml`).

Wan2.2 is a **dual-expert MoE**: a high-noise expert (`transformer`) and a low-noise expert (`transformer_2`); calibration and block-wise quantization cover both branches. The example saves with `save_fake` by default; wiring the result into inference must be aligned with your own inference stack.

**Example model (native checkpoint layout)**: [Wan-AI/Wan2.2-T2V-A14B](https://huggingface.co/Wan-AI/Wan2.2-T2V-A14B)

## Key differences from Wan2.1

| Item | Description |
|------|------|
| Registered name | `Wan2T2V` |
| Architecture | dual-expert MoE, not a single-path DiT |
| Inference backend | prefers the official `wan` package with the native checkpoint directory; can fall back to Diffusers per the YAML comments |
| CFG | `guidance_scale` (high noise) and `guidance_scale_2` (low noise), matching the official dual guidance |

## Example quantization config

The `quant` section of `awq_w_a.yaml` matches the repository, for example:

```yaml
quant:
  video_gen:
    method: Awq
    weight:
      quant_type: hif4
      bit: 4
      symmetric: True
      granularity: per_channel
      group_size: -1
    act:
      quant_type: hif4
      bit: 4
      symmetric: True
      granularity: per_token
    special:
      trans: True
      trans_version: v2
      weight_clip: True
      clip_sym: True
```
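For intuition, symmetric per-channel 4-bit fake quantization of a weight matrix can be sketched as below. This is a minimal illustration on an integer grid, not the actual `hif4` float format used by the config:

```python
import numpy as np

def fake_quant_per_channel_sym(w: np.ndarray, bit: int = 4) -> np.ndarray:
    """Symmetric per-output-channel fake quantization: quantize to a
    signed integer grid, then immediately dequantize (shape preserved)."""
    qmax = 2 ** (bit - 1) - 1                       # 7 for 4 bits
    scale = np.abs(w).max(axis=1, keepdims=True) / qmax
    scale = np.maximum(scale, 1e-8)                 # guard all-zero rows
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((8, 16))
w_q = fake_quant_per_channel_sym(w)
# rounding error is bounded by half a quantization step per channel
assert np.all(np.abs(w - w_q) <= np.abs(w).max(axis=1, keepdims=True) / 14 + 1e-9)
```

`granularity: per_channel` means each output row gets its own scale, and `per_token` applies the same idea row-wise to activations.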

First use requires building the in-repo hif4 GPU extension: `HiFloat4/hif4_gpu/build.sh`.

## Running

### 1. Environment

```bash
export llmc=/path/to/LightCompress
export PYTHONPATH=$llmc:$PYTHONPATH
export CUDA_VISIBLE_DEVICES=0
```

The native layout requires that `import wan` works, typically:

```bash
pip install -e /path/to/Wan2.2
```

Alternatively, set `wan2_repo_path: /path/to/Wan2.2` in the YAML.
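A quick way to verify that `wan` is importable before launching a long run (a generic check, nothing repository-specific; the path argument mirrors the YAML's `wan2_repo_path`):

```python
import importlib.util
import sys
from typing import Optional

def can_import(module: str, extra_path: Optional[str] = None) -> bool:
    """Return True if `module` is importable, optionally after adding an
    extra directory (e.g. the YAML's wan2_repo_path) to sys.path."""
    if extra_path and extra_path not in sys.path:
        sys.path.insert(0, extra_path)
    return importlib.util.find_spec(module) is not None

# On a machine without Wan2.2 installed this prints False.
print("wan importable:", can_import("wan"))
```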

### 2. Calibration data

Same as Wan2.1 T2V: a directory of text prompt files, for example:

```
assets/wan_t2v/calib/
├── prompt_1.txt
├── prompt_2.txt
└── ...
```

In the config, set `calib.name: t2v` and point `calib.path` at this directory.
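Preparing such a directory can be sketched as follows (the prompts below are placeholders; substitute your own calibration prompts):

```python
from pathlib import Path

# Directory matching calib.path in the YAML.
calib_dir = Path("assets/wan_t2v/calib")
calib_dir.mkdir(parents=True, exist_ok=True)

prompts = [
    "A red fox running through fresh snow at dawn",
    "Time-lapse of clouds rolling over a mountain ridge",
]
# One prompt per file, named prompt_<n>.txt as in the tree above.
for i, prompt in enumerate(prompts, start=1):
    (calib_dir / f"prompt_{i}.txt").write_text(prompt, encoding="utf-8")

print(sorted(p.name for p in calib_dir.glob("prompt_*.txt")))
```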

### 3. Edit `awq_w_a.yaml`

Required changes:

- `model.path`: path to the Wan2.2 weights
- `calib.path` / `eval.path`: calibration and evaluation data
- `save.save_path`: output directory

Optional (see the YAML comments):

- `use_cpu_to_save_cuda_mem_for_catcher: True`: reduces peak GPU memory when calibration is memory-constrained
- `allow_diffusers_fallback: True`: fall back to Diffusers when the official backend is unavailable

Dual-guidance example:

```yaml
calib:
  guidance_scale: 4.0    # high_noise
  guidance_scale_2: 3.0  # low_noise
eval:
  guidance_scale: 4.0
  guidance_scale_2: 3.0
```
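Conceptually, classifier-free guidance blends a conditional and an unconditional prediction using the scale of whichever expert is active at the current timestep. A minimal sketch (the expert-switching threshold here is illustrative, not the official pipeline logic):

```python
import numpy as np

def cfg_combine(eps_uncond: np.ndarray, eps_cond: np.ndarray, scale: float) -> np.ndarray:
    """Standard classifier-free guidance blend of two noise predictions."""
    return eps_uncond + scale * (eps_cond - eps_uncond)

def pick_scale(t: int, boundary: int, guidance_scale: float, guidance_scale_2: float) -> float:
    """High-noise expert (early steps, large t) uses guidance_scale;
    the low-noise expert uses guidance_scale_2. `boundary` is a
    hypothetical switch point for illustration."""
    return guidance_scale if t >= boundary else guidance_scale_2

eps_uncond, eps_cond = np.zeros(4), np.ones(4)
scale = pick_scale(t=900, boundary=875, guidance_scale=4.0, guidance_scale_2=3.0)
out = cfg_combine(eps_uncond, eps_cond, scale)  # scale is 4.0 here
```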

### 4. Launch quantization

```bash
torchrun \
  --nnodes 1 \
  --nproc_per_node 1 \
  --rdzv_id $RANDOM \
  --rdzv_backend c10d \
  --rdzv_endpoint 127.0.0.1:29500 \
  ${llmc}/llmc/__main__.py \
  --config ${llmc}/configs/quantization/video_gen/wan2_2_t2v/awq_w_a.yaml \
  --task_id wan22_awq_int4
```

Alternatively, in `scripts/run_llmc.sh` set `model_name=wan2_2_t2v` and `task_name=awq_w_a` to match the YAML above (adjust the Python path etc. in the script for your machine).

## Parameter quick reference

| Field | Description |
|------|------|
| `model.type` | `Wan2T2V` |
| `quant.video_gen.method` | `Awq` |
| `weight` / `act` | `bit: 4` (see the YAML for the exact `quant_type`) |
| `save` | `save_fake: True` and `save_path` in the example |

## FAQ

- **OOM**: reduce `sample_steps`, `num_frames`, or the resolution; use `bs: 1`; optionally enable `use_cpu_to_save_cuda_mem_for_catcher`.
- **Cannot `import wan`**: install the official repository or set `wan2_repo_path`.
- **hif4 extension fails to build**: check the CUDA / PyTorch versions against the `HiFloat4/hif4_gpu/build.sh` logs.
- **Quality degradation**: add more, and more diverse, calibration prompts; within the supported range, tune `special` and the calibration scale.

## References

- `configs/quantization/video_gen/wan2_2_t2v/awq_w_a.yaml`
- `llmc/models/wan2_2_t2v.py`
- For other precisions (e.g. FP8, INT8), follow the approach in `docs/wan2.1_quantization_guide.md`: add a new YAML under `wan2_2_t2v` and replace `model.type` and the paths.

llmc/compression/quantization/base_blockwise_quantization.py

Lines changed: 2 additions & 59 deletions
```diff
@@ -38,7 +38,6 @@
                      RotateLinear)
 from .quant import (
     FloatQuantizer,
-    HiFloat4Quantizer,
     IntegerQuantizer,
     Weight48IntegerQuantizer,
 )
@@ -163,8 +162,6 @@ def set_quant_config(self):
             self.weight_quant_module = IntegerQuantizer
         elif quant_type == 'float-quant':
             self.weight_quant_module = FloatQuantizer
-        elif quant_type == 'hif4':
-            self.weight_quant_module = HiFloat4Quantizer
         logger.info(f'The used Weight Quant Module is {self.weight_quant_module}')
         self.wquantizer = self.weight_quant_module(**self.quant_config['weight'])
 
@@ -183,8 +180,6 @@ def set_quant_config(self):
             self.act_quant_module = IntegerQuantizer
         elif quant_type == 'float-quant':
             self.act_quant_module = FloatQuantizer
-        elif quant_type == 'hif4':
-            self.act_quant_module = HiFloat4Quantizer
         else:
             raise ValueError(
                 f"Unsupported act quant_type: {quant_type}. "
@@ -1060,58 +1055,6 @@ def contiguous_params(self):
             if not param.is_contiguous():
                 param.data = param.data.contiguous()
 
-    def _copy_wan22_native_checkpoint(self, src, dst):
-        if not isinstance(src, str) or not os.path.isdir(src):
-            raise RuntimeError(
-                'Wan2.2 official save expects a local native checkpoint directory, '
-                f'but got src={src!r}.'
-            )
-        if os.path.abspath(src) == os.path.abspath(dst):
-            raise RuntimeError(
-                'Wan2.2 official save path must differ from source checkpoint path '
-                f'(src=dst={src}).'
-            )
-        if os.path.exists(dst):
-            shutil.rmtree(dst)
-        shutil.copytree(src, dst)
-        logger.info(f'Copied original Wan2.2 native checkpoint from {src} to {dst}')
-
-    def _validate_wan22_native_save_structure(self, save_path, source_path=None):
-        if not os.path.isdir(save_path):
-            raise RuntimeError(f'Wan2.2 saved path is not a directory: {save_path}')
-
-        required_entries = ['configuration.json', 'high_noise_model', 'low_noise_model']
-        missing_required = [
-            name for name in required_entries
-            if not os.path.exists(os.path.join(save_path, name))
-        ]
-        if missing_required:
-            raise RuntimeError(
-                'Wan2.2 saved structure is incomplete. Missing required entries: '
-                f'{missing_required}. save_path={save_path}'
-            )
-
-        if isinstance(source_path, str) and os.path.isdir(source_path):
-            source_entries = set(os.listdir(source_path))
-            source_non_expert_entries = sorted(
-                name for name in source_entries
-                if name not in {'high_noise_model', 'low_noise_model'}
-            )
-            missing_non_expert = [
-                name for name in source_non_expert_entries
-                if not os.path.exists(os.path.join(save_path, name))
-            ]
-            if missing_non_expert:
-                raise RuntimeError(
-                    'Wan2.2 saved structure lost original non-expert files/directories: '
-                    f'{missing_non_expert}. source_path={source_path}, save_path={save_path}'
-                )
-
-        logger.info(
-            f'Wan2.2 native save structure verified. '
-            f'top-level entries={sorted(os.listdir(save_path))}'
-        )
-
     @torch.no_grad()
     def save_model(self, path):
         if int(os.environ['RANK']) != 0:
@@ -1135,7 +1078,7 @@ def save_model(self, path):
         elif self.config.model.type in ['Wan2T2V']:
             if getattr(self.model.Pipeline, '_is_wan_official', False):
                 src = getattr(self.model, 'pipeline_model_path', self.model.model_path)
-                self._copy_wan22_native_checkpoint(src, path)
+                self.model.copy_native_checkpoint(src, path)
 
                 self.model.Pipeline.transformer.save_pretrained(
                     os.path.join(path, 'high_noise_model')
@@ -1149,7 +1092,7 @@ def save_model(self, path):
                     os.path.join(path, 'low_noise_model')
                 )
                 logger.info('save Wan2.2 low_noise_model done --')
-                self._validate_wan22_native_save_structure(path, source_path=src)
+                self.model.validate_native_save_structure(path, source_path=src)
                 return
 
     # Copy the full original pipeline (VAE, text encoder, tokenizer, scheduler, etc.)
```
