feat: add QAT pipline by ali-88123 · Pull Request #265 · Tencent/AngelSlim

ali-88123 · 2026-03-19T09:59:07Z

This PR introduces Quantization Aware Training (QAT) as a new compression method in AngelSlim. QAT enables models to learn quantization parameters during training by inserting fake-quantization operations into the forward pass, resulting in significantly better accuracy under low-bit quantization scenarios (e.g., W4A8, INT4) compared to post-training quantization (PTQ) alone.
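For readers new to the idea, a fake-quantization step is a quantize-then-dequantize round trip that stays in floating point. The sketch below is a minimal pure-Python illustration and is not the code in this PR: the function name `fake_quantize`, the unsigned integer grid, and the default of 4 bits are assumptions made for the example. Real QAT implementations commonly pair this with a straight-through estimator so gradients can pass through the rounding; that part is not shown here.

```python
def fake_quantize(x, scale, zero_point, num_bits=4):
    """Quantize x to an integer grid, then map it back to float."""
    # Unsigned grid [0, 2^bits - 1]; an assumption for this sketch.
    qmin, qmax = 0, 2 ** num_bits - 1
    q = round(x / scale) + zero_point      # project onto the integer grid
    q = max(qmin, min(qmax, q))            # clamp to the representable range
    return (q - zero_point) * scale        # dequantize back to float


# Values inside the range snap to the nearest grid point;
# values outside the range saturate at the grid limits.
in_range = fake_quantize(0.26, scale=0.1, zero_point=8)
too_large = fake_quantize(5.0, scale=0.1, zero_point=8)
too_small = fake_quantize(-2.0, scale=0.1, zero_point=8)
```

The forward pass therefore sees the rounding and clipping error that a low-bit deployment would introduce, which is what lets `scale` and `zero_point` be tuned against it.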

Training Modes

End-to-End (end2end): Uses HuggingFace Seq2SeqTrainer with a custom AdamW optimizer targeting only quantization parameters (scale, zero_point).
Blockwise (blockwise): Trains each Transformer block independently using MSE loss between full-precision and quantized block outputs.
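The blockwise objective can be sketched without any framework. In the example below a "block" is reduced to a single dot product, and the loss is the mean squared error between its full-precision output and its output with fake-quantized weights. All names (`fake_quantize_weight`, `block_forward`, `blockwise_mse`), the symmetric signed grid, and the toy data are assumptions for illustration; the PR trains with gradient updates, whereas this sketch only evaluates the loss for two candidate scales.

```python
def fake_quantize_weight(w, scale, num_bits=4):
    # Symmetric signed grid; an assumption for this sketch.
    qmax = 2 ** (num_bits - 1) - 1
    q = max(-qmax - 1, min(qmax, round(w / scale)))
    return q * scale


def block_forward(weights, inputs):
    # Stand-in for a Transformer block: one dot product.
    return sum(w * x for w, x in zip(weights, inputs))


def blockwise_mse(weights, scale, batch):
    """MSE between full-precision and fake-quantized block outputs."""
    quant_weights = [fake_quantize_weight(w, scale) for w in weights]
    total = 0.0
    for inputs in batch:
        ref = block_forward(weights, inputs)          # full-precision output
        quant = block_forward(quant_weights, inputs)  # quantized output
        total += (ref - quant) ** 2
    return total / len(batch)


weights = [0.31, -0.12, 0.57]                      # toy data
batch = [[1.0, 2.0, 3.0], [0.5, -1.0, 0.25]]       # toy data
loss_coarse = blockwise_mse(weights, 0.5, batch)
loss_fine = blockwise_mse(weights, 0.1, batch)
```

With these toy values the finer scale gives the lower loss, which is the signal a per-block optimizer would follow when updating the quantization parameters.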

Model Conversion & Save

convert(): Replaces all QuantLinear modules with inference-ready QDQModule (from the existing PTQ module), extracting learned weight_scale and input_scale from trained quantizers.
save(): Supports two formats:
- "fake": Saves the raw state_dict via torch.save(), which is useful for resuming from a checkpoint.
- "real": Delegates to the model-specific save function (e.g., vLLM/TRT-LLM compatible formats).
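The two save formats amount to a dispatch on a format string. The sketch below shows that shape only; `save_qat_model`, `write_fake`, and `write_real` are hypothetical names and are not AngelSlim's API. The writers are passed in as callables so the example stays independent of torch and of any model-specific save function.

```python
def save_qat_model(state_dict, save_format, write_fake, write_real):
    """Pick a writer based on the requested format (hypothetical helper)."""
    if save_format == "fake":
        # Raw checkpoint, e.g. a callable that wraps torch.save().
        return write_fake(state_dict)
    if save_format == "real":
        # Model-specific export for an inference engine.
        return write_real(state_dict)
    raise ValueError(f"unsupported save format: {save_format!r}")


# Toy writers that only report which path was taken.
result = save_qat_model(
    {"layer.weight_scale": 0.1},
    "fake",
    write_fake=lambda sd: ("fake", sorted(sd)),
    write_real=lambda sd: ("real", sorted(sd)),
)
```

Rejecting unknown formats explicitly keeps a typo in the config from silently producing no output.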

Key configuration sections:

training.plugin_config: Plugin toggles (enable_scale, enable_rotation), per-plugin quant_config overrides
training.hf_args (end2end): Full HuggingFace Seq2SeqTrainingArguments
training.block_wise_config (blockwise): epochs, batch_size, quant_lr, weight_lr, min_lr_factor
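As a rough picture of how these sections nest, the layout can be written as a Python dict. Only the key names come from the list above; every value is a placeholder chosen for illustration, and the per-plugin quant_config overrides are left out because their layout is not shown in this description. The `hf_args` entries are ordinary Seq2SeqTrainingArguments fields.

```python
# Illustrative layout only; values are placeholders, not recommended settings.
training = {
    "plugin_config": {
        "enable_scale": True,
        "enable_rotation": False,
        # per-plugin quant_config overrides would go here (layout not shown)
    },
    # Used by the end2end mode.
    "hf_args": {
        "output_dir": "./qat_output",
        "learning_rate": 1e-5,
        "num_train_epochs": 1,
        "per_device_train_batch_size": 1,
    },
    # Used by the blockwise mode.
    "block_wise_config": {
        "epochs": 1,
        "batch_size": 1,
        "quant_lr": 1e-4,
        "weight_lr": 1e-5,
        "min_lr_factor": 0.1,
    },
}
```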

Documentation

A comprehensive user guide has been added at docs/source/features/quantization/qat.md.

yghstill · 2026-03-19T13:34:26Z

+from torch.utils.data import Dataset, IterableDataset
+
+
+class QATDataset(IterableDataset):


Move this to the data folder.

yghstill · 2026-03-19T13:35:21Z

@@ -0,0 +1,350 @@
+# Copyright 2025 Tencent Inc. All Rights Reserved.


Rename this file to quanter.py.

yghstill · 2026-03-19T13:41:02Z

+
+
+@dataclass
+class TrainingConfig:


name: Union[str, List[str]]
QAT: Optional[QATTrainingConfig] = None

Put the QAT-specific hyperparameters in QATTrainingConfig.
Everything else is shared.
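One way to read this suggestion is the nested dataclass layout below. The `name` and `QAT` fields come from the comment; the fields inside `QATTrainingConfig` and the value "qat" are hypothetical and the merged code may differ.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Union


@dataclass
class QATTrainingConfig:
    # Hypothetical field choices: QAT-only hyperparameters live here.
    plugin_config: Dict[str, Any] = field(default_factory=dict)
    block_wise_config: Dict[str, Any] = field(default_factory=dict)


@dataclass
class TrainingConfig:
    # Shared fields stay at the top level.
    name: Union[str, List[str]]
    QAT: Optional[QATTrainingConfig] = None


cfg = TrainingConfig(
    name="qat",  # placeholder value
    QAT=QATTrainingConfig(plugin_config={"enable_scale": True}),
)
```

Keeping `QAT` optional means configs for other training methods do not need to carry QAT fields at all.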

ali-88123 added 2 commits March 19, 2026 16:45

feat: add QAT pipline

a032e1f

fix conflicts

6bc78e4

ali-88123 requested review from WOODchen7, irisliu10, liusong1222 and yghstill March 19, 2026 09:59

irisliu10 reviewed Mar 19, 2026

Comment thread angelslim/compressor/qat/modules/__pycache__/quant.cpython-312.pyc.139964120597328 Outdated

modify gitignore

b07bb4a

yghstill reviewed Mar 19, 2026

ali-88123 added 3 commits March 20, 2026 21:29

fix MoE lazy init & move code

ce323b7

mv config

50f95c5

remove init_optimizer() & modify save_path

f0e48c7

yghstill approved these changes Mar 23, 2026

ali-88123 merged commit 028f7ab into Tencent:main Mar 24, 2026
5 checks passed

ali-88123 deleted the qat branch March 24, 2026 07:14

ali-88123 merged 6 commits into Tencent:main from ali-88123:qat
