feat: add QAT pipeline #265

Merged
ali-88123 merged 6 commits into Tencent:main from ali-88123:qat on Mar 24, 2026

Conversation

@ali-88123
Collaborator

This PR introduces Quantization Aware Training (QAT) as a new compression method in AngelSlim. QAT enables models to learn quantization parameters during training by inserting fake-quantization operations into the forward pass, resulting in significantly better accuracy under low-bit quantization scenarios (e.g., W4A8, INT4) compared to post-training quantization (PTQ) alone.
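As a rough illustration of what a fake-quantization operation does in the forward pass, here is a minimal pure-Python sketch. The function name `fake_quantize`, the signed INT4 range, and the sample values are assumptions made for illustration, not code from this PR. During training, frameworks typically treat the rounding step as the identity when computing gradients (a straight-through estimator); the PR description does not say which estimator AngelSlim uses.

```python
def fake_quantize(x, scale, zero_point=0, num_bits=4):
    """Quantize-dequantize one float: snap to the integer grid, clamp, map back.

    Hypothetical helper for illustration; not the AngelSlim implementation.
    """
    qmin = -(2 ** (num_bits - 1))       # -8 for signed 4-bit
    qmax = 2 ** (num_bits - 1) - 1      # +7 for signed 4-bit
    q = round(x / scale) + zero_point   # nearest integer level
    q = max(qmin, min(qmax, q))         # clamp to the representable range
    return (q - zero_point) * scale     # back to float: this is the "fake" part


weights = [0.31, -0.12, 0.95, -1.40]
scale = 0.2
fq = [fake_quantize(w, scale) for w in weights]
print(fq)
```

The output stays in floating point, but only values on the quantization grid survive, so the rest of the network sees the rounding and clamping error while it trains.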

Training Modes

  1. End-to-End (end2end): Uses HuggingFace Seq2SeqTrainer with a custom AdamW optimizer targeting only quantization parameters (scale, zero_point).

  2. Blockwise (blockwise): Trains each Transformer block independently using MSE loss between full-precision and quantized block outputs.
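The blockwise objective can be sketched in a few lines of plain Python. This is a toy under stated assumptions: the "block" is a single dot product per input row, only weights are fake-quantized, and a small grid search over candidate scales stands in for the gradient-based optimizer the PR actually uses. All names (`block_forward`, `blockwise_mse`) and values are hypothetical.

```python
def fake_quantize(x, scale, num_bits=4):
    """Symmetric quantize-dequantize of one float (illustrative helper)."""
    qmin = -(2 ** (num_bits - 1))
    qmax = 2 ** (num_bits - 1) - 1
    q = max(qmin, min(qmax, round(x / scale)))
    return q * scale


def block_forward(weights, inputs):
    """Toy stand-in for a Transformer block: one dot product per input row."""
    return [sum(w * x for w, x in zip(weights, row)) for row in inputs]


def blockwise_mse(weights, inputs, scale):
    """MSE between the full-precision and the fake-quantized block outputs."""
    fp_out = block_forward(weights, inputs)
    q_weights = [fake_quantize(w, scale) for w in weights]
    q_out = block_forward(q_weights, inputs)
    return sum((a - b) ** 2 for a, b in zip(fp_out, q_out)) / len(fp_out)


weights = [0.31, -0.12, 0.95, -1.40]
inputs = [[1.0, 2.0, -1.0, 0.5], [0.0, 1.0, 1.0, -2.0]]

# Grid search replaces gradient descent here purely to keep the sketch short.
candidates = [0.05, 0.1, 0.2, 0.4]
losses = {s: blockwise_mse(weights, inputs, s) for s in candidates}
best_scale = min(losses, key=losses.get)
print(best_scale, losses[best_scale])
```

Because each block is optimized against its own full-precision output, the loss needs no labels and no backward pass through the rest of the model.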

Model Conversion & Save

  • convert(): Replaces all QuantLinear modules with inference-ready QDQModule (from the existing PTQ module), extracting learned weight_scale and input_scale from trained quantizers.

  • save(): Supports two formats:

    • "fake": Saves raw state_dict via torch.save() — useful for checkpoint resumption.

    • "real": Delegates to the model-specific save function (e.g., vLLM/TRT-LLM compatible formats).
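The two save formats amount to a small dispatch. The sketch below is an assumption about the shape of that logic, not the PR's code: `save_qat_model`, `fake_saver`, and `real_saver` are hypothetical names, and the savers are injected so the example runs without PyTorch. In practice `fake_saver` could be `torch.save`, which takes the object and a path in that order.

```python
def save_qat_model(state_dict, path, save_format, fake_saver, real_saver):
    """Dispatch on the save format (hypothetical helper, for illustration only).

    "fake": write the raw state dict, e.g. to resume training later.
    "real": hand off to a model-specific exporter for deployment formats.
    """
    if save_format == "fake":
        fake_saver(state_dict, path)
    elif save_format == "real":
        real_saver(path)
    else:
        raise ValueError(f"unknown save_format: {save_format!r}")
    return save_format


# Record the calls instead of touching the filesystem.
calls = []
save_qat_model(
    {"layer.weight_scale": 0.2},
    "ckpt.pt",
    "fake",
    fake_saver=lambda sd, p: calls.append(("fake", p, sorted(sd))),
    real_saver=lambda p: calls.append(("real", p)),
)
print(calls)
```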

Key configuration sections:

  • training.plugin_config: Plugin toggles (enable_scale, enable_rotation), per-plugin quant_config overrides

  • training.hf_args (end2end): Full HuggingFace Seq2SeqTrainingArguments

  • training.block_wise_config (blockwise): epochs, batch_size, quant_lr, weight_lr, min_lr_factor
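To make the layout concrete, the sections above could nest as in the following Python dict. The key names come from the description; every value is a placeholder chosen for illustration, and the real config file format and defaults are documented in the PR's user guide rather than here. The `hf_args` entries shown are standard `Seq2SeqTrainingArguments` fields.

```python
qat_training_config = {
    "training": {
        "plugin_config": {
            "enable_scale": True,       # placeholder value
            "enable_rotation": False,   # placeholder value
            "quant_config": {},         # per-plugin overrides, left empty here
        },
        # Used by the end2end mode.
        "hf_args": {
            "output_dir": "./qat_output",
            "learning_rate": 1e-5,
            "num_train_epochs": 1,
            "per_device_train_batch_size": 1,
        },
        # Used by the blockwise mode.
        "block_wise_config": {
            "epochs": 2,
            "batch_size": 4,
            "quant_lr": 1e-4,
            "weight_lr": 1e-5,
            "min_lr_factor": 0.1,
        },
    }
}

# Simple sanity check that the blockwise section lists every documented key.
REQUIRED_BLOCKWISE_KEYS = {"epochs", "batch_size", "quant_lr", "weight_lr", "min_lr_factor"}
missing = REQUIRED_BLOCKWISE_KEYS - set(qat_training_config["training"]["block_wise_config"])
print(sorted(missing))
```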

Documentation

A comprehensive user guide has been added at docs/source/features/quantization/qat.md.

from torch.utils.data import Dataset, IterableDataset


class QATDataset(IterableDataset):
Collaborator

Move this to the data folder.

@@ -0,0 +1,350 @@
# Copyright 2025 Tencent Inc. All Rights Reserved.
Collaborator

Rename to quanter.py.

Comment thread angelslim/utils/config_parser.py Outdated


@dataclass
class TrainingConfig:
Collaborator

name: Union[str, List[str]]
QAT: Optional[QATTrainingConfig] = None

Put the QAT-specific hyperparameters in QATTrainingConfig; everything else is shared.
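One way to read this suggestion is the dataclass layout below. Only `name` and `QAT` come from the review comment; the fields inside `QATTrainingConfig` are hypothetical, picked from the PR description to show where QAT-only settings might live.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class QATTrainingConfig:
    # Hypothetical QAT-only fields; the merged code may differ.
    plugin_config: dict = field(default_factory=dict)
    block_wise_config: dict = field(default_factory=dict)


@dataclass
class TrainingConfig:
    # Shared across training-based methods, per the review comment.
    name: Union[str, List[str]]
    # Populated only when QAT is selected.
    QAT: Optional[QATTrainingConfig] = None


shared_only = TrainingConfig(name="QAT")
with_qat = TrainingConfig(
    name=["QAT"],
    QAT=QATTrainingConfig(block_wise_config={"epochs": 2}),
)
print(shared_only.QAT, with_qat.QAT.block_wise_config)
```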

@ali-88123 merged commit 028f7ab into Tencent:main on Mar 24, 2026
5 checks passed
@ali-88123 deleted the qat branch on March 24, 2026 07:14
3 participants