Implement `_dequantize` for TorchAO quantizer #13538
Conversation
Hi @sayakpaul. Would you please review this PR? Thanks!
```python
assert isinstance(module.weight, TorchAOBaseTensor), (
    f"Layer {name} weight is {type(module.weight)}, expected TorchAOBaseTensor"
)
```
Can we also enable dequantization tests for the TorchAO tester mixin?
- Add `_dequantize()` method in `TorchAoHfQuantizer` that dequantizes `TorchAOBaseTensor` weights back to standard `nn.Parameter`
- Fix `_verify_if_layer_quantized` to check `isinstance(weight, TorchAOBaseTensor)` so dequantized layers are correctly detected as non-quantized
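For a sense of the workflow this enables, here is a hedged usage sketch. The model id and quant type are illustrative assumptions, not part of this PR:

```python
import torch

from diffusers import FluxTransformer2DModel, TorchAoConfig

# Illustrative checkpoint and quant type; any TorchAO-quantized model works.
quant_config = TorchAoConfig("int8wo")
model = FluxTransformer2DModel.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    subfolder="transformer",
    quantization_config=quant_config,
    torch_dtype=torch.bfloat16,
)
model.dequantize()  # weights become standard nn.Parameter tensors again
```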
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Force-pushed from a819214 to 9102fb8
```python
from torchao.utils import TorchAOBaseTensor


for name, module in model.named_modules():
    if isinstance(module, nn.Linear) and isinstance(module.weight, TorchAOBaseTensor):
```
TorchAOBaseTensor does not expose dequantize as a public API; it is defined on child classes. I agree that it would make sense to do so in the future. If you want to be safe here, it might be better to check for individual tensor subclasses that do expose it.
Thanks for the review @vkuzo! You're right that dequantize() is defined on child classes rather than on TorchAOBaseTensor itself. I've added a hasattr guard so we safely skip any subclass that doesn't expose it. In practice all quantized tensor subclasses we encounter do implement dequantize(), but this makes it future-proof.
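A minimal sketch of the guarded loop described above; `model` is assumed to be the module being dequantized, and the PR's exact lines may differ:

```python
import torch.nn as nn

from torchao.utils import TorchAOBaseTensor

for name, module in model.named_modules():
    if isinstance(module, nn.Linear) and isinstance(module.weight, TorchAOBaseTensor):
        # dequantize() is defined on the concrete tensor subclass, not on
        # TorchAOBaseTensor itself, so skip any subclass that lacks it.
        if not hasattr(module.weight, "dequantize"):
            continue
        module.weight = nn.Parameter(module.weight.dequantize())
```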
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Force-pushed from 95d0118 to df36f1a
```python
    ],
    ids=["int4wo", "int8wo", "int8dq"],
)
def test_torchao_dequantize(self, quant_type):
```
I ran the tests with the following command: ``
And there are test failures:
```
FAILED tests/models/transformers/test_models_transformer_flux.py::TestFluxTransformerTorchAo::test_torchao_dequantize[int4wo] - NotImplementedError: Int4Tensor dispatch: attempting to run unimplemented operator/function: func=<OpOverload...
FAILED tests/models/transformers/test_models_transformer_flux.py::TestFluxTransformerTorchAo::test_torchao_dequantize[int8wo] - RuntimeError: mat1 and mat2 must have the same dtype, but got Float and BFloat16
FAILED tests/models/transformers/test_models_transformer_flux.py::TestFluxTransformerTorchAo::test_torchao_dequantize[int8dq] - NotImplementedError: LinearActivationQuantizedTensor dispatch: attempting to run unimplemented operator/funct...
```

With the following diff I managed to get it down to two:
```diff
diff --git a/tests/models/transformers/test_models_transformer_flux.py b/tests/models/transformers/test_models_transformer_flux.py
index 840eaa338..e73c31561 100644
--- a/tests/models/transformers/test_models_transformer_flux.py
+++ b/tests/models/transformers/test_models_transformer_flux.py
@@ -367,6 +367,10 @@ class TestFluxTransformerQuanto(FluxTransformerTesterConfig, QuantoTesterMixin):
 class TestFluxTransformerTorchAo(FluxTransformerTesterConfig, TorchAoTesterMixin):
     """TorchAO quantization tests for Flux Transformer."""
 
+    @property
+    def torch_dtype(self):
+        return torch.bfloat16
+
 class TestFluxTransformerGGUF(FluxTransformerTesterConfig, GGUFTesterMixin):
     @property
```

```
FAILED tests/models/transformers/test_models_transformer_flux.py::TestFluxTransformerTorchAo::test_torchao_dequantize[int4wo] - NotImplementedError: Int4Tensor dispatch: attempting to run unimplemented operator/function: func=<OpOverload...
FAILED tests/models/transformers/test_models_transformer_flux.py::TestFluxTransformerTorchAo::test_torchao_dequantize[int8dq] - NotImplementedError: LinearActivationQuantizedTensor dispatch: attempting to run unimplemented operator/funct...
```

I am on an H100.
Force-pushed from 2cbe719 to 83431bf
Force-pushed from 83431bf to 450d0e4
Hi @sayakpaul. I have fixed the dtype issue and skipped the cases that hit unimplemented TorchAO dispatch ops.
What does this PR do?
Implements the `_dequantize()` method for `TorchAoHfQuantizer`, enabling `model.dequantize()` to convert TorchAO-quantized models back to standard float weights.

Changes
- Add `_dequantize()` method: iterates all `nn.Linear` modules, calls `weight.dequantize()` on `TorchAOBaseTensor` weights, replaces them with standard `nn.Parameter`, and resets any overridden `extra_repr`.
- Fix `_verify_if_layer_quantized`: added an `isinstance(module.weight, TorchAOBaseTensor)` check so that dequantized layers (which are still `nn.Linear` but with plain tensor weights) are correctly detected as non-quantized (see the sketch below).
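A minimal sketch of the `_verify_if_layer_quantized` check described above; the method signature is an assumption based on the description, not the verbatim patch:

```python
import torch.nn as nn

from torchao.utils import TorchAOBaseTensor

def _verify_if_layer_quantized(self, module: nn.Module) -> bool:
    # A dequantized layer is still nn.Linear, but its weight is a plain
    # tensor again, so check the weight type rather than the module type.
    return isinstance(module, nn.Linear) and isinstance(
        module.weight, TorchAOBaseTensor
    )
```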