Commit 74a0f0b

remove torchao autoquant from diffusers docs (#13048)
Summary:

Context: pytorch/ao#3739

Test Plan: CI, since this does not change any Python code
1 parent 2c669e8 commit 74a0f0b

File tree

3 files changed: +1 -21 lines


docs/source/en/quantization/torchao.md

Lines changed: 0 additions & 19 deletions
````diff
@@ -83,25 +83,6 @@ Refer to this [table](https://github.com/huggingface/diffusers/pull/10009#issue-
 > [!TIP]
 > The FP8 post-training quantization schemes in torchao are effective for GPUs with compute capability of at least 8.9 (RTX-4090, Hopper, etc.). FP8 often provides the best speed, memory, and quality trade-off when generating images and videos. We recommend combining FP8 and torch.compile if your GPU is compatible.
 
-## autoquant
-
-torchao provides [autoquant](https://docs.pytorch.org/ao/stable/generated/torchao.quantization.autoquant.html#torchao.quantization.autoquant) an automatic quantization API. Autoquantization chooses the best quantization strategy by comparing the performance of each strategy on chosen input types and shapes. This is only supported in Diffusers for individual models at the moment.
-
-```py
-import torch
-from diffusers import DiffusionPipeline
-from torchao.quantization import autoquant
-
-# Load the pipeline
-pipeline = DiffusionPipeline.from_pretrained(
-    "black-forest-labs/FLUX.1-schnell",
-    torch_dtype=torch.bfloat16,
-    device_map="cuda"
-)
-
-transformer = autoquant(pipeline.transformer)
-```
-
 ## Supported quantization types
 
 torchao supports weight-only quantization and weight and dynamic-activation quantization for int8, float3-float8, and uint1-uint7.
````
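The retained docs line enumerates the supported families: weight-only and weight-plus-dynamic-activation quantization for int8, float3-float8, and uint1-uint7. As a minimal sketch (not the actual diffusers validation code, and the `wo`/`dq` suffix convention here is only illustrative), a shorthand quant-type string could be checked against those families like this:

```python
import re

# Illustrative sketch: accept torchao-style shorthands for the families named
# in the docs (int8, float3-float8, uint1-uint7), each as weight-only ("wo")
# or dynamic-activation ("dq") quantization. Not the real diffusers allow-list.
SHORTHAND_RE = re.compile(r"^(int8|float[3-8]|uint[1-7])(wo|dq)$")

def is_supported_shorthand(name: str) -> bool:
    """Return True if `name` matches one of the documented families."""
    return SHORTHAND_RE.match(name) is not None
```

For example, `is_supported_shorthand("float8wo")` is true, while `"uint8wo"` falls outside the uint1-uint7 range and is rejected.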

src/diffusers/quantizers/quantization_config.py

Lines changed: 1 addition & 1 deletion
```diff
@@ -623,7 +623,7 @@ def _get_torchao_quant_type_to_method(cls):
         """
 
         if is_torchao_available():
-            # TODO(aryan): Support autoquant and sparsify
+            # TODO(aryan): Support sparsify
             from torchao.quantization import (
                 float8_dynamic_activation_float8_weight,
                 float8_static_activation_float8_weight,
```
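The edited TODO lives in `_get_torchao_quant_type_to_method`, which maps user-facing quant-type strings to torchao constructors, importing torchao lazily so the module loads even when torchao is absent. A hedged sketch of that registry pattern, with stand-in callables instead of the real torchao imports:

```python
from functools import partial

# Stand-in for a torchao quantization constructor; the real registry binds
# names like "int8wo" to functions imported from torchao.quantization.
def _fake_weight_only(bits: int) -> str:
    return f"int{bits}-weight-only"

def get_quant_type_to_method() -> dict:
    """Illustrative name-to-constructor registry (not the diffusers mapping)."""
    return {
        "int4wo": partial(_fake_weight_only, 4),
        "int8wo": partial(_fake_weight_only, 8),
    }
```

Building the dict inside the function mirrors the lazy-import guard in the real code: the mapping is only constructed once availability has been checked.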

src/diffusers/quantizers/torchao/torchao_quantizer.py

Lines changed: 0 additions & 1 deletion
```diff
@@ -344,7 +344,6 @@ def get_cuda_warm_up_factor(self):
         from torchao.core.config import AOBaseConfig
 
         quant_type = self.quantization_config.quant_type
-        # For autoquant case, it will be treated in the string implementation below in map_to_target_dtype
         if isinstance(quant_type, AOBaseConfig):
             # Extract size digit using fuzzy match on the class name
             config_name = quant_type.__class__.__name__
```
