Update PT2E quantization link to stable version (pytorch#20002)

wirthual · web-flow · commit a6f0cf1859c2 · 2026-06-04T09:30:20.000-07:00
diff --git a/docs/source/quantization-overview.md b/docs/source/quantization-overview.md
@@ -9,7 +9,7 @@ Quantization is especially important for deploying models on edge devices such a
 ExecuTorch uses [torchao](https://github.com/pytorch/ao/tree/main/torchao) as its quantization library. This integration allows ExecuTorch to leverage PyTorch-native tools for preparing, calibrating, and converting quantized models.
 
 
-Quantization in ExecuTorch is backend-specific. Each backend defines how models should be quantized based on its hardware capabilities. Most ExecuTorch backends use the torchao [PT2E quantization](https://docs.pytorch.org/ao/main/tutorials_source/pt2e_quant_ptq.html) flow, which works on models exported with torch.export and enables quantization that is tailored for each backend.
+Quantization in ExecuTorch is backend-specific. Each backend defines how models should be quantized based on its hardware capabilities. Most ExecuTorch backends use the torchao [PT2E quantization](https://docs.pytorch.org/ao/stable/pt2e_quantization/pt2e_quant_ptq.html) flow, which works on models exported with torch.export and enables quantization that is tailored for each backend.
 
 The PT2E quantization workflow has three main steps: