diff --git a/CHANGELOG.rst b/CHANGELOG.rst
index 51ef630f6c..86b17315a3 100755
--- a/CHANGELOG.rst
+++ b/CHANGELOG.rst
@@ -20,7 +20,7 @@ NVIDIA Model Optimizer Changelog
 - Add ``nvfp4_omlp_only`` quantization format for NVFP4 quantization. This is similar to ``nvfp4_mlp_only`` but also quantizes the output projection layer in attention.
 - ``pass_through_bwd`` in the quantization config now defaults to True. Please set it to False if you want to use STE with zeroed outlier gradients for potentially better QAT accuracy.
 - Add :meth:`compute_quantization_mse` API to measure per-quantizer mean-squared quantization error, with flexible wildcard and callable filtering.
-- **AutoQDQ**: New tool for automated Q/DQ (Quantize/Dequantize) placement optimization for ONNX models. Uses TensorRT latency measurements to choose insertion schemes that minimize inference time. Discovers regions automatically, groups them by structural pattern, and tests multiple Q/DQ schemes per pattern. Supports INT8 and FP8 quantization, pattern cache for warm-start on similar models, checkpoint/resume, and importing patterns from an existing QDQ baseline. CLI: ``python -m modelopt.onnx.quantization.autotune``. See the AutoQDQ guide in the documentation.
+- **Autotune**: New tool for automated Q/DQ (Quantize/Dequantize) placement optimization for ONNX models. Uses TensorRT latency measurements to choose insertion schemes that minimize inference time. Discovers regions automatically, groups them by structural pattern, and tests multiple Q/DQ schemes per pattern. Supports INT8 and FP8 quantization, a pattern cache for warm-starting on similar models, checkpoint/resume, and importing patterns from an existing QDQ baseline. CLI: ``python -m modelopt.onnx.quantization.autotune``. See the Autotune guide in the documentation.
 - Add ``get_auto_quantize_config`` API to extract a flat quantization config from ``auto_quantize`` search results, enabling re-quantization at different effective bit targets without re-running calibration.
 - Improve ``auto_quantize`` checkpoint/resume: calibration state is now saved and restored across runs, avoiding redundant calibration when resuming a search.
 - Add support for Nemotron-3 (NemotronHForCausalLM) model quantization and for NemotronH MoE experts in ``auto_quantize`` grouping and scoring rules.
diff --git a/docs/source/guides/9_autoqdq.rst b/docs/source/guides/9_autotune.rst
similarity index 99%
rename from docs/source/guides/9_autoqdq.rst
rename to docs/source/guides/9_autotune.rst
index 041f17ce3d..6561ba1d20 100644
--- a/docs/source/guides/9_autoqdq.rst
+++ b/docs/source/guides/9_autotune.rst
@@ -1,5 +1,5 @@
 ===============================================
-Automated Q/DQ Placement Optimization (ONNX)
+Autotune (ONNX)
 ===============================================
 
 .. contents:: Table of Contents
@@ -9,7 +9,7 @@ Automated Q/DQ Placement Optimization (ONNX)
 Overview
 ========
 
-The ``modelopt.onnx.quantization.autotune`` module automates Q/DQ (Quantize/Dequantize) placement in ONNX models. It explores placement strategies and uses TensorRT latency measurements to choose a configuration that minimizes inference time.
+The ``modelopt.onnx.quantization.autotune`` module automates Q/DQ (Quantize/Dequantize) placement optimization in ONNX models. It explores placement strategies and uses TensorRT latency measurements to choose a configuration that minimizes inference time.
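+
+For orientation, a minimal invocation of the CLI entry point might look like the sketch below. The flag names are illustrative assumptions borrowed from the main ``modelopt.onnx.quantization`` CLI, not a verified option list:
+
+.. code-block:: bash
+
+    # Sketch only: --onnx_path and --output_path are assumed flag names,
+    # borrowed from the main quantization CLI. Verify against the
+    # module's --help output before use.
+    python -m modelopt.onnx.quantization.autotune \
+        --onnx_path=model.quant.onnx \
+        --output_path=model.autotuned.onnx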
 
 **Key Features:**
diff --git a/examples/cnn_qat/README.md b/examples/cnn_qat/README.md
index c421ce868c..3d578c5930 100644
--- a/examples/cnn_qat/README.md
+++ b/examples/cnn_qat/README.md
@@ -143,4 +143,4 @@ Your actual results will vary based on the dataset, specific hyperparameters, an
 
 ## Deployment with TensorRT
 
-The final model after QAT, saved using `mto.save()`, contains both the model weights and the quantization metadata. This model can be deployed to TensorRT for inference after ONNX export. The process is generally similar to [deploying a ONNX PTQ](../onnx_ptq/README.md#evaluate-the-quantized-onnx-model) model from ModelOpt.
+The final model after QAT, saved using `mto.save()`, contains both the model weights and the quantization metadata. This model can be deployed to TensorRT for inference after ONNX export. The process is generally similar to [deploying an ONNX PTQ](../onnx_ptq/README.md#evaluate-the-quantized-onnx-model) model from ModelOpt.
diff --git a/examples/onnx_ptq/README.md b/examples/onnx_ptq/README.md
index 980a264938..0cfd4ea62f 100644
--- a/examples/onnx_ptq/README.md
+++ b/examples/onnx_ptq/README.md
@@ -219,6 +219,25 @@ trtexec --onnx=/path/to/identity_neural_network.quant.onnx \
     --staticPlugins=/path/to/libidentity_conv_iplugin_v2_io_ext.so
 ```
 
+### Optimize Q/DQ node placement with Autotune
+
+This feature automates Q/DQ (Quantize/Dequantize) node placement optimization for ONNX models using TensorRT performance measurements.
+For more information on the standalone toolkit, please refer to [autotune](./autotune).
+
+To enable this feature in the ONNX quantization workflow, simply add `--autotune` to your CLI command:
+
+```bash
+python -m modelopt.onnx.quantization \
+    --onnx_path=vit_base_patch16_224.onnx \
+    --quantize_mode=<int8|fp8> \
+    --calibration_data=calib.npy \
+    --calibration_method=<max|entropy> \
+    --output_path=vit_base_patch16_224.quant.onnx \
+    --autotune
+```
+
+For finer-grained Autotune options and the full list of flags, please refer to the [API guide](https://nvidia.github.io/Model-Optimizer/guides/_onnx_quantization.html).
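+
+Because Autotune selects placements based on measured TensorRT latency, a quick way to sanity-check the result is to time the autotuned model with `trtexec` and compare it against a model quantized the same way but without `--autotune` (the file name below is from the example above):
+
+```bash
+# Build a TensorRT engine from the autotuned model and report latency;
+# run the same command on a model quantized without --autotune to compare.
+trtexec --onnx=vit_base_patch16_224.quant.onnx
+```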
+
 ## Resources
 
 - 📅 [Roadmap](https://github.com/NVIDIA/Model-Optimizer/issues/146)
diff --git a/examples/onnx/autoqdq/README.md b/examples/onnx_ptq/autotune/README.md
similarity index 100%
rename from examples/onnx/autoqdq/README.md
rename to examples/onnx_ptq/autotune/README.md
diff --git a/tests/gpu/onnx/test_concat_elim.py b/tests/gpu/onnx/quantization/test_concat_elim.py
similarity index 100%
rename from tests/gpu/onnx/test_concat_elim.py
rename to tests/gpu/onnx/quantization/test_concat_elim.py
diff --git a/tests/gpu/onnx/test_plugin.py b/tests/gpu/onnx/quantization/test_plugin.py
similarity index 100%
rename from tests/gpu/onnx/test_plugin.py
rename to tests/gpu/onnx/quantization/test_plugin.py
diff --git a/tests/gpu/onnx/test_qdq_utils_fp8.py b/tests/gpu/onnx/quantization/test_qdq_utils_fp8.py
similarity index 100%
rename from tests/gpu/onnx/test_qdq_utils_fp8.py
rename to tests/gpu/onnx/quantization/test_qdq_utils_fp8.py
diff --git a/tests/gpu/onnx/test_quantize_fp8.py b/tests/gpu/onnx/quantization/test_quantize_fp8.py
similarity index 100%
rename from tests/gpu/onnx/test_quantize_fp8.py
rename to tests/gpu/onnx/quantization/test_quantize_fp8.py
diff --git a/tests/gpu/onnx/test_quantize_onnx_torch_int4_awq.py b/tests/gpu/onnx/quantization/test_quantize_onnx_torch_int4_awq.py
similarity index 100%
rename from tests/gpu/onnx/test_quantize_onnx_torch_int4_awq.py
rename to tests/gpu/onnx/quantization/test_quantize_onnx_torch_int4_awq.py
diff --git a/tests/unit/onnx/test_convtranspose_qdq.py b/tests/unit/onnx/quantization/test_convtranspose_qdq.py
similarity index 100%
rename from tests/unit/onnx/test_convtranspose_qdq.py
rename to tests/unit/onnx/quantization/test_convtranspose_qdq.py
diff --git a/tests/unit/onnx/test_dq_transpose_surgery.py b/tests/unit/onnx/quantization/test_dq_transpose_surgery.py
similarity index 100%
rename from tests/unit/onnx/test_dq_transpose_surgery.py
rename to tests/unit/onnx/quantization/test_dq_transpose_surgery.py
diff --git a/tests/unit/onnx/test_qdq_rules_int8.py b/tests/unit/onnx/quantization/test_qdq_rules_int8.py
similarity index 100%
rename from tests/unit/onnx/test_qdq_rules_int8.py
rename to tests/unit/onnx/quantization/test_qdq_rules_int8.py
diff --git a/tests/unit/onnx/test_qdq_utils.py b/tests/unit/onnx/quantization/test_qdq_utils.py
similarity index 100%
rename from tests/unit/onnx/test_qdq_utils.py
rename to tests/unit/onnx/quantization/test_qdq_utils.py
diff --git a/tests/unit/onnx/test_quant_utils.py b/tests/unit/onnx/quantization/test_quant_utils.py
similarity index 100%
rename from tests/unit/onnx/test_quant_utils.py
rename to tests/unit/onnx/quantization/test_quant_utils.py
diff --git a/tests/unit/onnx/test_quantize_api.py b/tests/unit/onnx/quantization/test_quantize_api.py
similarity index 100%
rename from tests/unit/onnx/test_quantize_api.py
rename to tests/unit/onnx/quantization/test_quantize_api.py
diff --git a/tests/unit/onnx/test_quantize_int8.py b/tests/unit/onnx/quantization/test_quantize_int8.py
similarity index 100%
rename from tests/unit/onnx/test_quantize_int8.py
rename to tests/unit/onnx/quantization/test_quantize_int8.py
diff --git a/tests/unit/onnx/test_quantize_zint4.py b/tests/unit/onnx/quantization/test_quantize_zint4.py
similarity index 100%
rename from tests/unit/onnx/test_quantize_zint4.py
rename to tests/unit/onnx/quantization/test_quantize_zint4.py