Commit 4012727

Update documentation for TRT support in Autotune: 10.16 -> 10.15

Signed-off-by: gcunhase <4861122+gcunhase@users.noreply.github.com>
1 parent b6c6ec3 commit 4012727

4 files changed: 35 additions & 9 deletions

File tree

docs/source/guides/9_autotune.rst

Lines changed: 26 additions & 0 deletions
@@ -221,6 +221,32 @@ If the model uses custom TensorRT operations, provide the plugin libraries:
        --output_dir ./results \
        --plugin_libraries /path/to/plugin1.so /path/to/plugin2.so
 
+Remote Autotuning
+-----------------------
+
+TensorRT 10.15+ supports remote autotuning in safety mode (``--safe``), which allows TensorRT's optimization process to be offloaded to remote hardware. This is useful when optimizing models for different target GPUs without direct access to them.
+
+To use remote autotuning during Q/DQ placement optimization, run with ``trtexec`` and pass extra args:
+
+.. code-block:: bash
+
+    python -m modelopt.onnx.quantization.autotune \
+        --onnx_path resnet50_Opset17_bs128.onnx \
+        --output_dir ./resnet50_remote_autotuned \
+        --schemes_per_region 50 \
+        --use_trtexec \
+        --trtexec_benchmark_args "--remoteAutoTuningConfig=\"<remote autotuning config>\" --safe --skipInference"
+
+**Requirements:**
+
+* TensorRT 10.15 or later
+* Valid remote autotuning configuration
+* ``--use_trtexec`` must be set (benchmarking uses ``trtexec`` instead of the TensorRT Python API)
+* ``--safe --skipInference`` must be enabled via ``--trtexec_benchmark_args``
+
+Replace ``<remote autotuning config>`` with an actual remote autotuning configuration string (see ``trtexec --help`` for more details).
+Other TensorRT benchmark options (e.g. ``--timing_cache``, ``--warmup_runs``, ``--timing_runs``, ``--plugin_libraries``) are also available; run ``--help`` for details.
 
 Low-Level API Usage
 ===================
 
examples/onnx_ptq/autotune/README.md

Lines changed: 4 additions & 4 deletions
@@ -229,7 +229,7 @@ python3 -m modelopt.onnx.quantization.autotune \
 
 ## Remote Autotuning with TensorRT
 
-TensorRT 10.16+ supports remote autotuning in safety mode (`--safe`), which allows TensorRT's optimization process to be offloaded to a remote hardware. This is useful when optimizing models for different target GPUs without having direct access to them.
+TensorRT 10.15+ supports remote autotuning in safety mode (`--safe`), which allows TensorRT's optimization process to be offloaded to remote hardware. This is useful when optimizing models for different target GPUs without direct access to them.
 
 To use remote autotuning during Q/DQ placement optimization, run with `trtexec` and pass extra args:
 
@@ -239,15 +239,15 @@ python3 -m modelopt.onnx.quantization.autotune \
   --output_dir ./resnet50_remote_autotuned \
   --schemes_per_region 50 \
   --use_trtexec \
-  --trtexec_benchmark_args "--remoteAutoTuningConfig=\"<remote autotuning config>\" --safe"
+  --trtexec_benchmark_args "--remoteAutoTuningConfig=\"<remote autotuning config>\" --safe --skipInference"
 ```
 
 **Requirements:**
 
-- TensorRT 10.16 or later
+- TensorRT 10.15 or later
 - Valid remote autotuning configuration
 - `--use_trtexec` must be set (benchmarking uses `trtexec` instead of the TensorRT Python API)
-- `--safe` must be enabled via `--trtexec_benchmark_args`
+- `--safe --skipInference` must be enabled via `--trtexec_benchmark_args`
 
 Replace `<remote autotuning config>` with an actual remote autotuning configuration string (see `trtexec --help` for more details).
 Other TensorRT benchmark options (e.g. `--timing_cache`, `--warmup_runs`, `--timing_runs`, `--plugin_libraries`) are also available; run `--help` for details.
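The nested quoting in `--trtexec_benchmark_args` is easy to get wrong. A quick way to check how the shell resolves it before running the real tool (a sketch using `printf`; the config value stays a placeholder):

```shell
# Print the argument exactly as the Python CLI would receive it: the outer
# double quotes group everything into one string, and the escaped inner
# quotes survive for trtexec's --remoteAutoTuningConfig value.
printf '%s\n' "--remoteAutoTuningConfig=\"<remote autotuning config>\" --safe --skipInference"
```

If the printed string is not a single line with intact inner quotes, the escaping will not survive the trip into `trtexec` either.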

modelopt/onnx/quantization/__main__.py

Lines changed: 2 additions & 2 deletions
@@ -398,8 +398,8 @@ def get_parser() -> argparse.ArgumentParser:
         type=str,
         default=None,
         help=(
-            "Additional trtexec arguments as a single quoted string. "
-            "Example: --autotune_trtexec_args '--fp16 --workspace=4096'"
+            "Additional 'trtexec' arguments as a single quoted string. Only relevant with the 'trtexec' workflow "
+            "enabled. Example (simple): '--fp16 --workspace=4096'"
         ),
     )
     return argparser
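The help text above describes a flag that accepts all extra `trtexec` arguments as one quoted string. A minimal standalone sketch of that argparse pattern (the flag name and example values here are assumed for illustration, not taken from the module):

```python
import argparse

# Minimal sketch: one string-typed flag collects all extra trtexec arguments.
parser = argparse.ArgumentParser()
parser.add_argument(
    "--trtexec_benchmark_args",
    type=str,
    default=None,
    help="Additional 'trtexec' arguments as a single quoted string.",
)

# The whole quoted string arrives as one value and can be split downstream.
ns = parser.parse_args(["--trtexec_benchmark_args", "--fp16 --workspace=4096"])
print(ns.trtexec_benchmark_args.split())  # ['--fp16', '--workspace=4096']
```

Keeping the extra arguments as a single string sidesteps argparse trying to interpret `--fp16` and friends as its own flags.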

modelopt/onnx/quantization/autotune/benchmark.py

Lines changed: 3 additions & 3 deletions
@@ -208,8 +208,8 @@ def __init__(
 
         if has_remote_config:
             try:
-                _check_for_tensorrt(min_version="10.16")
-                self.logger.debug("TensorRT Python API version >= 10.16 detected")
+                _check_for_tensorrt(min_version="10.15")
+                self.logger.debug("TensorRT Python API version >= 10.15 detected")
                 if "--safe" not in trtexec_args:
                     self.logger.warning(
                         "Remote autotuning requires '--safe' to be set. Adding it to trtexec arguments."
@@ -218,7 +218,7 @@ def __init__(
                 return
             except ImportError:
                 self.logger.warning(
-                    "Remote autotuning is not supported with TensorRT version < 10.16. "
+                    "Remote autotuning is not supported with TensorRT version < 10.15. "
                     "Removing --remoteAutoTuningConfig from trtexec arguments"
                 )
                 trtexec_args = [
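The logic above degrades gracefully when the installed TensorRT is too old: remote autotuning args are kept only when the version check passes, and `--remoteAutoTuningConfig` is stripped otherwise. A simplified, self-contained sketch of that gate (the function name and signature are illustrative, not the module's API):

```python
# Illustrative sketch of the version gate in the diff above: keep remote
# autotuning args only when TensorRT is new enough, and make sure --safe
# accompanies them; otherwise drop --remoteAutoTuningConfig entirely.
def gate_remote_args(trtexec_args, trt_version, min_version=(10, 15)):
    installed = tuple(int(p) for p in trt_version.split("."))
    if installed >= min_version:
        if "--safe" not in trtexec_args:
            # Mirrors the warning path: remote autotuning requires --safe.
            trtexec_args = [*trtexec_args, "--safe"]
        return trtexec_args
    # Too old: strip the unsupported flag, keep everything else.
    return [a for a in trtexec_args if not a.startswith("--remoteAutoTuningConfig")]
```

Silently repairing or stripping the flag (with a warning) lets the rest of the benchmarking flow proceed on older TensorRT installs instead of failing outright.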
