[6034518] Downgrade TRT support for remote autotuning in Autotune from 10.16 to 10.15 (#1259)
### What does this PR do?
Type of change: Bug fix
Remote autotuning is supported in TensorRT from version 10.15, but it fails
in Autotune because Autotune checks for 10.16+. This PR fixes that check and
updates the documentation accordingly.
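The fix amounts to lowering a minimum-version gate. A minimal sketch of the corrected check (the constant and function names below are illustrative assumptions, not the actual Model-Optimizer code):

```python
# Sketch of the corrected minimum-version gate; names are illustrative
# assumptions, not the actual Model-Optimizer API.
REMOTE_AUTOTUNE_MIN_TRT_VERSION = (10, 15)  # the check previously required (10, 16)

def parse_trt_version(version_str: str) -> tuple:
    """Turn a version string such as "10.15.0" into a comparable (major, minor) tuple."""
    return tuple(int(part) for part in version_str.split(".")[:2])

def supports_remote_autotuning(trt_version_str: str) -> bool:
    """True when the given TensorRT version supports remote autotuning."""
    return parse_trt_version(trt_version_str) >= REMOTE_AUTOTUNE_MIN_TRT_VERSION
```

With this gate, TensorRT 10.15.x is accepted while 10.14.x and below are still rejected.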
### Usage
```shell
# Sketch only: the two flags below come from this PR's docs; the
# entry-point command and model path are placeholders, not the real CLI.
<autotune-command> <model.onnx> \
  --use_trtexec \
  --trtexec_benchmark_args="--safe --skipInference"
```
### Testing
See bug 6034518.
### Before your PR is "*Ready for review*"
- Is this change backward compatible?: ✅
- If you copied code from any other sources or added a new PIP
dependency, did you follow guidance in `CONTRIBUTING.md`: N/A <!---
Mandatory -->
- Did you write any new necessary tests?: N/A <!--- Mandatory for new
features or examples. -->
- Did you update
[Changelog](https://github.com/NVIDIA/Model-Optimizer/blob/main/CHANGELOG.rst)?:
✅ <!--- Only for new features, API changes, critical bug fixes or
backward incompatible changes. -->
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Documentation**
* Added a Remote Autotuning guide for TensorRT 10.15+ with CLI examples;
updated examples to require `--safe --skipInference`.
* **Updates**
* Lowered TensorRT minimum requirement for remote autotuning from 10.16
to 10.15.
* Clarified CLI help text for trtexec/autotune arguments.
* **Bug Fixes**
* trtexec-based autotuning now verifies the trtexec executable version
when checking compatibility.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
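The bug-fix bullet above notes that the `trtexec` backend now verifies the version of the `trtexec` executable itself rather than the TensorRT Python API. A hedged sketch of how such a check could work; whether `trtexec` accepts a `--version` flag and the exact format of its output are assumptions here, not the actual implementation:

```python
import re
import subprocess

def parse_version_output(output: str):
    """Pull the first major.minor version number out of free-form text.

    Returns (major, minor), or None when no version-like token is found.
    """
    match = re.search(r"(\d+)\.(\d+)", output)
    return (int(match.group(1)), int(match.group(2))) if match else None

def trtexec_version(trtexec_path: str = "trtexec"):
    """Ask the trtexec *executable* for its version instead of trusting the
    TensorRT Python API, which may differ from the binary on PATH.
    Assumes trtexec prints its version when invoked with --version."""
    result = subprocess.run(
        [trtexec_path, "--version"], capture_output=True, text=True
    )
    return parse_version_output(result.stdout + result.stderr)
```

Checking the executable matters because the `trtexec` binary used for benchmarking can come from a different TensorRT installation than the Python bindings.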
---------
Signed-off-by: gcunhase <4861122+gcunhase@users.noreply.github.com>
Signed-off-by: dmoodie <dmoodie@nvidia.com>
Co-authored-by: dmoodie <dmoodie@nvidia.com>
Files changed:

**CHANGELOG.rst** (1 addition, under **Bug Fixes**):

- (existing) Fix Minitron pruning (``mcore_minitron``) for MoE models. Importance estimation hooks were incorrectly registered for MoE modules and the NAS step was hanging before this fix.
- (added) Fix TRT support for remote autotuning in ONNX Autotune from 10.16+ to 10.15+ and fix the TRT version check to use the ``trtexec`` version instead of the TensorRT Python API when using the ``trtexec`` backend.

**Remote Autotuning guide** (added):

TensorRT 10.15+ supports remote autotuning in safety mode (``--safe``), which allows TensorRT's optimization process to be offloaded to remote hardware. This is useful when optimizing models for different target GPUs without having direct access to them.

To use remote autotuning during Q/DQ placement optimization, run with ``trtexec`` and pass extra args:

* ``--use_trtexec`` must be set (benchmarking uses ``trtexec`` instead of the TensorRT Python API)
* ``--safe --skipInference`` must be enabled via ``--trtexec_benchmark_args``

Replace ``<remote autotuning config>`` with an actual remote autotuning configuration string (see ``trtexec --help`` for more details). Other TensorRT benchmark options (e.g. ``--timing_cache``, ``--warmup_runs``, ``--timing_runs``, ``--plugin_libraries``) are also available; run ``--help`` for details.

**Existing documentation** (updated):

- "TensorRT 10.16+ supports remote autotuning in safety mode (`--safe`)…" now reads "TensorRT 10.15+ supports remote autotuning in safety mode (`--safe`)…".
- "`--safe` must be enabled via `--trtexec_benchmark_args`" now reads "`--safe --skipInference` must be enabled via `--trtexec_benchmark_args`".