
[6034518] Downgrade TRT support for remote autotuning in Autotune from 10.16 to 10.15#1259

Open
gcunhase wants to merge 2 commits into NVIDIA:main from gcunhase:dev/gcunhasergio/6034518_autotune_trt10.15

Conversation

Contributor

@gcunhase gcunhase commented Apr 14, 2026

What does this PR do?

Type of change: Bug fix

Remote autotuning is supported in TensorRT starting with version 10.15, but it fails in Autotune because the version check requires 10.16+. This PR relaxes that check to 10.15 and updates the documentation accordingly.
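The fix is a threshold change in a version gate. As a minimal sketch (not the actual `_check_for_tensorrt` implementation; the function name here is illustrative), the corrected comparison looks like:

```python
def supports_remote_autotuning(trt_version: str) -> bool:
    """Gate corrected by this PR: remote autotuning needs TRT >= 10.15, not 10.16."""
    major, minor = (int(part) for part in trt_version.split(".")[:2])
    return (major, minor) >= (10, 15)
```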

Usage

# Add a code snippet demonstrating how to use this
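As a hedged sketch of the usage this PR enables (not the project's actual CLI; the flags come from this PR's discussion, and the model and config paths are placeholders), the remote-autotuning trtexec invocation could be assembled like this:

```python
# Hedged sketch: flags taken from this PR's discussion (--remoteAutoTuningConfig,
# --safe, --skipInference); "model.onnx" and "remote_config.json" are placeholders.
def build_trtexec_args(onnx_path: str, remote_config: str) -> list[str]:
    """Assemble a trtexec command for remote autotuning (TensorRT >= 10.15)."""
    return [
        "trtexec",
        f"--onnx={onnx_path}",
        f"--remoteAutoTuningConfig={remote_config}",
        "--safe",           # required for remote autotuning
        "--skipInference",  # build only; skip the inference pass
    ]

print(" ".join(build_trtexec_args("model.onnx", "remote_config.json")))
```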

Testing

See bug 6034518.

Before your PR is "Ready for review"

  • Is this change backward compatible?: ✅
  • If you copied code from any other sources or added a new PIP dependency, did you follow guidance in CONTRIBUTING.md: N/A
  • Did you write any new necessary tests?: N/A
  • Did you update Changelog?: ✅

Summary by CodeRabbit

  • Documentation

    • Added Remote Autotuning guide for TensorRT 10.15+ with CLI invocation examples and explicit requirements.
    • Updated examples and command documentation for remote autotuning configuration.
  • Updates

    • Reduced TensorRT minimum version requirement for remote autotuning from 10.16 to 10.15.
    • Clarified CLI help text for autotuning arguments.

Signed-off-by: gcunhase <4861122+gcunhase@users.noreply.github.com>
Signed-off-by: gcunhase <4861122+gcunhase@users.noreply.github.com>
@gcunhase gcunhase requested a review from a team as a code owner April 14, 2026 16:49
Contributor

coderabbitai bot commented Apr 14, 2026

📝 Walkthrough

Walkthrough

Updated TensorRT minimum version requirement for remote autotuning from 10.16 to 10.15 across changelog, documentation, examples, CLI help text, and version checks.

Changes

  • Changelog & Documentation (CHANGELOG.rst, docs/source/guides/9_autotune.rst, examples/onnx_ptq/autotune/README.md): Updated TensorRT version from 10.16 to 10.15 for the remote autotuning feature. Added a new documentation section describing remote autotuning with safety mode, requirements, and a CLI invocation example with --safe and --skipInference flags.
  • Implementation & CLI (modelopt/onnx/quantization/__main__.py, modelopt/onnx/quantization/autotune/benchmark.py): Updated CLI help text to clarify --autotune_trtexec_args behavior and its relevance to the trtexec workflow. Changed the version gate in the remote autotuning check from 10.16 to 10.15, with corresponding log message updates.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately and concisely summarizes the main change: downgrading TRT version requirement for remote autotuning from 10.16 to 10.15.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Security Anti-Patterns ✅ Passed No security anti-patterns found: no unsafe torch.load, numpy.load, trust_remote_code, eval/exec, or # nosec comments detected.




Contributor

@coderabbitai coderabbitai bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
modelopt/onnx/quantization/autotune/benchmark.py (1)

209-228: ⚠️ Potential issue | 🔴 Critical

Remove early return so remote autotuning args are actually applied.

Line 218 returns from __init__ before Line 227 extends self._base_cmd, so the --remoteAutoTuningConfig/--safe args never make it into the executed command when TRT >= 10.15. This breaks remote autotuning despite passing the version gate.

💡 Proposed fix
-        trtexec_args = self.trtexec_args or []
+        trtexec_args = list(self.trtexec_args or [])
         has_remote_config = any("--remoteAutoTuningConfig" in arg for arg in trtexec_args)

         if has_remote_config:
             try:
                 _check_for_tensorrt(min_version="10.15")
                 self.logger.debug("TensorRT Python API version >= 10.15 detected")
                 if "--safe" not in trtexec_args:
                     self.logger.warning(
                         "Remote autotuning requires '--safe' to be set. Adding it to trtexec arguments."
                     )
-                    self.trtexec_args.append("--safe")
-                return
+                    trtexec_args.append("--safe")
             except ImportError:
                 self.logger.warning(
                     "Remote autotuning is not supported with TensorRT version < 10.15. "
                     "Removing --remoteAutoTuningConfig from trtexec arguments"
                 )
                 trtexec_args = [
                     arg for arg in trtexec_args if "--remoteAutoTuningConfig" not in arg
                 ]
         self._base_cmd.extend(trtexec_args)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modelopt/onnx/quantization/autotune/benchmark.py` around lines 209 - 228, The
constructor's remote-autotuning branch returns early so trtexec args like
"--safe" or "--remoteAutoTuningConfig" never get appended to the final command;
remove the early return in the block that calls
_check_for_tensorrt(min_version="10.15") and instead ensure any modifications to
self.trtexec_args (and filtered local trtexec_args) are followed by extending
self._base_cmd with those trtexec_args so the arguments are applied; locate the
logic around _check_for_tensorrt, self.trtexec_args, trtexec_args and
self._base_cmd in the __init__ of the Benchmark class and adjust flow so the
successful TensorRT path does not exit before
self._base_cmd.extend(trtexec_args) executes.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: aca1a5f5-c803-49d9-8889-37357600cffb

📥 Commits

Reviewing files that changed from the base of the PR and between b6c6ec3 and 2097254.

📒 Files selected for processing (5)
  • CHANGELOG.rst
  • docs/source/guides/9_autotune.rst
  • examples/onnx_ptq/autotune/README.md
  • modelopt/onnx/quantization/__main__.py
  • modelopt/onnx/quantization/autotune/benchmark.py

@ajrasane
Contributor

PR #1259 Review: Downgrade TRT support for remote autotuning from 10.16 to 10.15

Verdict: Request Changes — core fix is correct, but docs have formatting issues and there's a code/doc inconsistency.


  1. RST formatting errors — docs/source/guides/9_autotune.rst
  • Missing blank line after .. code-block:: bash — RST requires a blank line between the directive and content. All other code blocks in this file follow this pattern.
  • Wrong indentation — content uses 2-space indent but every other code block in the file uses 3-space indent. This will prevent proper rendering.
  • Stray Markdown fence — there's a ``` closing fence after the code block that doesn't belong in RST. Will render as literal text.
  • Leading space on " Other TensorRT benchmark options..." line — creates an unintended blockquote.
  2. --skipInference inconsistency between docs and code

Docs (both RST and README) now state --safe --skipInference must be enabled, but benchmark.py:211-215 only auto-adds --safe — no check or auto-add for --skipInference:

  if "--safe" not in trtexec_args:
      self.logger.warning(...)
      self.trtexec_args.append("--safe")
  # No equivalent for --skipInference

Either:

  • Add a similar auto-add for --skipInference in the code (preferred for consistency), or
  • Clarify in docs that --skipInference is recommended but not enforced
  3. Minor: __main__.py help text

The updated help says "Only relevant with the 'trtexec' workflow enabled" — would be clearer as "Only relevant when --use_trtexec is set" since that's the actual flag name.
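The reviewer's preferred option (auto-adding both flags for consistency) could be sketched as follows. This is a hypothetical helper, not the actual benchmark.py code; the function name is illustrative:

```python
def ensure_remote_autotune_flags(trtexec_args: list[str]) -> list[str]:
    """Auto-add flags that remote autotuning requires, per the review suggestion."""
    args = list(trtexec_args)
    if any("--remoteAutoTuningConfig" in arg for arg in args):
        for required in ("--safe", "--skipInference"):
            if required not in args:
                args.append(required)  # mirror the existing --safe auto-add
    return args
```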

What looks good

  • Version downgrade 10.16 → 10.15 in benchmark.py — correct and consistent across all 3 occurrences
  • CHANGELOG.rst entry — properly placed
  • README.md updates — version numbers and --skipInference addition are consistent

@codecov

codecov bot commented Apr 14, 2026

Codecov Report

❌ Patch coverage is 0% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 77.45%. Comparing base (b6c6ec3) to head (2097254).

Files with missing lines Patch % Lines
modelopt/onnx/quantization/autotune/benchmark.py 0.00% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1259      +/-   ##
==========================================
+ Coverage   76.90%   77.45%   +0.54%     
==========================================
  Files         350      350              
  Lines       40524    40524              
==========================================
+ Hits        31166    31388     +222     
+ Misses       9358     9136     -222     
Flag Coverage Δ
examples 43.81% <0.00%> (+1.18%) ⬆️
gpu 57.43% <0.00%> (-0.10%) ⬇️
unit 55.59% <0.00%> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown.

