4 changes: 4 additions & 0 deletions docs/source/guides/9_autotune.rst
@@ -243,6 +243,10 @@ To use remote autotuning during Q/DQ placement optimization, run with ``trtexec``
* Valid remote autotuning configuration
* ``--use_trtexec`` must be set (benchmarking uses ``trtexec`` instead of the TensorRT Python API)
* ``--safe --skipInference`` must be enabled via ``--trtexec_benchmark_args``
* ``ssh`` and ``scp`` must be available on the local machine
* ``sshpass`` must be available on the local machine if using password authentication
* Only one remote autotuning instance can run at a time. The remote timing server and the latency measurement processes share the GPU but do not coordinate execution, so latency measurements are inaccurate when multiple instances run concurrently.
* ``--useCudaGraph`` is added to the latency measurement run to improve accuracy.

Replace ``<remote autotuning config>`` with an actual remote autotuning configuration string (see ``trtexec --help`` for more details). Other TensorRT benchmark options (e.g. ``--timing_cache``, ``--warmup_runs``, ``--timing_runs``, ``--plugin_libraries``) are also available; run ``--help`` for details.
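
The local tool prerequisites listed above (``ssh``, ``scp``, and ``sshpass`` for password authentication) can be checked before starting a run. The following is a minimal sketch, not part of the autotuner itself: the function name ``missing_remote_tools`` and its parameters are hypothetical, and it only assumes the tools are looked up on ``PATH``.

```python
import shutil


def missing_remote_tools(use_password_auth=False, which=shutil.which):
    """Return the required command-line tools that are not found on PATH.

    Hypothetical helper: `which` is injectable so the lookup can be replaced
    in tests; by default it is the standard-library `shutil.which`.
    """
    required = ["ssh", "scp"]
    if use_password_auth:
        # sshpass is only needed when authenticating with a password.
        required.append("sshpass")
    return [tool for tool in required if which(tool) is None]


if __name__ == "__main__":
    missing = missing_remote_tools(use_password_auth=True)
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required tools found.")
```

Running such a check up front surfaces a missing tool before any remote benchmarking starts, rather than partway through a run.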
