Commit f5e554c

Adding tensorRT rebuild subpage (#824)
Signed-off-by: SteveMacenski <stevenmacenski@gmail.com>
1 parent afb75a2 commit f5e554c

2 files changed

Lines changed: 129 additions & 1 deletion

Lines changed: 120 additions & 0 deletions
@@ -0,0 +1,120 @@
1+
.. _rebuilding_tensorrt_engine:
2+
3+
Rebuilding TensorRT Engine for Isaac Perceptor on Nova Carter
4+
*************************************************************
5+
6+
This is a step-by-step guide for fixing Isaac Perceptor model ("engine") compatibility issues in the NVIDIA Isaac environment. Although it was developed and tested on the NVIDIA Nova Carter robot, the same procedure should also work in other Isaac environments such as Isaac Sim.
7+
8+
Among the nodes and packages Perceptor uses for 3-D scene reconstruction is a set of "engine" and "plan" files. These are the actual neural network models that Perceptor components such as ``nvblox`` use for tasks like object recognition, semantic segmentation, and image disparity calculation.
9+
10+
**Problem**: Incompatible "engine" files.
11+
12+
When running the Isaac ROS Perceptor node, the following error occurs::
13+
14+
Error Code 6: API Usage Error (The engine plan file is not compatible with this version of TensorRT, expecting library version 10.7.0.23)
15+
16+
**Root Cause**: A TensorRT ``.engine`` file only loads under the TensorRT version it was built with, so the pre-compiled ``.engine`` files must match the TensorRT version installed on the system running Perceptor. Because Perceptor typically runs from a Docker container rather than from the native host JetPack install, the TensorRT version in the container image often drifts from the one in the host JetPack.
17+
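As a minimal sketch of how to read the mismatch from the error itself, the snippet below pulls the expected TensorRT version out of the error line quoted above. The function name ``expected_trt_version`` is hypothetical and not part of any NVIDIA tool; it only wraps a ``sed`` expression.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the TensorRT version named in an
# "expecting library version X" error line read from stdin.
expected_trt_version() {
  sed -n 's/.*expecting library version \([0-9][0-9.]*\).*/\1/p'
}

# The sample line is the error message quoted in this guide.
log_line='Error Code 6: API Usage Error (The engine plan file is not compatible with this version of TensorRT, expecting library version 10.7.0.23)'

expected="$(printf '%s\n' "$log_line" | expected_trt_version)"
echo "Runtime expects TensorRT ${expected}"
```

The extracted version can then be compared against the TensorRT packages installed in the container and on the host, for example by running ``dpkg -l | grep -i nvinfer`` in each environment.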
18+
**Solution**: NVIDIA provides command-line tools for building models and converting them between formats and NVIDIA hardware platforms. One of these, ``trtexec``, can rebuild the incompatible ``.engine`` files.
19+
20+
Rebuild the engines inside the container, using a ``trtexec`` compiled against the container's TensorRT version and the "full" runtime (``--useRuntime=full``).
21+
22+
.. note::
23+
24+
The "full" runtime setting is what resolves the following error message::
25+
26+
Error Code 4: API Usage Error (Cannot deserialize engine with lean runtime...
27+
28+
Step-by-Step Resolution
29+
========================
30+
31+
Unless a step says otherwise, run everything inside the container from which you will eventually run Isaac Perceptor.
32+
Work through the steps from top to bottom, preferably from inside a ``screen(1)`` session.
33+
34+
1. Access the Running Container
35+
-------------------------------
36+
37+
.. code-block:: bash
38+
39+
docker exec -it <container_name> /bin/bash  # run from the host system
40+
41+
2. Rebuild ``trtexec`` for Compatibility
42+
----------------------------------------
43+
44+
.. code-block:: bash
45+
46+
# Navigate to TensorRT source
47+
cd /usr/src/tensorrt
48+
49+
# Clean and rebuild trtexec
50+
sudo make clean
51+
sudo make -j$(nproc) trtexec
52+
53+
# Verify new trtexec version
54+
./bin/trtexec --version
55+
56+
3. Reinstall ISAAC_ROS Assets
57+
-----------------------------
58+
59+
.. code-block:: bash
60+
61+
# Reinstall essential model packages
62+
sudo apt-get install --reinstall ros-humble-isaac-ros-ess-models-install
63+
# Add other model packages as needed
64+
65+
4. Remove Old Engine Files
66+
--------------------------
67+
68+
.. code-block:: bash
69+
70+
# Navigate to assets directory (aborts if ISAAC_ROS_ASSETS is unset)
71+
cd "${ISAAC_ROS_ASSETS:?}"
72+
73+
# Remove all existing engine files
74+
find . -name "*.engine" -delete
75+
find . -name "*.plan" -delete
76+
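Because ``find ... -delete`` is irreversible, it can help to preview what will be removed first. The following is a minimal sketch of such a dry run; ``list_engine_files`` is a hypothetical helper, and the demo file names (``ess.engine``, ``ess.plan``, ``ess.onnx``) are placeholders created in a throwaway directory, not the real asset layout.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list every .engine and .plan file under a directory
# without deleting anything.
list_engine_files() {
  local root="${1:?usage: list_engine_files <assets_dir>}"
  find "$root" -type f \( -name '*.engine' -o -name '*.plan' \) -print | sort
}

# Demo on a temporary directory so the sketch is safe to run anywhere.
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/models/ess"
touch "$demo_dir/models/ess/ess.engine" \
      "$demo_dir/models/ess/ess.plan" \
      "$demo_dir/models/ess/ess.onnx"

found="$(list_engine_files "$demo_dir")"
echo "$found"          # only the .engine and .plan files are listed
rm -rf "$demo_dir"
```

On the robot, the same helper would be pointed at the assets directory (``list_engine_files "$ISAAC_ROS_ASSETS"``) before running the two ``-delete`` commands above.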
77+
5. Modify Model Installation Scripts
78+
------------------------------------
79+
80+
For each model install script (``/opt/isaac_ros_assets/install_scripts/*.sh``):
81+
82+
.. code-block:: bash
83+
84+
# Edit the trtexec command to add --useRuntime=full
85+
sed -i 's/trtexec --onnx=/trtexec --useRuntime=full --onnx=/' install_script.sh
86+
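The ``sed`` command above edits one script at a time. A sketch of applying it to every script in a directory could look like the following. ``patch_install_scripts`` is a hypothetical helper, the demo script content is a placeholder, and GNU ``sed`` is assumed (as in the original command). Lines that already carry ``--useRuntime=`` are skipped, so running it twice does not duplicate the flag.

```shell
#!/usr/bin/env bash
# Hypothetical helper: add --useRuntime=full to each trtexec call in every
# *.sh file of a directory. A one-time .bak copy is kept next to each script.
patch_install_scripts() {
  local dir="${1:?usage: patch_install_scripts <scripts_dir>}"
  local script
  for script in "$dir"/*.sh; do
    [ -f "$script" ] || continue
    [ -e "$script.bak" ] || cp -p "$script" "$script.bak"
    # Only touch lines that do not already set --useRuntime=
    sed -i '/--useRuntime=/!s/trtexec --onnx=/trtexec --useRuntime=full --onnx=/' "$script"
  done
}

# Demo on a temporary directory with a placeholder install script.
demo_dir="$(mktemp -d)"
printf '%s\n' 'trtexec --onnx=model.onnx --saveEngine=model.engine' \
  > "$demo_dir/install_demo.sh"

patch_install_scripts "$demo_dir"
patch_install_scripts "$demo_dir"   # second run leaves the file unchanged

patched="$(cat "$demo_dir/install_demo.sh")"
echo "$patched"
rm -rf "$demo_dir"
```

In the container this would be called on ``/opt/isaac_ros_assets/install_scripts``, most likely with ``sudo`` since the scripts live under ``/opt``.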
87+
6. Regenerate Engines with Full Runtime
88+
---------------------------------------
89+
90+
.. code-block:: bash
91+
92+
# Run each modified installation script
93+
sudo /opt/isaac_ros_assets/install_scripts/isaac_ros_ess_models_install.sh
94+
# Repeat for other model scripts as needed
95+
96+
7. Commit the Container Changes
97+
-------------------------------
98+
99+
.. code-block:: bash
100+
101+
# From host system
102+
docker ps # Get container ID/name
103+
docker commit <container_name> nova_carter_with_new_engines
104+
105+
8. Launch Perceptor
106+
-------------------
107+
108+
.. code-block:: bash
109+
110+
# Inside container
111+
ros2 launch nova_carter_bringup perceptor.launch.py
112+
113+
If Perceptor loads the new ``.engine`` files successfully, you should see console output similar to the following for each camera in use::
114+
115+
[component_container_mt-9] [INFO] [1765602288.918936247] [right_stereo_camera.left.rectify_node]: Negotiating
116+
[INFO] [launch_ros.actions.load_composable_nodes]: Loaded node '/right_stereo_camera/ess_node' in container 'nova_container'
117+
[component_container_mt-9] [INFO] [1765602288.919007703] [right_stereo_camera.right.rectify_node]: Negotiating
118+
[component_container_mt-9] [INFO] [1765602288.920391927] [nova_container]: Load Library: /opt/ros/humble/lib/libvisual_slam_node.so
119+
120+
Because the ``.engine`` models are used by multiple nodes, such as ``dnn_stereo_disparity`` and ``nvblox``, a successful Perceptor launch produces a lot of console output before it eventually quiets down and keeps running.

tutorials/docs/using_isaac_perceptor.rst

Lines changed: 9 additions & 1 deletion
@@ -30,6 +30,14 @@ This tutorial will make use of the `Jetson AGX Orin <https://amzn.to/4k8jiQh>`_
3030
However, another Jetson product may suffice depending on the GPU compute demands placed on it by the number of cameras, resolutions, and models being run.
3131
Applying these technologies to a non-Nova design is possible using the general concepts and designs in this tutorial, however it involves a great deal of unique development as the launch files and nodes provided by NVIDIA assume this.
3232

33+
Additional Resources
34+
====================
35+
36+
.. toctree::
37+
:maxdepth: 1
38+
39+
isaac_perceptor/rebuilding_tensorrt_engine.rst
40+
3341
Concepts
3442
========
3543

@@ -493,7 +501,7 @@ Isaac Perceptor API Usage Errors
493501

494502
This error occurs when the packages on the Nova Carter host install are different from those installed on the Docker container, specifically the TensorRT and nvblox packages. More specifically, this often occurs because the Nova Carter JetPack install provides 10.3.x versions of TensorRT(tensorrt,nvinfer,etc.) and the development Docker containers use 10.7.x
495503

496-
See *Rebuilding TensorRT .engine files* for a step-by-step guide to fix this error.
504+
See :ref:`rebuilding_tensorrt_engine` for a step-by-step guide to fix this error.
497505

498506
``Error Code 4: API Usage Error``
499507
