|
| 1 | +.. _rebuilding_tensorrt_engine: |
| 2 | + |
| 3 | +Rebuilding TensorRT Engine for Isaac Perceptor on Nova Carter |
| 4 | +************************************************************* |
| 5 | + |
| 6 | +This is a step-by-step guide to fixing Isaac Perceptor model ("engine") compatibility issues in the NVIDIA Isaac environment. It was developed and tested on the NVIDIA Nova Carter robot, but the same steps should also work in other Isaac environments such as Isaac Sim. |
| 7 | + |
| 8 | +The collection of nodes and packages that Perceptor uses for 3-D scene reconstruction includes a set of "engine" and "plan" files. These are the actual neural network models that Perceptor components such as ``nvblox`` use for tasks like object recognition, semantic segmentation, and image disparity calculation. |
| 9 | + |
| 10 | +**Problem**: Incompatible "engine" files. |
| 11 | + |
| 12 | +When running the Isaac ROS Perceptor node, the following error occurs:: |
| 13 | + |
| 14 | + Error Code 6: API Usage Error (The engine plan file is not compatible with this version of TensorRT, expecting library version 10.7.0.23) |
| 15 | + |
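The version at the end of this message appears to be the TensorRT library version of the runtime that tried to load the engine. As a minimal sketch, the helper below (``expected_trt_version`` is a hypothetical name, not part of any NVIDIA tool) pulls that number out of a saved log line so it can be compared with the TensorRT version installed in the container:

```shell
# Sketch: extract the "expecting library version X" value from a log line.
# expected_trt_version is an illustrative helper name, not an NVIDIA tool.
expected_trt_version() {
    # Reads log text on stdin; prints only the version number, if present
    sed -n 's/.*expecting library version \([0-9][0-9.]*\).*/\1/p'
}

# Example with the message shown above:
echo 'Error Code 6: API Usage Error (The engine plan file is not compatible with this version of TensorRT, expecting library version 10.7.0.23)' \
    | expected_trt_version
# prints 10.7.0.23
```

Lines that do not contain the phrase produce no output, so the helper can be fed an entire launch log.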
| 16 | +**Root Cause**: The ``.engine`` files are shipped pre-compiled, and by default a compiled engine can only be loaded by the TensorRT version it was built with. When that version differs from the TensorRT installed on the system running Perceptor, loading fails. Since Perceptor is typically run from a Docker container rather than from the native host JetPack install, there is often some version drift between the Docker image and the host JetPack. |
| 17 | + |
| 18 | +**Solution**: Fortunately, NVIDIA provides fine-grained tools for working with CUDA and for building and converting models between different formats and NVIDIA hardware platforms. One of them, ``trtexec``, can rebuild the incompatible ``.engine`` files. |
| 19 | + |
| 20 | +Rebuild the engines inside the container, using a ``trtexec`` compiled against the container's TensorRT version and the "full" runtime option (``--useRuntime=full``). |
| 21 | + |
| 22 | +.. note:: |
| 23 | + |
| 24 | + The "full" runtime option matters because it resolves the following error message:: |
| 25 | + |
| 26 | + Error Code 4: API Usage Error (Cannot deserialize engine with lean runtime... |
| 27 | + |
| 28 | +Step-by-Step Resolution |
| 29 | +======================== |
| 30 | + |
| 31 | +All of these steps must be done inside the container that you will eventually run Isaac Perceptor from. |
| 32 | +Work through them from top to bottom, preferably inside a ``screen(1)`` session so that a dropped connection does not interrupt a long build. |
| 33 | + |
| 34 | +1. Access the Running Container |
| 35 | +------------------------------- |
| 36 | + |
| 37 | +.. code-block:: bash |
| 38 | + |
| 39 | + docker exec -it <container_name> /bin/bash |
| 40 | + |
| 41 | +2. Rebuild ``trtexec`` for Compatibility |
| 42 | +---------------------------------------- |
| 43 | + |
| 44 | +.. code-block:: bash |
| 45 | + |
| 46 | + # Navigate to TensorRT source |
| 47 | + cd /usr/src/tensorrt |
| 48 | + |
| 49 | + # Clean and rebuild trtexec |
| 50 | + sudo make clean |
| 51 | + sudo make -j$(nproc) trtexec |
| 52 | + |
| 53 | + # Verify new trtexec version |
| 54 | + ./bin/trtexec --version |
| 55 | + |
| 56 | +3. Reinstall ISAAC_ROS Assets |
| 57 | +----------------------------- |
| 58 | + |
| 59 | +.. code-block:: bash |
| 60 | + |
| 61 | + # Reinstall essential model packages |
| 62 | + sudo apt-get install --reinstall isaac_ros_ess_models_install |
| 63 | + # Add other model packages as needed |
| 64 | + |
| 65 | +4. Remove Old Engine Files |
| 66 | +-------------------------- |
| 67 | + |
| 68 | +.. code-block:: bash |
| 69 | + |
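| 70 | + # Navigate to the assets directory (fails with a message if the variable is unset) |
| 71 | + cd "${ISAAC_ROS_ASSETS:?is not set}" |
| 72 | + |
| 73 | + # Remove all existing engine files |
| 74 | + find "${ISAAC_ROS_ASSETS:?is not set}" -name "*.engine" -delete |
| 75 | + find "${ISAAC_ROS_ASSETS:?is not set}" -name "*.plan" -delete |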
| 76 | + |
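Because ``-delete`` cannot be undone, it can help to preview the matches first. A minimal sketch, assuming only standard ``find``; the function name ``list_engine_files`` is illustrative:

```shell
# Sketch: list the files that the delete step would remove, without deleting.
# list_engine_files is an illustrative helper name.
list_engine_files() {
    # $1 is the directory to search, e.g. the Isaac ROS assets directory
    find "$1" -type f \( -name "*.engine" -o -name "*.plan" \) -print
}

# Example (variable taken from the step above):
# list_engine_files "$ISAAC_ROS_ASSETS"
```

If the list contains anything unexpected, stop and check the assets path before running the delete commands.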
| 77 | +5. Modify Model Installation Scripts |
| 78 | +------------------------------------ |
| 79 | + |
| 80 | +For each model install script (``/opt/isaac_ros_assets/install_scripts/*.sh``): |
| 81 | + |
| 82 | +.. code-block:: bash |
| 83 | + |
| 84 | + # Edit the trtexec command to add --useRuntime=full (replace install_script.sh with the actual script name) |
| 85 | + sed -i 's/trtexec --onnx=/trtexec --useRuntime=full --onnx=/' install_script.sh |
| 86 | + |
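The same substitution can be applied to every script in the directory in one pass. The sketch below assumes GNU ``sed`` (as in the command above) and that each script calls ``trtexec --onnx=`` in that exact form; the function name and the ``.bak`` backup suffix are illustrative choices:

```shell
# Sketch: add --useRuntime=full to every install script in a directory.
# add_full_runtime_flag and the .bak suffix are illustrative, not NVIDIA conventions.
add_full_runtime_flag() {
    script_dir="$1"
    for script in "$script_dir"/*.sh; do
        # Skip the unexpanded pattern when no script matches
        [ -f "$script" ] || continue
        # Keep one copy of the original; never overwrite an existing backup
        [ -e "$script.bak" ] || cp "$script" "$script.bak"
        # Same substitution as in the step above
        sed -i 's/trtexec --onnx=/trtexec --useRuntime=full --onnx=/' "$script"
    done
}

# Example (path taken from the step above; may need root permissions):
# add_full_runtime_flag /opt/isaac_ros_assets/install_scripts
# grep -n 'trtexec' /opt/isaac_ros_assets/install_scripts/*.sh
```

Running it twice is harmless: once the flag is present, the pattern ``trtexec --onnx=`` no longer matches, so the flag is not added a second time.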
| 87 | +6. Regenerate Engines with Full Runtime |
| 88 | +--------------------------------------- |
| 89 | + |
| 90 | +.. code-block:: bash |
| 91 | + |
| 92 | + # Run each modified installation script |
| 93 | + sudo /opt/isaac_ros_assets/install_scripts/isaac_ros_ess_models_install.sh |
| 94 | + # Repeat for other model scripts as needed |
| 95 | + |
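When several model scripts need to be rerun, a small loop saves repetition and stops at the first failure, so that a broken engine build is not buried in later output. This is a sketch only: ``run_install_scripts`` is a hypothetical helper, and it does not call ``sudo`` itself, so run it from a shell that already has the required permissions:

```shell
# Sketch: run every install script in a directory, stopping at the first failure.
# run_install_scripts is an illustrative helper name.
run_install_scripts() {
    script_dir="$1"
    for script in "$script_dir"/*.sh; do
        # Skip the unexpanded pattern when no script matches
        [ -f "$script" ] || continue
        echo "Running $script"
        # Abort the loop as soon as one script exits with an error
        bash "$script" || { echo "FAILED: $script" >&2; return 1; }
    done
}

# Example (path taken from the step above):
# run_install_scripts /opt/isaac_ros_assets/install_scripts
```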
| 96 | +7. Commit the Container Changes |
| 97 | +------------------------------- |
| 98 | + |
| 99 | +.. code-block:: bash |
| 100 | + |
| 101 | + # From host system |
| 102 | + docker ps # Get container ID/name |
| 103 | + docker commit <container_name> nova_carter_with_new_engines |
| 104 | + |
| 105 | +8. Launch Perceptor |
| 106 | +------------------- |
| 107 | + |
| 108 | +.. code-block:: bash |
| 109 | + |
| 110 | + # Inside container |
| 111 | + ros2 launch nova_carter_bringup perceptor.launch.py |
| 112 | + |
| 113 | +If Perceptor reads the new ``.engine`` files successfully, you should see console output similar to this for each camera used:: |
| 114 | + |
| 115 | + [component_container_mt-9] [INFO] [1765602288.918936247] [right_stereo_camera.left.rectify_node]: Negotiating |
| 116 | + [INFO] [launch_ros.actions.load_composable_nodes]: Loaded node '/right_stereo_camera/ess_node' in container 'nova_container' |
| 117 | + [component_container_mt-9] [INFO] [1765602288.919007703] [right_stereo_camera.right.rectify_node]: Negotiating |
| 118 | + [component_container_mt-9] [INFO] [1765602288.920391927] [nova_container]: Load Library: /opt/ros/humble/lib/libvisual_slam_node.so |
| 119 | + |
| 120 | +Because the ``.engine`` models are used by multiple nodes, such as ``dnn_stereo_disparity`` and ``nvblox``, a successful Perceptor run produces a lot of console output, but it eventually quiets down and stays running. |