NVIDIA® TensorRT™ is an ecosystem of APIs for high-performance deep learning inference. TensorRT includes an inference runtime and model optimizations that deliver low latency and high throughput for production applications. The TensorRT ecosystem includes TensorRT, TensorRT-LLM, TensorRT Model Optimizer, and TensorRT Cloud.
In this sample, we build a classification neural network in PyTorch to recognize handwritten digits. The model is trained and tested on the MNIST dataset, then converted to a TensorRT engine to accelerate inference.
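To make the pipeline concrete, here is a minimal sketch of what such a PyTorch digit classifier can look like. This is an illustrative stand-in, not the exact network in the NVIDIA sample: the layer sizes and the `Net` name are assumptions.

```python
import torch
import torch.nn as nn

class Net(nn.Module):
    # A minimal LeNet-style classifier for 28x28 MNIST digits (illustrative only).
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                              # 28x28 -> 14x14
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                              # 14x14 -> 7x7
        )
        self.classifier = nn.Linear(32 * 7 * 7, 10)       # 10 digit classes

    def forward(self, x):
        x = self.features(x)
        return self.classifier(torch.flatten(x, 1))

model = Net().eval()
logits = model(torch.randn(1, 1, 28, 28))  # one fake grayscale image
print(logits.shape)                        # one score per digit class
```

After training on MNIST, a network of this shape outputs ten logits per image; the predicted digit is the index of the largest one.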
This sample is adapted from an NVIDIA test case; here we deploy and run it on a Jetson device.
Note: For more information, please refer to the resources listed at the end of this page.
JetPack ships with TensorRT pre-installed. If it is missing, you can install the JetPack components with:

```bash
sudo apt install nvidia-jetpack
```

Run the following command in the terminal. If the TensorRT version is printed, TensorRT is installed correctly:

```bash
python3 -c "import tensorrt as trt; print(f'TensorRT version: {trt.__version__}')"
```

Please refer to Module 3.3 for the installation of PyTorch and Torchvision. This tutorial was tested with torch 2.0.0.
Download the Python scripts from the scripts folder and copy them to the Jetson device.
Open the terminal and run:
```bash
cd <path-of-scripts>
python3 sample.py
```

If the sample runs successfully, you should see the prediction match the test case:
```
Test Case: 4
Prediction: 4
```

Note: Please ignore any warning messages that appear during the execution of the program.
| Tutorial | Type | Description |
|---|---|---|
| TensorRT Getting Started | website | NVIDIA's official getting started tutorial |
