Remove trt-llm from docs

oyilmaz-nvidia · oyilmaz-nvidia · commit 7cb77e5f5ff7 · 2026-05-29T16:14:44.000-04:00
Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>
diff --git a/README.md b/README.md
@@ -12,7 +12,7 @@
 [![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/release/python-3100/)
 [![GitHub Stars](https://img.shields.io/github/stars/NVIDIA-NeMo/Export-Deploy.svg?style=social&label=Star)](https://github.com/NVIDIA-NeMo/Export-Deploy/stargazers/)
 
-<!-- **Library with tooling and APIs for exporting and deploying NeMo and Hugging Face models with support of backends like  TensorRT, TensorRT-LLM and vLLM through NVIDIA Triton Inference Server.** -->
+<!-- **Library with tooling and APIs for exporting and deploying NeMo and Hugging Face models with support of backends like TensorRT and vLLM through NVIDIA Triton Inference Server.** -->
 
 [![📖 Documentation](https://img.shields.io/badge/docs-nvidia-informational?logo=book)](https://docs.nvidia.com/nemo/export-deploy/latest/index.html)
 [![🔧 Installation](https://img.shields.io/badge/install-guide-blue?logo=terminal)](https://github.com/NVIDIA-NeMo/Export-Deploy?tab=readme-ov-file#-install)
diff --git a/docs/llm/automodel/optimized/automodel-trtllm.md b/docs/llm/automodel/optimized/automodel-trtllm.md
diff --git a/docs/llm/automodel/optimized/index.md b/docs/llm/automodel/optimized/index.md
@@ -1,12 +1,11 @@
 # Export and Deploy NeMo Automodel LLMs
 
-NeMo Export-Deploy library offers scripts and APIs to export [NeMo AutoModel](https://docs.nvidia.com/nemo/automodel/latest/index.html) models to two inference optimized libraries, TensorRT-LLM and vLLM, and to deploy the exported model with the NVIDIA Triton Inference Server. 
+NeMo Export-Deploy library offers scripts and APIs to export [NeMo AutoModel](https://docs.nvidia.com/nemo/automodel/latest/index.html) models to the vLLM inference optimized library, and to deploy the exported model with the NVIDIA Triton Inference Server. 
 
 ```{toctree}
 :maxdepth: 4
 :titlesonly:
 :hidden:
 
-Deploy TensorRT-LLM with Triton <automodel-trtllm.md>
 Deploy vLLM with Triton <automodel-vllm.md>
 ```
diff --git a/docs/llm/index.md b/docs/llm/index.md
@@ -1,6 +1,6 @@
 # Export and Deploy Large Language Models
 
-The Export-Deploy library provides comprehensive tools and APIs for exporting and deploying Large Language Models (LLMs) to production environments. This library supports multiple checkpoint formats and offers various deployment paths including TensorRT-LLM and vLLM deployment through NVIDIA Triton Inference Server and Ray Serve.
+The Export-Deploy library provides comprehensive tools and APIs for exporting and deploying Large Language Models (LLMs) to production environments. This library supports multiple checkpoint formats and offers various deployment paths including vLLM deployment through NVIDIA Triton Inference Server and Ray Serve.
 
 ## Overview
 
@@ -19,7 +19,6 @@ The library supports several checkpoint formats, each with specific capabilities
 - Model deployment with Triton and Ray Serve
 
 **Export and Deployment Paths Coming Soon:**
-- TensorRT-LLM export and deployment with Triton and Ray Serve
 - vLLM export and deployment with Triton and Ray Serve
 
 
@@ -29,7 +28,6 @@ The library supports several checkpoint formats, each with specific capabilities
 
 **Supported Export and Deployment Paths:**
 - Model deployment with Triton and Ray Serve
-- TensorRT-LLM export and deployment with Triton and Ray Serve
 - vLLM export and deployment with Triton and Ray Serve
 
 
diff --git a/docs/llm/mbridge/index.md b/docs/llm/mbridge/index.md
@@ -5,7 +5,6 @@ The [Megatron-Bridge](https://github.com/NVIDIA-NeMo/Megatron-Bridge) checkpoint
 With the Export-Deploy library, you can seamlessly export and deploy Megatron-Bridge checkpoints across a variety of production environments. The following export and deployment paths are supported for Megatron-Bridge models:
 
 - **Model deployment with Triton and Ray Serve:** Directly serve Megatron-Bridge models using NVIDIA Triton Inference Server or Ray Serve for scalable inference.
-- **TensorRT-LLM export and deployment with Triton and Ray Serve:** Convert Megatron-Bridge checkpoints into optimized TensorRT-LLM engines for high-performance inference, deployable via Triton or Ray Serve. Support for this feature is coming soon.
 - **vLLM export and deployment with Triton:** Export Megatron-Bridge models to the vLLM format for efficient serving with Triton. Support for this feature is coming soon.
 
 
diff --git a/docs/llm/mbridge/optimized/index.md b/docs/llm/mbridge/optimized/index.md
@@ -1,6 +1,6 @@
 # Deploy Megatron-Bridge LLMs by Exporting to Inference Optimized Libraries
 
-Export-Deploy supports optimizing and deploying Megatron-Bridge checkpoints using inference-optimized libraries such as vLLM and TensorRT-LLM.
+Export-Deploy supports optimizing and deploying Megatron-Bridge checkpoints using inference-optimized libraries such as vLLM.
 
 ```{toctree}
 :maxdepth: 1
@@ -9,5 +9,3 @@ Export-Deploy supports optimizing and deploying Megatron-Bridge checkpoints usin
 vLLM <vllm.md>
 ```
 
-**Note:** Support for exporting and deploying Megatron-Bridge models with TensorRT-LLM is coming soon. Please check back for updates.
-
diff --git a/docs/mm/index.md b/docs/mm/index.md
@@ -1,6 +1,6 @@
 # Export and Deploy Multimodal Models
 
-The Export-Deploy library provides comprehensive tools and APIs for exporting and deploying Multimodal Models (MMs) to production environments. This library supports multiple checkpoint formats and offers various deployment paths including TensorRT-LLM deployment through NVIDIA Triton Inference Server.
+The Export-Deploy library provides comprehensive tools and APIs for exporting and deploying Multimodal Models (MMs) to production environments. This library supports multiple checkpoint formats and offers various deployment paths through NVIDIA Triton Inference Server.
 
 ## Overview
 
@@ -17,7 +17,6 @@ The library supports several checkpoint formats, each with specific capabilities
 
 **Export and Deployment Paths Coming Soon:**
 - Model deployment with Triton and Ray Serve
-- TensorRT-LLM export and deployment with Triton and Ray Serve
 
 
 ### AutoModel Model/Checkpoints
@@ -26,7 +25,6 @@ The library supports several checkpoint formats, each with specific capabilities
 
 **Export and Deployment Paths Coming Soon:**
 - Model deployment with Triton
-- TensorRT-LLM export and deployment with Triton