Commit 7cb77e5

Remove trt-llm from docs
Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>
1 parent d7ee9b0 commit 7cb77e5

7 files changed

Lines changed: 5 additions & 315 deletions

README.md

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@
 [![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/release/python-3100/)
 [![GitHub Stars](https://img.shields.io/github/stars/NVIDIA-NeMo/Export-Deploy.svg?style=social&label=Star)](https://github.com/NVIDIA-NeMo/Export-Deploy/stargazers/)

-<!-- **Library with tooling and APIs for exporting and deploying NeMo and Hugging Face models with support of backends like TensorRT, TensorRT-LLM and vLLM through NVIDIA Triton Inference Server.** -->
+<!-- **Library with tooling and APIs for exporting and deploying NeMo and Hugging Face models with support of backends like TensorRT and vLLM through NVIDIA Triton Inference Server.** -->

 [![📖 Documentation](https://img.shields.io/badge/docs-nvidia-informational?logo=book)](https://docs.nvidia.com/nemo/export-deploy/latest/index.html)
 [![🔧 Installation](https://img.shields.io/badge/install-guide-blue?logo=terminal)](https://github.com/NVIDIA-NeMo/Export-Deploy?tab=readme-ov-file#-install)

docs/llm/automodel/optimized/automodel-trtllm.md

Lines changed: 0 additions & 302 deletions
This file was deleted.
Lines changed: 1 addition & 2 deletions
@@ -1,12 +1,11 @@
 # Export and Deploy NeMo Automodel LLMs

-NeMo Export-Deploy library offers scripts and APIs to export [NeMo AutoModel](https://docs.nvidia.com/nemo/automodel/latest/index.html) models to two inference optimized libraries, TensorRT-LLM and vLLM, and to deploy the exported model with the NVIDIA Triton Inference Server.
+NeMo Export-Deploy library offers scripts and APIs to export [NeMo AutoModel](https://docs.nvidia.com/nemo/automodel/latest/index.html) models to the vLLM inference optimized library, and to deploy the exported model with the NVIDIA Triton Inference Server.

 ```{toctree}
 :maxdepth: 4
 :titlesonly:
 :hidden:

-Deploy TensorRT-LLM with Triton <automodel-trtllm.md>
 Deploy vLLM with Triton <automodel-vllm.md>
 ```

docs/llm/index.md

Lines changed: 1 addition & 3 deletions
@@ -1,6 +1,6 @@
 # Export and Deploy Large Language Models

-The Export-Deploy library provides comprehensive tools and APIs for exporting and deploying Large Language Models (LLMs) to production environments. This library supports multiple checkpoint formats and offers various deployment paths including TensorRT-LLM and vLLM deployment through NVIDIA Triton Inference Server and Ray Serve.
+The Export-Deploy library provides comprehensive tools and APIs for exporting and deploying Large Language Models (LLMs) to production environments. This library supports multiple checkpoint formats and offers various deployment paths including vLLM deployment through NVIDIA Triton Inference Server and Ray Serve.

 ## Overview

@@ -19,7 +19,6 @@ The library supports several checkpoint formats, each with specific capabilities
 - Model deployment with Triton and Ray Serve

 **Export and Deployment Paths Coming Soon:**
-- TensorRT-LLM export and deployment with Triton and Ray Serve
 - vLLM export and deployment with Triton and Ray Serve


@@ -29,7 +28,6 @@ The library supports several checkpoint formats, each with specific capabilities

 **Supported Export and Deployment Paths:**
 - Model deployment with Triton and Ray Serve
-- TensorRT-LLM export and deployment with Triton and Ray Serve
 - vLLM export and deployment with Triton and Ray Serve


docs/llm/mbridge/index.md

Lines changed: 0 additions & 1 deletion
@@ -5,7 +5,6 @@ The [Megatron-Bridge](https://github.com/NVIDIA-NeMo/Megatron-Bridge) checkpoint
 With the Export-Deploy library, you can seamlessly export and deploy Megatron-Bridge checkpoints across a variety of production environments. The following export and deployment paths are supported for Megatron-Bridge models:

 - **Model deployment with Triton and Ray Serve:** Directly serve Megatron-Bridge models using NVIDIA Triton Inference Server or Ray Serve for scalable inference.
-- **TensorRT-LLM export and deployment with Triton and Ray Serve:** Convert Megatron-Bridge checkpoints into optimized TensorRT-LLM engines for high-performance inference, deployable via Triton or Ray Serve. Support for this feature is coming soon.
 - **vLLM export and deployment with Triton:** Export Megatron-Bridge models to the vLLM format for efficient serving with Triton. Support for this feature is coming soon.

Lines changed: 1 addition & 3 deletions
@@ -1,6 +1,6 @@
 # Deploy Megatron-Bridge LLMs by Exporting to Inference Optimized Libraries

-Export-Deploy supports optimizing and deploying Megatron-Bridge checkpoints using inference-optimized libraries such as vLLM and TensorRT-LLM.
+Export-Deploy supports optimizing and deploying Megatron-Bridge checkpoints using inference-optimized libraries such as vLLM.

 ```{toctree}
 :maxdepth: 1
@@ -9,5 +9,3 @@ Export-Deploy supports optimizing and deploying Megatron-Bridge checkpoints usin
 vLLM <vllm.md>
 ```

-**Note:** Support for exporting and deploying Megatron-Bridge models with TensorRT-LLM is coming soon. Please check back for updates.
-
docs/mm/index.md

Lines changed: 1 addition & 3 deletions
@@ -1,6 +1,6 @@
 # Export and Deploy Multimodal Models

-The Export-Deploy library provides comprehensive tools and APIs for exporting and deploying Multimodal Models (MMs) to production environments. This library supports multiple checkpoint formats and offers various deployment paths including TensorRT-LLM deployment through NVIDIA Triton Inference Server.
+The Export-Deploy library provides comprehensive tools and APIs for exporting and deploying Multimodal Models (MMs) to production environments. This library supports multiple checkpoint formats and offers various deployment paths through NVIDIA Triton Inference Server.

 ## Overview

@@ -17,7 +17,6 @@ The library supports several checkpoint formats, each with specific capabilities

 **Export and Deployment Paths Coming Soon:**
 - Model deployment with Triton and Ray Serve
-- TensorRT-LLM export and deployment with Triton and Ray Serve


 ### AutoModel Model/Checkpoints
@@ -26,7 +25,6 @@ The library supports several checkpoint formats, each with specific capabilities

 **Export and Deployment Paths Coming Soon:**
 - Model deployment with Triton
-- TensorRT-LLM export and deployment with Triton

