Commit 6ecde05

Merge branch 'NVIDIA:main' into support-custom-ort-opt
2 parents 7a0f9b6 + 52399f5 commit 6ecde05

513 files changed

Lines changed: 25881 additions & 6898 deletions

.clang-format

Lines changed: 24 additions & 2 deletions
@@ -19,8 +19,30 @@ AlwaysBreakTemplateDeclarations: true
 BasedOnStyle: None
 BinPackArguments: true
 BinPackParameters: true
+# Almost the same as Allman style, but explicitly disabling BeforeLambdaBody
+# for backwards compatibility with clang-format-10 Allman style.
+# See also https://reviews.llvm.org/D44609
+BreakBeforeBraces: Custom
+BraceWrapping:
+  AfterCaseLabel: true
+  AfterClass: true
+  AfterControlStatement: Always
+  AfterEnum: true
+  AfterFunction: true
+  AfterNamespace: true
+  AfterObjCDeclaration: true
+  AfterStruct: true
+  AfterUnion: true
+  AfterExternBlock: true
+  BeforeCatch: true
+  BeforeElse: true
+  BeforeLambdaBody: false
+  BeforeWhile: false
+  IndentBraces: false
+  SplitEmptyFunction: true
+  SplitEmptyRecord: true
+  SplitEmptyNamespace: true
 BreakBeforeBinaryOperators: All
-BreakBeforeBraces: Allman
 BreakBeforeTernaryOperators: true
 BreakConstructorInitializersBeforeComma: true
 ColumnLimit: 120
@@ -61,6 +83,7 @@ PenaltyExcessCharacter: 1000000
 PenaltyReturnTypeOnItsOwnLine: 60
 PointerAlignment: Left
 PointerBindsToType: false
+QualifierAlignment: Right
 ReflowComments: true
 SortIncludes: true
 SpaceAfterCStyleCast: true
@@ -77,4 +100,3 @@ Standard: Cpp11
 StatementMacros: [API_ENTRY_TRY,TRT_TRY]
 TabWidth: 4
 UseTab: Never
-...
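
To make the effect of the `.clang-format` change concrete, here is a minimal sketch of C++ laid out the way these options are expected to format it. The namespace and function names (`demo`, `sumOfSquares`, `sign`, `digitCount`) are hypothetical and do not come from the TensorRT sources. The exact output also depends on the clang-format version (`QualifierAlignment` needs clang-format 14 or newer) and on options elsewhere in the file that this diff does not show.

```cpp
#include <cassert>
#include <numeric>
#include <string>
#include <vector>

// AfterNamespace: true puts the namespace's opening brace on its own line.
namespace demo
{

// QualifierAlignment: Right moves `const` to the right of the type ("east const"),
// so `const std::vector<int>&` is rewritten as `std::vector<int> const&`.
// AfterFunction: true keeps the Allman-style brace on its own line.
int sumOfSquares(std::vector<int> const& values)
{
    // BeforeLambdaBody: false keeps the lambda's opening brace on the same line,
    // which is the one place where the Custom style departs from wrapping every brace.
    auto const addSquare = [](int acc, int v) {
        return acc + v * v;
    };
    return std::accumulate(values.begin(), values.end(), 0, addSquare);
}

// AfterControlStatement: Always and BeforeElse: true wrap the braces of `if`
// and place `else` on its own line. PointerAlignment: Left gives `char const*`.
char const* sign(int x)
{
    if (x < 0)
    {
        return "negative";
    }
    else
    {
        return "non-negative";
    }
}

// BeforeWhile: false keeps `while` on the same line as the closing brace of a do-block.
int digitCount(int n)
{
    assert(n >= 0); // this sketch only handles non-negative input
    int digits = 0;
    do
    {
        ++digits;
        n /= 10;
    } while (n > 0);
    return digits;
}

} // namespace demo
```

Running clang-format with this configuration over differently formatted input, for example a `const int&` parameter or a lambda whose brace sits on its own line, should converge on a layout like the one above.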

.github/ISSUE_TEMPLATE/bug_report.md

Lines changed: 2 additions & 0 deletions
@@ -64,4 +64,6 @@ Baremetal or Container (if so, version):
 
 **Have you tried [the latest release](https://developer.nvidia.com/tensorrt)?**:
 
+**Attach the captured .json and .bin files from [TensorRT's API Capture tool](https://docs.nvidia.com/deeplearning/tensorrt/latest/inference-library/capture-replay.html) if you're on an x86_64 Unix system**
+
 **Can this model run on other frameworks?** For example run ONNX model with ONNXRuntime (`polygraphy run <model.onnx> --onnxrt`):

.github/workflows/blossom-ci.yml

Lines changed: 0 additions & 1 deletion
@@ -42,7 +42,6 @@ jobs:
 github.actor == 'rajeevsrao' ||
 github.actor == 'kevinch-nv' ||
 github.actor == 'ttyio' ||
-github.actor == 'samurdhikaru' ||
 github.actor == 'zerollzeng' ||
 github.actor == 'nvpohanh' ||
 github.actor == 'poweiw'

.github/workflows/label_issue.yml

Lines changed: 2 additions & 3 deletions
@@ -15,10 +15,9 @@ jobs:
 - name: Checkout private action repository
   uses: actions/checkout@v4
   with:
-    repository: poweiw/goggles_action
+    repository: NVIDIA/goggles_action
     path: ./.github/actions/goggles_action # local path to store the action
-    token: ${{ secrets.ACTION_REPO_PAT }} # token to access poweiw/goggles_action
-    ref: v1.2.1
+    ref: v1.3.0
 
 - name: AI Label Issue
   uses: ./.github/actions/goggles_action/actions/llm_label

CHANGELOG.md

Lines changed: 52 additions & 2 deletions
@@ -1,11 +1,62 @@
 # TensorRT OSS Release Changelog
+## 10.16 GA - 2026-3-24
+
+- General
+  - Default CUDA version updated to CUDA 13.2.
+
+- Samples
+  - Added sampleDistCollective sample to showcase multi-device execution in TensorRT.
+
+- Parsers
+  - Added kADJUST_FOR_DLA flag to adjust parsing behavior for ONNX models to be more amenable for DLA hardware execution.
+  - Added DistCollective operator support for multi-device execution in TensorRT.
+
+## 10.15 GA - 2026-2-2
+
+- Sample changes
+  - Added 2 safety samples sampleSafeMNIST, and sampleSafePluginV3 to demonstrate how to use TensorRT with the safety workflow.
+  - Added trtSafeExec to accompany the safety workflow release.
+  - Added python/stream_writer to showcase how to serialize a TensorRT engine directly to a custom stream using the IStreamWriter interface, rather than writing to a file or a contiguous memory buffer.
+  - Added python/strongly_type_autocast to demonstrate how to convert FP32 ONNX models to mixed precision (FP32-FP16) using ModelOpt's AutoCast tool and subsequently building the engine with TensorRT's Strong Typing mode.
+  - Added sampleCudla to demonstrate how to use the cuDLA API to run TensorRT engines on the Deep Learning Accelerator (DLA) hardware, which is available on NVIDIA Jetson and DRIVE platforms.
+  - Deprecated sampleCharRNN.
+
+- Plugin changes
+  - Deprecated bertQKVToContextPlugin and will be removed in a future release. No alternatives are planned to be provided.
+
+- Parser changes
+  - Added support for `RotaryEmbedding`, `RMSNormalization` and `TensorScatter` for improved LLM model support
+  - Added more specialized quantization ops for models quantized through TensorRT ModelOptimizer.
+  - Added `kREPORT_CAPABILITY_DLA` flag to enable per-node validation when building DLA engines through TensorRT.
+  - Added `kENABLE_PLUGIN_OVERRIDE` flag to enable TensorRT plugin override for nodes that share names with user plugins.
+  - Improved error reporting for models with multiple subgraphs, such as `Loop` or `Scan` nodes.
+
+- Demo changes
+  - demoDiffusion: Stable Diffusion 1.5, 2.0 and 2.1 pipelines have been deprecated and removed.
+  - Added support for Wan2.2-T2V-A14B Text to Video pipeline
+
+## 10.14 GA - 2025-11-7
+- Sample changes
+  - Replace all pycuda usages with cuda-python APIs
+  - Removed the efficientnet samples
+  - Deprecated tensorflow_object_detection and efficientdet samples
+  - Samples will no longer be released with the packages. The TensorRT GitHub repository will be the single source.
+
+
+- Parsers:
+  - Added support for the `Attention` operator
+  - Improved refit for `ConstantOfShape` nodes
+
+- Demos
+  - demoDiffusion:
+    - Added support for the Cosmos-Predict2 text2image and video2world pipelines
+
 
 ## 10.13.3 GA - 2025-9-8
 - Added support for TensorRT API Capture and Replay feature, see the [developer guide](https://docs.nvidia.com/deeplearning/tensorrt/latest/inference-library/advanced.html) for more information.
 - Demo changes
   - Added support for Flux Kontext pipeline.
 
-
 ## 10.13.2 GA - 2025-8-18
 - Added support for CUDA 13.0, dropped support for CUDA 11.X
 - Dropped support for Ubuntu 20.04
@@ -24,7 +75,6 @@
 - Added `loadModelProto`, `loadInitializer` and `refitModelProto` APIs for IParserRefitter. These APIs are meant to be used to load user initializers when refitting ONNX models.
 - Deprecated `IParser::parseWithWeightDescriptors`.
 
-
 ## 10.12.0 GA - 2025-6-10
 - Plugin changes
   - Migrated `IPluginV2`-descendent version 1 of `cropAndResizeDynamic`, to version 2, which implements `IPluginV3`.
