NVIDIA
diff --git a/.clang-format b/.clang-format
Lines changed: 24 additions & 2 deletions
diff --git a/.github/ISSUE_TEMPLATE/bug_report.md b/.github/ISSUE_TEMPLATE/bug_report.md
Lines changed: 2 additions & 0 deletions
diff --git a/.github/workflows/blossom-ci.yml b/.github/workflows/blossom-ci.yml
Lines changed: 0 additions & 1 deletion
diff --git a/.github/workflows/label_issue.yml b/.github/workflows/label_issue.yml
Lines changed: 2 additions & 3 deletions
diff --git a/CHANGELOG.md b/CHANGELOG.md
Lines changed: 52 additions & 2 deletions
@@ -19,8 +19,30 @@ AlwaysBreakTemplateDeclarations: true
 BasedOnStyle: None
 BinPackArguments: true
 BinPackParameters: true
+# Almost the same as Allman style, but explicitly disabling BeforeLambdaBody
+# for backwards compatibility with clang-format-10 Allman style.
+# See also https://reviews.llvm.org/D44609
+BreakBeforeBraces: Custom
+BraceWrapping:
+  AfterCaseLabel:  true
+  AfterClass:      true
+  AfterControlStatement: Always
+  AfterEnum:       true
+  AfterFunction:   true
+  AfterNamespace:  true
+  AfterObjCDeclaration: true
+  AfterStruct:     true
+  AfterUnion:      true
+  AfterExternBlock: true
+  BeforeCatch:     true
+  BeforeElse:      true
+  BeforeLambdaBody: false
+  BeforeWhile:     false
+  IndentBraces:    false
+  SplitEmptyFunction: true
+  SplitEmptyRecord: true
+  SplitEmptyNamespace: true
 BreakBeforeBinaryOperators: All
-BreakBeforeBraces: Allman
 BreakBeforeTernaryOperators: true
 BreakConstructorInitializersBeforeComma: true
 ColumnLimit:     120
@@ -61,6 +83,7 @@ PenaltyExcessCharacter: 1000000
 PenaltyReturnTypeOnItsOwnLine: 60
 PointerAlignment: Left
 PointerBindsToType: false
+QualifierAlignment: Right
 ReflowComments:  true
 SortIncludes:    true
 SpaceAfterCStyleCast: true
@@ -77,4 +100,3 @@ Standard:        Cpp11
 StatementMacros: [API_ENTRY_TRY,TRT_TRY]
 TabWidth:        4
 UseTab:          Never
-...
 
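Reviewer note on the `.clang-format` hunks above: the sketch below shows roughly the layout these settings aim for, assuming clang-format honors them as documented. Braces wrap Allman-style after functions, namespaces, and control statements; `BeforeLambdaBody: false` keeps a lambda's opening brace on the introducer line; `QualifierAlignment: Right` places `const` after the type. The `demo` namespace and `countLong` function are hypothetical names used only for illustration and do not come from the TensorRT sources.

```cpp
#include <cstddef>
#include <string>
#include <vector>

namespace demo
{

// Hypothetical helper: counts the words whose length is at least minLen.
// QualifierAlignment: Right gives `T const&` rather than `const T&`.
int countLong(std::vector<std::string> const& words, std::size_t const minLen)
{
    // BeforeLambdaBody: false keeps the lambda's `{` on the introducer line,
    // unlike the function and control-statement braces around it.
    auto const isLong = [minLen](std::string const& w) {
        return w.size() >= minLen;
    };

    int count = 0;
    for (auto const& w : words)
    {
        if (isLong(w))
        {
            ++count;
        }
    }
    return count;
}

} // namespace demo
```

Whether a short lambda body is collapsed onto one line depends on other options in the file that this diff does not show, so the exact line breaks may differ.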
@@ -64,4 +64,6 @@ Baremetal or Container (if so, version):
 
 **Have you tried [the latest release](https://developer.nvidia.com/tensorrt)?**:
 
+**Attach the captured .json and .bin files from [TensorRT's API Capture tool](https://docs.nvidia.com/deeplearning/tensorrt/latest/inference-library/capture-replay.html) if you're on an x86_64 Unix system**
+
 **Can this model run on other frameworks?** For example run ONNX model with ONNXRuntime (`polygraphy run <model.onnx> --onnxrt`):
@@ -42,7 +42,6 @@ jobs:
         github.actor == 'rajeevsrao' ||
         github.actor == 'kevinch-nv' ||
         github.actor == 'ttyio' ||
-        github.actor == 'samurdhikaru' ||
         github.actor == 'zerollzeng' ||
         github.actor == 'nvpohanh' ||
         github.actor == 'poweiw'
 
@@ -15,10 +15,9 @@ jobs:
       - name: Checkout private action repository
         uses: actions/checkout@v4
         with:
-          repository: poweiw/goggles_action
+          repository: NVIDIA/goggles_action
           path: ./.github/actions/goggles_action # local path to store the action
-          token: ${{ secrets.ACTION_REPO_PAT }} # token to access poweiw/goggles_action
-          ref: v1.2.1
+          ref: v1.3.0
 
       - name: AI Label Issue
         uses: ./.github/actions/goggles_action/actions/llm_label
 
@@ -1,11 +1,62 @@
 # TensorRT OSS Release Changelog
+## 10.16 GA - 2026-3-24
+
+- General
+  - Default CUDA version updated to CUDA 13.2.
+
+- Samples
+  - Added sampleDistCollective sample to showcase multi-device execution in TensorRT.
+
+- Parsers
+  - Added kADJUST_FOR_DLA flag, which adjusts ONNX parsing behavior so that models are more amenable to execution on DLA hardware.
+  - Added DistCollective operator support for multi-device execution in TensorRT.
+
+## 10.15 GA - 2026-2-2
+
+- Sample changes
+  - Added two safety samples, sampleSafeMNIST and sampleSafePluginV3, to demonstrate how to use TensorRT with the safety workflow.
+  - Added trtSafeExec to accompany the safety workflow release.
+  - Added python/stream_writer to showcase how to serialize a TensorRT engine directly to a custom stream using the IStreamWriter interface, rather than writing to a file or a contiguous memory buffer.
+  - Added python/strongly_type_autocast to demonstrate how to convert FP32 ONNX models to mixed precision (FP32-FP16) using ModelOpt's AutoCast tool and then build the engine with TensorRT's Strong Typing mode.
+  - Added sampleCudla to demonstrate how to use the cuDLA API to run TensorRT engines on the Deep Learning Accelerator (DLA) hardware, which is available on NVIDIA Jetson and DRIVE platforms.
+  - Deprecated sampleCharRNN.
+
+- Plugin changes
+  - Deprecated bertQKVToContextPlugin; it will be removed in a future release. No alternative is planned.
+
+- Parser changes
+  - Added support for `RotaryEmbedding`, `RMSNormalization` and `TensorScatter` to improve LLM support.
+  - Added more specialized quantization ops for models quantized through TensorRT ModelOptimizer.
+  - Added `kREPORT_CAPABILITY_DLA` flag to enable per-node validation when building DLA engines through TensorRT.
+  - Added `kENABLE_PLUGIN_OVERRIDE` flag to enable TensorRT plugin override for nodes that share names with user plugins.
+  - Improved error reporting for models with multiple subgraphs, such as `Loop` or `Scan` nodes.
+
+- Demo changes
+  - demoDiffusion: Stable Diffusion 1.5, 2.0 and 2.1 pipelines have been deprecated and removed.
+  - Added support for the Wan2.2-T2V-A14B text-to-video pipeline.
+
+## 10.14 GA - 2025-11-7
+- Sample changes
+  - Replaced all pycuda usage with cuda-python APIs
+  - Removed the efficientnet samples
+  - Deprecated tensorflow_object_detection and efficientdet samples
+  - Samples will no longer be released with the packages. The TensorRT GitHub repository will be the single source.
+
+
+- Parsers:
+  - Added support for the `Attention` operator
+  - Improved refit for `ConstantOfShape` nodes
+
+- Demos
+  - demoDiffusion:
+    - Added support for the Cosmos-Predict2 text2image and video2world pipelines
+
 
 ## 10.13.3 GA - 2025-9-8
 - Added support for TensorRT API Capture and Replay feature, see the [developer guide](https://docs.nvidia.com/deeplearning/tensorrt/latest/inference-library/advanced.html) for more information.
 - Demo changes
   - Added support for Flux Kontext pipeline.
 
-
 ## 10.13.2 GA - 2025-8-18
 - Added support for CUDA 13.0, dropped support for CUDA 11.X
 - Dropped support for Ubuntu 20.04
@@ -24,7 +75,6 @@
   - Added `loadModelProto`, `loadInitializer` and `refitModelProto` APIs for IParserRefitter. These APIs are meant to be used to load user initializers when refitting ONNX models.
   - Deprecated `IParser::parseWithWeightDescriptors`.
 
-
 ## 10.12.0 GA - 2025-6-10
 - Plugin changes
   - Migrated `IPluginV2`-descendent version 1 of `cropAndResizeDynamic`, to version 2, which implements `IPluginV3`.