
[TRTLLM-13247][feat] Wave 2: stage Linear and Attention transforms#15288

Open
chienchunhung wants to merge 2 commits into NVIDIA:main from chienchunhung:codex/staged-hooks-wave2-transform-weights

Conversation

@chienchunhung chienchunhung commented Jun 12, 2026

Collaborator

Summary

Wave 2 of the staged post-load hooks rollout, stacked on #15014.

This migrates the remaining Linear and MLA tensor-layout post-load work into transform_weights() with _weights_transformed guards, while keeping post_load_weights() as the backward-compatible shim for existing full post-load walks.

What Changed

  • Added Linear.transform_weights() and a quant-method-level transform_weights() hook, with post_load_weights() delegating through the staged hook.
  • Moved FP8 block-scale resmoothing, NVFP4 padding, and W4A16 NVFP4 scale unswizzling from Linear post_load_weights() implementations into transform_weights().
  • Added _weights_transformed state for Linear and MLA, reset when fresh Linear weights or auxiliary MLA weight tensors are created/loaded.
  • Moved MLA SM120 FP8 resmoothing into MLA.transform_weights() and kept MLA.post_load_weights() as a shim.
  • Clarified the GMS RO documentation: RO readers run setup_aliases(), materialize_module(), then cache_derived_state(); writer-only tensor layout changes belong in transform_weights().
  • Updated/added pyexecutor unit coverage for transform idempotency and the GMS RW source_identity call shape.
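The staged-hook shape described in the list above can be sketched as follows. This is a minimal, framework-free illustration and not the code in linear.py: `SketchLinear`, `SketchQuantMethod`, `load_weights()`, and the `calls` counter are hypothetical names, and the doubling step merely stands in for a real layout change. Only `transform_weights()`, `post_load_weights()`, and `_weights_transformed` come from the PR description.

```python
class SketchQuantMethod:
    """Hypothetical stand-in for a quant-method object with a staged hook."""

    def __init__(self) -> None:
        self.calls = 0  # counts how often the layout transform actually runs

    def transform_weights(self, module: "SketchLinear") -> None:
        # Stand-in for a one-time layout change such as padding or resmoothing.
        self.calls += 1
        module.weight = [w * 2 for w in module.weight]


class SketchLinear:
    """Hypothetical module showing the guard and the backward-compatible shim."""

    def __init__(self, quant_method: SketchQuantMethod) -> None:
        self.quant_method = quant_method
        self.weight: list = []
        self._weights_transformed = False

    def load_weights(self, values: list) -> None:
        # Fresh weights invalidate any earlier transform, so reset the flag.
        self.weight = list(values)
        self._weights_transformed = False

    def transform_weights(self) -> None:
        # Guard: a second call on the same weights must be a no-op.
        if self._weights_transformed:
            return
        self.quant_method.transform_weights(self)
        self._weights_transformed = True

    def post_load_weights(self) -> None:
        # Shim: existing full post-load walks still reach the staged hook.
        self.transform_weights()


layer = SketchLinear(SketchQuantMethod())
layer.load_weights([1.0, 2.0])
layer.post_load_weights()
layer.transform_weights()  # guarded, so nothing happens the second time
```

The guard matters because a layout transform is generally not safe to apply twice to the same tensor; resetting the flag on load keeps a reload from silently skipping the transform.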

Dependency / prerequisite stack

This PR is Wave 2 in the staged post-load hooks rollout. The foundation PRs #14770 and #14878 are already merged. The wave PRs should merge in sequence; after each upstream wave lands, rebase the next wave onto main so review and CI focus on that wave's delta.

Arrows point from prerequisite to dependent. PR numbers in graph nodes are clickable.

graph TD
    PR14770["<a href='https://github.com/NVIDIA/TensorRT-LLM/pull/14770'>#14770</a>: staged-hook contract (merged)"]
    PR14878["<a href='https://github.com/NVIDIA/TensorRT-LLM/pull/14878'>#14878</a>: GMS SourceIdentity gate (merged)"]
    PR15014["<a href='https://github.com/NVIDIA/TensorRT-LLM/pull/15014'>#15014</a>: Wave 1 aliases + GMS RO load (open)"]
    PR15288["<a href='https://github.com/NVIDIA/TensorRT-LLM/pull/15288'>#15288</a>: Wave 2 Linear/Attention transforms (this PR, draft)"]
    PR15386["<a href='https://github.com/NVIDIA/TensorRT-LLM/pull/15386'>#15386</a>: Wave 3 MoE/Mamba staged hooks (draft)"]
    PR15387["<a href='https://github.com/NVIDIA/TensorRT-LLM/pull/15387'>#15387</a>: Wave 4 MX receiver cutover (draft)"]
    PR15432["<a href='https://github.com/NVIDIA/TensorRT-LLM/pull/15432'>#15432</a>: Wave 5 MX publisher + Llama receiver (draft)"]
    VERIFY["post-migration verification / demo (planned)"]

    PR14770 -->|satisfied| PR15014
    PR14878 -->|satisfied| PR15014
    PR15014 -->|blocking| PR15288
    PR15288 -->|blocking| PR15386
    PR15386 -->|blocking| PR15387
    PR15387 -->|blocking| PR15432
    PR15432 -.->|planned| VERIFY

    classDef merged fill:#dcfce7,stroke:#16a34a,color:#14532d;
    classDef inflight fill:#dbeafe,stroke:#2563eb,color:#1e3a8a;
    classDef draft fill:#ffedd5,stroke:#f97316,color:#7c2d12;
    classDef current fill:#ede9fe,stroke:#7c3aed,color:#3b0764,stroke-width:3px;
    classDef downstream fill:#f3f4f6,stroke:#6b7280,color:#374151,stroke-dasharray:5 5;
    linkStyle 0,1 stroke:#16a34a,stroke-width:2px;
    linkStyle 2,3,4,5 stroke:#ea580c,stroke-width:3px;
    linkStyle 6 stroke:#6b7280,stroke-width:2px,stroke-dasharray:5 5;

    class PR14770,PR14878 merged;
    class PR15014 inflight;
    class PR15386,PR15387,PR15432 draft;
    class PR15288 current;
    class VERIFY downstream;

Immediate merge dependency for this PR: #15014 must land first; after it lands, rebase this branch onto main so the PR diff collapses to the Wave 2 delta.
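As a small illustration of the merge sequence implied by the graph above, the prerequisite edges can be fed to Python's standard-library `graphlib.TopologicalSorter`. The dictionary below is transcribed from the graph edges; the variable names are only for this sketch.

```python
from graphlib import TopologicalSorter

# Each key maps a PR to the set of PRs that must merge before it,
# copied from the edges of the dependency graph.
prerequisites = {
    "#15014": {"#14770", "#14878"},
    "#15288": {"#15014"},
    "#15386": {"#15288"},
    "#15387": {"#15386"},
    "#15432": {"#15387"},
}

# static_order() yields every PR after all of its prerequisites.
merge_order = list(TopologicalSorter(prerequisites).static_order())
position = {pr: index for index, pr in enumerate(merge_order)}
```

Any valid order places the two merged foundation PRs first (in either order) and then the waves as a single chain, which is why each wave is rebased onto main only after the one before it lands.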

Test Plan

  • PYTHONPYCACHEPREFIX=/tmp/trtllm-wave2-pycache python3 -m py_compile tensorrt_llm/_torch/modules/linear.py tensorrt_llm/_torch/modules/attention.py tensorrt_llm/_torch/memory/gpu_memory_backend.py tests/unittest/_torch/pyexecutor/test_model_loader_mx.py tests/unittest/_torch/pyexecutor/test_model_loader_gms.py
  • git diff --check
  • PATH=/Users/chienchunh/.cache/codex-runtimes/codex-primary-runtime/dependencies/python/bin:$PATH pre-commit run --files tensorrt_llm/_torch/memory/gpu_memory_backend.py tensorrt_llm/_torch/modules/attention.py tensorrt_llm/_torch/modules/linear.py tests/unittest/_torch/pyexecutor/test_model_loader_gms.py tests/unittest/_torch/pyexecutor/test_model_loader_mx.py
  • Focused pytest command attempted but blocked in this macOS shell because transformers is not installed: PYTHONPATH=. PYTHONPYCACHEPREFIX=/tmp/trtllm-wave2-pycache pytest tests/unittest/_torch/pyexecutor/test_model_loader_mx.py tests/unittest/_torch/pyexecutor/test_model_loader_gms.py

Next Steps

Summary by CodeRabbit

Release Notes

  • Bug Fixes

    • Fixed weight initialization sequencing in read-only weight-sharing models to ensure structural aliases are properly configured before tensor materialization.

  • Refactor

    • Restructured weight loading and transformation pipeline with enhanced state tracking and idempotency checks to prevent redundant transformations.
    • Updated and consolidated weight initialization hooks across multiple model implementations for improved lifecycle management and consistency.

@chienchunhung chienchunhung force-pushed the codex/staged-hooks-wave2-transform-weights branch from 379b212 to d309240 Compare June 12, 2026 01:10

chienchunhung commented Jun 12, 2026

Collaborator Author

Superseded stack note: this branch was rebased again after the initial draft PR setup. The current stack is now documented in #15288 (comment).

@chienchunhung

Collaborator Author

/bot run --disable-fail-fast

@tensorrt-cicd

Collaborator

PR_Github #53738 [ run ] triggered by Bot. Commit: d309240 Link to invocation

@tensorrt-cicd

Collaborator

PR_Github #53738 [ run ] completed with state SUCCESS. Commit: d309240
/LLM/main/L0_MergeRequest_PR pipeline #42864 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

@chienchunhung chienchunhung force-pushed the codex/staged-hooks-wave2-transform-weights branch from d309240 to 67267df Compare June 12, 2026 18:01

Collaborator Author

/bot run --disable-fail-fast

chienchunhung commented Jun 12, 2026

Collaborator Author

CI investigation update: the failed build 42864 ran on old head d309240d0b and old base 2dd5c67358. The dominant Ray failures were caused by the upstream/main regression from #14970: BaseWorker.reset_prefix_cache() conflicted with rlhf_utils.WorkerExtension.reset_prefix_cache(), producing ValueError: Worker class RayGPUWorker already defines 'reset_prefix_cache', which conflicts with extension WorkerExtension.

upstream/main has since reverted that regression in #15306 (db7161b675), so this branch has been rebased and force-pushed onto the latest main.

A fresh /bot run --disable-fail-fast was requested in #15288 (comment).
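For readers unfamiliar with this class of failure, the conflict described in this comment can be reproduced with a short, self-contained sketch. The `check_extension_conflicts()` helper is hypothetical and is not the actual check in the Ray worker code; the class bodies are empty stand-ins, and only the class names, the method name, and the error wording are taken from the comment above.

```python
def check_extension_conflicts(worker_cls: type, extension_cls: type) -> None:
    """Hypothetical guard: reject extensions that redefine worker methods."""
    for name, attr in vars(extension_cls).items():
        # Ignore dunder entries and non-callable class attributes.
        if name.startswith("__") or not callable(attr):
            continue
        # hasattr() also sees methods inherited from base classes.
        if hasattr(worker_cls, name):
            raise ValueError(
                f"Worker class {worker_cls.__name__} already defines '{name}', "
                f"which conflicts with extension {extension_cls.__name__}"
            )


class BaseWorker:
    def reset_prefix_cache(self) -> None:
        """Stand-in for the method added on the base worker."""


class RayGPUWorker(BaseWorker):
    """Stand-in worker that inherits reset_prefix_cache()."""


class WorkerExtension:
    def reset_prefix_cache(self) -> None:
        """Stand-in for the extension method with the same name."""


try:
    check_extension_conflicts(RayGPUWorker, WorkerExtension)
    conflict = None
except ValueError as error:
    conflict = str(error)
```

Because the lookup follows inheritance, adding a method to a base class is enough to collide with an extension that already used that name, even though the worker subclass itself did not change.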

@tensorrt-cicd

Collaborator

PR_Github #53933 [ run ] triggered by Bot. Commit: 67267df Link to invocation

@tensorrt-cicd

Collaborator

PR_Github #53933 [ run ] completed with state FAILURE. Commit: 67267df
/LLM/main/L0_MergeRequest_PR pipeline #43026 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

Collaborator Author

/bot run --disable-fail-fast --stage-list "DGX_B200-4_GPUs-PyTorch-Ray-1, DGX_B200-8_GPUs-PyTorch-1"

@tensorrt-cicd

Collaborator

PR_Github #53990 [ run ] triggered by Bot. Commit: 67267df Link to invocation

@tensorrt-cicd

Collaborator

PR_Github #53990 [ run ] completed with state SUCCESS. Commit: 67267df
/LLM/main/L0_MergeRequest_PR pipeline #43076 (Partly Tested) completed with status: 'SUCCESS'

CI Report

Link to invocation

@chienchunhung

Collaborator Author

/bot run --disable-fail-fast

@tensorrt-cicd

Collaborator

PR_Github #54020 [ run ] triggered by Bot. Commit: 67267df Link to invocation

@tensorrt-cicd

Collaborator

PR_Github #54020 [ run ] completed with state FAILURE. Commit: 67267df
/LLM/main/L0_MergeRequest_PR pipeline #43104 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

Collaborator Author

/bot run --disable-fail-fast --stage-list "SBSA-Linux"

@tensorrt-cicd

Collaborator

PR_Github #54145 [ run ] triggered by Bot. Commit: 67267df Link to invocation

@tensorrt-cicd

Collaborator

PR_Github #54145 [ run ] completed with state FAILURE. Commit: 67267df
/LLM/main/L0_MergeRequest_PR pipeline #43228 (Partly Tested) completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

@chienchunhung chienchunhung force-pushed the codex/staged-hooks-wave2-transform-weights branch from 67267df to bfebf3a Compare June 15, 2026 17:13

Collaborator Author

/bot run

@tensorrt-cicd

Collaborator

PR_Github #54338 [ run ] triggered by Bot. Commit: bfebf3a Link to invocation

@chienchunhung chienchunhung changed the title from "[TRTLLM-13246][feat] Wave 2: stage Linear and Attention transforms" to "[TRTLLM-13247][feat] Wave 2: stage Linear and Attention transforms" Jun 15, 2026

@tensorrt-cicd

Collaborator

PR_Github #54338 [ run ] completed with state SUCCESS. Commit: bfebf3a
/LLM/main/L0_MergeRequest_PR pipeline #43409 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

Collaborator Author

/bot run --disable-fail-fast --stage-list "DGX_B200-PyTorch-1"

@tensorrt-cicd

Collaborator

PR_Github #54369 [ run ] triggered by Bot. Commit: bfebf3a Link to invocation

@tensorrt-cicd

Collaborator

PR_Github #54369 [ run ] completed with state SUCCESS. Commit: bfebf3a
/LLM/main/L0_MergeRequest_PR pipeline #43439 (Partly Tested) completed with status: 'SUCCESS'

CI Report

Link to invocation

@chienchunhung

Collaborator Author

/bot run --disable-fail-fast

@coderabbitai

coderabbitai Bot commented Jun 22, 2026

Contributor

Walkthrough

The PR splits the post_load_weights lifecycle hook into two distinct hooks: setup_aliases (structural tensor aliasing, must run before GMS materialization) and transform_weights (idempotent weight transformations, guarded by _weights_transformed). Seven model classes rename their hook; Linear and MLA gain idempotency flags; the GMS RO load pipeline is re-ordered accordingly.

Changes

Lifecycle Hook Refactor

| Layer / File(s) | Summary |
| --- | --- |
| transform_weights idempotency in LinearMethodBase, Linear, MLA<br>tensorrt_llm/_torch/modules/linear.py, tensorrt_llm/_torch/modules/attention.py | LinearMethodBase adds transform_weights (no-op) and routes post_load_weights through it. Quant subclasses (FP8BlockScalesLinearMethod, NVFP4LinearMethod, W4A16NVFP4LinearMethod) override transform_weights instead of post_load_weights. Linear adds _weights_transformed flag reset on init/create/load and implements transform_weights with idempotency guard. MLA similarly extracts resmooth logic into transform_weights with a _weights_transformed guard. |
| post_load_weights → setup_aliases rename across model classes<br>tensorrt_llm/_torch/models/modeling_deepseekv3.py, tensorrt_llm/_torch/models/modeling_exaone_moe.py, tensorrt_llm/_torch/models/modeling_glm.py, tensorrt_llm/_torch/models/modeling_gpt_oss.py, tensorrt_llm/_torch/models/modeling_llama.py, tensorrt_llm/_torch/models/modeling_qwen3_moe.py, tensorrt_llm/_torch/models/modeling_qwen3_next.py | Eight model classes rename their layer-norm aliasing hook from post_load_weights to setup_aliases with no logic changes. Related constructor comments in Llama decoder layers and models are updated to reference setup_aliases. |
| GMS RO pipeline re-ordering and _setup_aliases recursive walk<br>tensorrt_llm/_torch/pyexecutor/model_loader.py, tensorrt_llm/_torch/memory/gpu_memory_backend.py | On the GMS RO path, the per-module post_load_weights loop is replaced with: _setup_aliases(model) → _check_gms_source_identity gate → materialize_module → _walk_cache_state. _setup_aliases changes from a single root-model call to a recursive module walk skipping _weights_removed modules. reload() resets _weights_transformed flags before weight loading. Docs in gpu_memory_backend.py update the stated call-order contract. |
| Tests: GMS ordering, recursive walk, idempotency<br>tests/unittest/_torch/pyexecutor/test_model_loader_gms.py, tests/unittest/_torch/pyexecutor/test_model_loader_mx.py | GMS tests add setup_aliases/cache_derived_state to _TinyModel, stub SourceIdentity.from_model_config, set RO get_source_identity to None, assert the new RO event sequence, and add a dedicated ordering test asserting post_load_weights is absent on RO. MX tests add imports, stub identity, assert _weights_transformed reset on reload, replace the top-level-only setup_aliases test with a recursive-walk test, and add idempotency tests for Linear.transform_weights and MLA.transform_weights. |
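The recursive `_setup_aliases` walk summarized above can be illustrated with a plain-Python module tree. This is a sketch under assumptions: `SketchModule`, `iter_modules()`, and `setup_aliases_walk()` are hypothetical names, the tree is not a `torch.nn.Module`, and skipping only the flagged module itself (rather than its whole subtree) is an assumption about the intended behavior.

```python
from typing import Iterator, List


class SketchModule:
    """Hypothetical tree node standing in for a model submodule."""

    def __init__(self, name: str, children: tuple = (),
                 weights_removed: bool = False) -> None:
        self.name = name
        self.children = list(children)
        self._weights_removed = weights_removed
        self.aliased = False

    def setup_aliases(self) -> None:
        # Stand-in for structural aliasing such as sharing a layer norm.
        self.aliased = True


def iter_modules(root: SketchModule) -> Iterator[SketchModule]:
    # Depth-first, parent before children.
    yield root
    for child in root.children:
        yield from iter_modules(child)


def setup_aliases_walk(root: SketchModule) -> List[str]:
    visited = []
    for module in iter_modules(root):
        # Modules whose weights were removed have nothing to alias.
        if getattr(module, "_weights_removed", False):
            continue
        hook = getattr(module, "setup_aliases", None)
        if callable(hook):
            hook()
            visited.append(module.name)
    return visited


model = SketchModule("model", children=(
    SketchModule("layer0", children=(SketchModule("layer0.norm"),)),
    SketchModule("layer1", weights_removed=True),
))
visited = setup_aliases_walk(model)
```

Compared with a single call on the root model, a walk like this reaches hooks defined on nested submodules, which is the behavior the new recursive-walk test is described as covering.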

Sequence Diagram(s)

sequenceDiagram
  rect rgba(173, 216, 230, 0.5)
    Note over ModelLoader,GMSBackend: GMS RO Load Path
  end
  participant ModelLoader
  participant CheckpointLoader
  participant GMSBackend
  participant Model

  ModelLoader->>CheckpointLoader: post_load_apply(weights_preloaded=True)
  ModelLoader->>Model: _setup_aliases() — recursive walk, skip _weights_removed
  ModelLoader->>GMSBackend: _check_gms_source_identity() — SourceIdentity gate
  ModelLoader->>GMSBackend: materialize_module(model) — bind real tensors
  ModelLoader->>Model: _walk_cache_state() — refresh derived state
  ModelLoader->>CheckpointLoader: post_load_publish()
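The read-only ordering in the diagram can also be expressed as a small event-recording sketch, similar in spirit to the ordering test described in the change summary. `RecordingModel`, `RecordingBackend`, and `load_read_only()` are hypothetical, and raising on an identity mismatch is an assumption about how the gate behaves.

```python
from typing import List


class RecordingModel:
    """Hypothetical model whose hooks only record that they were called."""

    def __init__(self, events: List[str]) -> None:
        self.events = events

    def setup_aliases(self) -> None:
        self.events.append("setup_aliases")

    def transform_weights(self) -> None:
        self.events.append("transform_weights")

    def cache_derived_state(self) -> None:
        self.events.append("cache_derived_state")


class RecordingBackend:
    """Hypothetical read-only backend that binds shared tensors."""

    def __init__(self, events: List[str]) -> None:
        self.events = events

    def materialize_module(self, model: RecordingModel) -> None:
        self.events.append("materialize_module")


def load_read_only(model: RecordingModel, backend: RecordingBackend,
                   identity_matches: bool) -> None:
    # 1. Structural aliases must exist before materialization.
    model.setup_aliases()
    # 2. Identity gate (assumed here to raise on mismatch).
    if not identity_matches:
        raise RuntimeError("source identity mismatch")
    # 3. Bind the shared tensors.
    backend.materialize_module(model)
    # 4. Refresh derived state; transform_weights() is never called here
    #    because layout changes are writer-only.
    model.cache_derived_state()


events: List[str] = []
load_read_only(RecordingModel(events), RecordingBackend(events),
               identity_matches=True)
```

Recording events in one shared list makes the call order a plain list comparison, which keeps an ordering test independent of any real GPU memory backend.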

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • NVIDIA/TensorRT-LLM#14878: Introduces the SourceIdentity / _check_gms_source_identity gate in the GMS RO pipeline that this PR now positions after setup_aliases and before materialize_module.

Suggested labels

api-breaking

Suggested reviewers

  • chang-l
  • brb-nv
  • byshiue
  • galletas1712
  • pcastonguay
  • yechank-nvidia
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 17.39% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title '[TRTLLM-13247][feat] Wave 2: stage Linear and Attention transforms' clearly and concisely summarizes the main change: implementing Wave 2 of a staged rollout to refactor tensor-layout transformations for Linear and Attention modules. |
| Description check | ✅ Passed | The PR description comprehensively covers what changed, why it changed, testing performed, and dependencies/prerequisites. It includes a detailed Summary section, What Changed section with specific technical details, clear Test Plan, and dependency graph showing Wave 2's place in the larger rollout. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tensorrt_llm/_torch/models/modeling_gpt_oss.py`:
- Line 634: The setup_aliases method is missing an explicit return type
annotation which violates the coding guidelines requiring all functions to be
annotated with their return types. Add -> None after the closing parenthesis of
the setup_aliases method signature to explicitly indicate that this method does
not return any value. This should be placed between the closing parenthesis and
the colon in the method definition.

In `@tensorrt_llm/_torch/models/modeling_qwen3_next.py`:
- Line 983: Add an explicit `-> None` return type annotation to the
`setup_aliases` method definition. Locate the method definition for
`setup_aliases` and modify it from `def setup_aliases(self):` to `def
setup_aliases(self) -> None:` to comply with the coding guideline that requires
all functions to have return type annotations.

In `@tensorrt_llm/_torch/modules/linear.py`:
- Around line 3145-3149: The `_weights_transformed` flag in the Linear class
becomes inaccurate when using GMS RO (Read-Only) materialization because
`materialize_module()` binds already transformed parameters but the flag remains
False, causing layout transforms to be incorrectly re-applied later. Add a RO
cache-state hook in the Linear module that sets `_weights_transformed = True`
when weights are materialized through the RO path, ensuring the flag truthfully
reflects the actual state of weight transformation. Apply the same state
handling logic to the MLA module to maintain consistency across both modules.
- Around line 383-384: The transform_weights method in LinearMethodBase class
currently violates Ruff rule B027 because it only contains a pass statement in a
concrete (non-abstract) method. Since this is an intentional optional hook that
should remain concrete rather than abstract, replace the pass statement with a
non-empty body such as ellipsis (...) or a docstring to satisfy the Ruff linter
while maintaining the optional hook functionality.
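The `_weights_transformed` finding for linear.py above is easier to follow with a toy reproduction. This is a sketch of the reviewer's concern, not a statement about what the current code does: `SketchLinear` and `bind_read_only()` are hypothetical, and the doubling step stands in for any layout transform that is not safe to apply twice.

```python
class SketchLinear:
    """Hypothetical module with a guarded, non-repeatable transform."""

    def __init__(self) -> None:
        self.weight: list = []
        self._weights_transformed = False

    def transform_weights(self) -> None:
        if self._weights_transformed:
            return
        self.weight = [w * 2 for w in self.weight]
        self._weights_transformed = True

    def bind_read_only(self, shared_weight: list, mark_transformed: bool) -> None:
        # Hypothetical RO binding: the tensors arrive already transformed
        # by the writer, so the flag should be updated to say so.
        self.weight = list(shared_weight)
        if mark_transformed:
            self._weights_transformed = True


# The writer transforms once and publishes the result.
writer = SketchLinear()
writer.weight = [1.0, 2.0]
writer.transform_weights()

# Reader that leaves the flag stale: a later call transforms again.
stale = SketchLinear()
stale.bind_read_only(writer.weight, mark_transformed=False)
stale.transform_weights()

# Reader that records the true state: the later call is a no-op.
synced = SketchLinear()
synced.bind_read_only(writer.weight, mark_transformed=True)
synced.transform_weights()
```

In this toy setup the stale reader ends up with weights transformed twice, while the reader that marks the flag keeps the writer's values, which is the outcome the suggested cache-state hook is meant to guarantee.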

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 402ced13-7870-45cc-83ca-6fa005ee6211

📥 Commits

Reviewing files that changed from the base of the PR and between 09449d4 and cf883fc.

📒 Files selected for processing (13)
  • tensorrt_llm/_torch/memory/gpu_memory_backend.py
  • tensorrt_llm/_torch/models/modeling_deepseekv3.py
  • tensorrt_llm/_torch/models/modeling_exaone_moe.py
  • tensorrt_llm/_torch/models/modeling_glm.py
  • tensorrt_llm/_torch/models/modeling_gpt_oss.py
  • tensorrt_llm/_torch/models/modeling_llama.py
  • tensorrt_llm/_torch/models/modeling_qwen3_moe.py
  • tensorrt_llm/_torch/models/modeling_qwen3_next.py
  • tensorrt_llm/_torch/modules/attention.py
  • tensorrt_llm/_torch/modules/linear.py
  • tensorrt_llm/_torch/pyexecutor/model_loader.py
  • tests/unittest/_torch/pyexecutor/test_model_loader_gms.py
  • tests/unittest/_torch/pyexecutor/test_model_loader_mx.py

Comment thread tensorrt_llm/_torch/models/modeling_gpt_oss.py Outdated
Comment thread tensorrt_llm/_torch/models/modeling_qwen3_next.py Outdated
Comment thread tensorrt_llm/_torch/modules/linear.py Outdated
Comment thread tensorrt_llm/_torch/modules/linear.py
@chienchunhung chienchunhung force-pushed the codex/staged-hooks-wave2-transform-weights branch from cf883fc to e5d3175 Compare June 23, 2026 05:11

Collaborator Author

/bot run

@tensorrt-cicd

Collaborator

PR_Github #55161 [ run ] triggered by Bot. Commit: e5d3175 Link to invocation

@tensorrt-cicd

Collaborator

PR_Github #55161 [ run ] completed with state SUCCESS. Commit: e5d3175
/LLM/main/L0_MergeRequest_PR pipeline #44136 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@chienchunhung chienchunhung force-pushed the codex/staged-hooks-wave2-transform-weights branch 2 times, most recently from abce27d to 896e764 Compare June 23, 2026 17:03

Collaborator Author

/bot run

@tensorrt-cicd

Collaborator

PR_Github #55288 [ run ] triggered by Bot. Commit: 896e764 Link to invocation

Collaborator Author

/bot run

@chienchunhung chienchunhung requested a review from litaotju June 23, 2026 17:43
@tensorrt-cicd

Collaborator

PR_Github #55288 [ run ] completed with state SUCCESS. Commit: 896e764
/LLM/main/L0_MergeRequest_PR pipeline #44239 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

@chienchunhung chienchunhung requested review from QiJune and xxi-nv June 23, 2026 21:39
Signed-off-by: Chien-Chun Hung <2679986+chienchunhung@users.noreply.github.com>
Signed-off-by: Chien-Chun Hung <2679986+chienchunhung@users.noreply.github.com>
@chienchunhung chienchunhung force-pushed the codex/staged-hooks-wave2-transform-weights branch from 71027a6 to 9450f16 Compare June 24, 2026 21:15

Collaborator Author

/bot run

@tensorrt-cicd

Collaborator

PR_Github #55595 [ run ] triggered by Bot. Commit: 9450f16 Link to invocation

@tensorrt-cicd

Collaborator

PR_Github #55595 [ run ] completed with state FAILURE. Commit: 9450f16
/LLM/main/L0_MergeRequest_PR pipeline #44513 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

@chienchunhung

Collaborator Author

/bot run --disable-fail-fast

@tensorrt-cicd

Collaborator

PR_Github #55610 [ run ] triggered by Bot. Commit: 9450f16 Link to invocation

@tensorrt-cicd

Collaborator

PR_Github #55610 [ run ] completed with state FAILURE. Commit: 9450f16
/LLM/main/L0_MergeRequest_PR pipeline #44528 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation
