[None][chore] Small cleanups to MultimodalModelMixin by 2ez4bz · Pull Request #15322 · NVIDIA/TensorRT-LLM

2ez4bz · 2026-06-12T21:27:48Z

Summary by CodeRabbit

Refactor
- Simplified multimodal encoder output handling to return tensors directly instead of wrapper objects.
- Updated multimodal input processing pipeline with improved internal validation logic.
- Enhanced encoder flexibility to support multiple embedding output formats.
Tests
- Added new test coverage for tensor-based encoder outputs in multimodal input processing.

Description

Test Coverage

PR Checklist

Please review the following before submitting your PR:

PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
Test cases are provided for new code paths (see test instructions)
If PR introduces API changes, an appropriate PR label is added - either api-compatible or api-breaking. For api-breaking, include BREAKING in the PR title.
Any new dependencies have been scanned for license and vulnerabilities
CODEOWNERS updated if ownership changes
Documentation updated as needed
Update tava architecture diagram if there is a significant design change in PR.
The reviewers assigned automatically/manually are appropriate for the PR.
Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

coderabbitai · 2026-06-12T21:34:37Z

📝 Walkthrough

Walkthrough

This PR simplifies the multimodal encoder contract by changing from returning a MultimodalEncoderOutput wrapper dataclass to returning raw torch.Tensor. The changes cascade through the mixin's hook signatures, internal flow orchestration, encoder output normalization, and concrete implementations.

Changes

Multimodal Encoder Contract Simplification

Layer / File(s)	Summary
Encoder contract and dataclass removal `tensorrt_llm/_torch/models/modeling_multimodal_mixin.py`	Removes `MultimodalEncoderOutput` dataclass definition and changes `encode_multimodal_inputs` method to return `torch.Tensor` directly instead of a wrapper.
Hook signature update for tensor inputs `tensorrt_llm/_torch/models/modeling_multimodal_mixin.py`	Updates `after_full_multimodal_embeddings` hook contract to accept and return tensors rather than encoder output objects.
Encoder output normalization support `tensorrt_llm/_torch/models/modeling_multimodal_utils.py`	Introduces `_normalize_encoder_embeddings` helper to support encoder functions returning either a single tensor or a list, normalizing both forms to consistent list representation.
Mixin flow refactoring and validation consolidation `tensorrt_llm/_torch/models/modeling_multimodal_mixin.py`	Refactors `prepare_multimodal_inputs` and `_get_or_encode_multimodal_embeddings` to work directly with tensor outputs; consolidates validation logic via new `_validate_embeddings` method replacing prior separate validation routines.
Mistral model implementation of tensor contract `tensorrt_llm/_torch/models/modeling_mistral.py`	Updates `Mistral3VLM.encode_multimodal_inputs` to return tensor directly and adds `PreparedLlmInputs` to multiline import.
Test helpers and tensor encoder path validation `tests/unittest/_torch/multimodal/test_multimodal_mixin.py`	Introduces `TensorEncoderMultimodalModel` and `make_raw_multimodal_param` helpers; adds test case validating `prepare_multimodal_inputs` with tensor-returning encoder.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 77.78% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.
Description check	⚠️ Warning	The PR description is incomplete. While the template structure is present with sections for Description and Test Coverage, both critical sections are empty with only placeholder comments.	Fill in the Description section explaining what changes were made and why, and the Test Coverage section listing relevant tests that validate the changes. The PR introduces significant API changes (removing MultimodalEncoderOutput, changing return types) that need clear explanation.

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly summarizes the main change: simplifying the MultimodalModelMixin API by removing the MultimodalEncoderOutput wrapper and making encode_multimodal_inputs return torch.Tensor directly.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

tests/unittest/_torch/multimodal/test_multimodal_mixin.py (1)
69-70: 📐 Maintainability & Code Quality | ⚡ Quick win

Coverage misses the cacheable tensor path.

make_raw_multimodal_param() omits multimodal_runtime.total_embeds_in_request, so this test only exercises the early-return path in get_multimodal_embeddings() and the "skip validation" branch in MultimodalModelMixin._validate_embeddings(). Please add cases in tests/unittest/_torch/multimodal/test_multimodal_mixin.py that populate runtime metadata and assert both successful cache/gather of tensor-returning encoders and a ValueError on row-count mismatch.

As per coding guidelines, changes under tests/** should explicitly call out whether coverage is sufficient and identify concrete follow-up files.

Also applies to: 107-138
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unittest/_torch/multimodal/test_multimodal_mixin.py` around lines 69 -
70, The test helper make_raw_multimodal_param() currently omits
multimodal_runtime.total_embeds_in_request so tests only hit the early-return
branch of get_multimodal_embeddings() and the "skip validation" branch of
MultimodalModelMixin._validate_embeddings(); update the unit tests to (1) add
cases that populate MultimodalParams.multimodal_runtime.total_embeds_in_request
with the expected row count, (2) verify that tensor-returning encoders are
cached/gathered correctly (exercise the cacheable tensor path in
get_multimodal_embeddings and the gather logic), and (3) add a failing case
where total_embeds_in_request mismatches actual returned rows and assert a
ValueError from _validate_embeddings; target the tests around
make_raw_multimodal_param(), get_multimodal_embeddings, and
MultimodalModelMixin._validate_embeddings so coverage includes both successful
cache/gather and row-count mismatch branches.
Source: Coding guidelines

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tensorrt_llm/_torch/models/modeling_multimodal_mixin.py`:
- Around line 285-290: The current early-return accepts mixed runtime metadata
and causes inconsistent slicing; change the check around the existing "if
runtime is None or runtime.total_embeds_in_request is None:" so you instead
inspect all multimodal_params' runtime/total_embeds_in_request: if every param
lacks runtime.total_embeds_in_request, keep the debug log and return; if some
params have it and some don't, raise an error (ValueError) immediately to fail
fast; update the code path in the same function (referencing multimodal_params,
runtime, total_embeds_in_request and find_input_mm_embeds) to implement this
all-or-none validation.

---

Nitpick comments:
In `@tests/unittest/_torch/multimodal/test_multimodal_mixin.py`:
- Around line 69-70: The test helper make_raw_multimodal_param() currently omits
multimodal_runtime.total_embeds_in_request so tests only hit the early-return
branch of get_multimodal_embeddings() and the "skip validation" branch of
MultimodalModelMixin._validate_embeddings(); update the unit tests to (1) add
cases that populate MultimodalParams.multimodal_runtime.total_embeds_in_request
with the expected row count, (2) verify that tensor-returning encoders are
cached/gathered correctly (exercise the cacheable tensor path in
get_multimodal_embeddings and the gather logic), and (3) add a failing case
where total_embeds_in_request mismatches actual returned rows and assert a
ValueError from _validate_embeddings; target the tests around
make_raw_multimodal_param(), get_multimodal_embeddings, and
MultimodalModelMixin._validate_embeddings so coverage includes both successful
cache/gather and row-count mismatch branches.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 1f1f4bf9-dd79-43d1-87a6-8a1c8794089d

📥 Commits

Reviewing files that changed from the base of the PR and between 646464b and 733542c.

📒 Files selected for processing (4)

tensorrt_llm/_torch/models/modeling_mistral.py
tensorrt_llm/_torch/models/modeling_multimodal_mixin.py
tensorrt_llm/_torch/models/modeling_multimodal_utils.py
tests/unittest/_torch/multimodal/test_multimodal_mixin.py

aswinvisva

lgtm!

2ez4bz · 2026-06-13T06:00:21Z

/bot run

tensorrt-cicd · 2026-06-13T06:05:55Z

PR_Github #54019 [ run ] triggered by Bot. Commit: 1247439 Link to invocation

tensorrt-cicd · 2026-06-13T07:05:25Z

PR_Github #54019 [ run ] completed with state FAILURE. Commit: 1247439
/LLM/main/L0_MergeRequest_PR pipeline #43103 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

Please check the failed tests and fix your PR
If you cannot view the failures, ask the CI triggerer to share details
Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

2ez4bz · 2026-06-14T06:08:41Z

/bot run

tensorrt-cicd · 2026-06-14T06:14:06Z

PR_Github #54090 [ run ] triggered by Bot. Commit: ebffba0 Link to invocation

tensorrt-cicd · 2026-06-14T06:59:38Z

PR_Github #54090 [ run ] completed with state SUCCESS. Commit: ebffba0
/LLM/main/L0_MergeRequest_PR pipeline #43174 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

Please check the failed tests and fix your PR
If you cannot view the failures, ask the CI triggerer to share details
Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

2ez4bz · 2026-06-16T05:21:09Z

/bot run --disable-fail-fast

github-actions · 2026-06-16T05:26:08Z

👎 Promotion blocked, new vulnerability found

Vulnerability report

Component	Vulnerability	Description	Severity
pytorch	CVE-2025-3000	A vulnerability classified as critical has been found in PyTorch 2.6.0. This affects the function torch.jit.script. The manipulation leads to memory corruption. It is possible to launch the attack on the local host. The exploit has been disclosed to the public and may be used.	MEDIUM

Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>

2ez4bz · 2026-06-18T04:26:29Z

/bot run --disable-fail-fast

tensorrt-cicd · 2026-06-18T04:32:50Z

PR_Github #54904 [ run ] triggered by Bot. Commit: 796222d Link to invocation

tensorrt-cicd · 2026-06-18T08:04:41Z

PR_Github #54904 [ run ] completed with state SUCCESS. Commit: 796222d
/LLM/main/L0_MergeRequest_PR pipeline #43908 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

Please check the failed tests and fix your PR
If you cannot view the failures, ask the CI triggerer to share details
Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

2ez4bz requested review from a team as code owners June 12, 2026 21:27

2ez4bz requested review from jdebache, tijyojwad and yechank-nvidia June 12, 2026 21:27

github-actions Bot assigned 2ez4bz Jun 12, 2026

coderabbitai Bot reviewed Jun 12, 2026

View reviewed changes

Comment thread tensorrt_llm/_torch/models/modeling_multimodal_mixin.py Outdated

moraxu reviewed Jun 12, 2026

View reviewed changes

Comment thread tensorrt_llm/_torch/models/modeling_multimodal_mixin.py

aswinvisva approved these changes Jun 12, 2026

View reviewed changes

moraxu approved these changes Jun 12, 2026

View reviewed changes

2ez4bz force-pushed the dev-mixin-cleanups branch from 733542c to 1247439 Compare June 13, 2026 04:22

2ez4bz enabled auto-merge (squash) June 13, 2026 06:00

[None][chore] Small cleanups to MultimodalModelMixin

796222d

Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>

2ez4bz force-pushed the dev-mixin-cleanups branch from f283e2a to 796222d Compare June 18, 2026 04:25

Conversation

2ez4bz commented Jun 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Description

Test Coverage

PR Checklist

GitHub Bot Help

Uh oh!

coderabbitai Bot commented Jun 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

❌ Failed checks (2 warnings)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

aswinvisva left a comment

Choose a reason for hiding this comment

Uh oh!

2ez4bz commented Jun 13, 2026

Uh oh!

tensorrt-cicd commented Jun 13, 2026

Uh oh!

tensorrt-cicd commented Jun 13, 2026

Uh oh!

2ez4bz commented Jun 14, 2026

Uh oh!

tensorrt-cicd commented Jun 14, 2026

Uh oh!

tensorrt-cicd commented Jun 14, 2026

Uh oh!

2ez4bz commented Jun 16, 2026

Uh oh!

github-actions Bot commented Jun 16, 2026

👎 Promotion blocked, new vulnerability found

Vulnerability report

Uh oh!

2ez4bz commented Jun 18, 2026

Uh oh!

tensorrt-cicd commented Jun 18, 2026

Uh oh!

tensorrt-cicd commented Jun 18, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

2ez4bz commented Jun 12, 2026 •

edited

Loading

coderabbitai Bot commented Jun 12, 2026 •

edited

Loading