Skip to content

[None][chore] Small cleanups to MultimodalModelMixin#15322

Open
2ez4bz wants to merge 1 commit into
NVIDIA:mainfrom
2ez4bz:dev-mixin-cleanups
Open

[None][chore] Small cleanups to MultimodalModelMixin#15322
2ez4bz wants to merge 1 commit into
NVIDIA:mainfrom
2ez4bz:dev-mixin-cleanups

Conversation

@2ez4bz

@2ez4bz 2ez4bz commented Jun 12, 2026

Copy link
Copy Markdown
Collaborator

Summary by CodeRabbit

  • Refactor

    • Simplified multimodal encoder output handling to return tensors directly instead of wrapper objects.
    • Updated multimodal input processing pipeline with improved internal validation logic.
    • Enhanced encoder flexibility to support multiple embedding output formats.
  • Tests

    • Added new test coverage for tensor-based encoder outputs in multimodal input processing.

Description

Test Coverage

PR Checklist

Please review the following before submitting your PR:

  • PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.

  • PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.

  • Test cases are provided for new code paths (see test instructions)

  • If PR introduces API changes, an appropriate PR label is added - either api-compatible or api-breaking. For api-breaking, include BREAKING in the PR title.

  • Any new dependencies have been scanned for license and vulnerabilities

  • CODEOWNERS updated if ownership changes

  • Documentation updated as needed

  • Update tava architecture diagram if there is a significant design change in PR.

  • The reviewers assigned automatically/manually are appropriate for the PR.

  • Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

@2ez4bz 2ez4bz requested review from a team as code owners June 12, 2026 21:27
@coderabbitai

coderabbitai Bot commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

📝 Walkthrough

Walkthrough

This PR simplifies the multimodal encoder contract by changing from returning a MultimodalEncoderOutput wrapper dataclass to returning raw torch.Tensor. The changes cascade through the mixin's hook signatures, internal flow orchestration, encoder output normalization, and concrete implementations.

Changes

Multimodal Encoder Contract Simplification

Layer / File(s) Summary
Encoder contract and dataclass removal
tensorrt_llm/_torch/models/modeling_multimodal_mixin.py
Removes MultimodalEncoderOutput dataclass definition and changes encode_multimodal_inputs method to return torch.Tensor directly instead of a wrapper.
Hook signature update for tensor inputs
tensorrt_llm/_torch/models/modeling_multimodal_mixin.py
Updates after_full_multimodal_embeddings hook contract to accept and return tensors rather than encoder output objects.
Encoder output normalization support
tensorrt_llm/_torch/models/modeling_multimodal_utils.py
Introduces _normalize_encoder_embeddings helper to support encoder functions returning either a single tensor or a list, normalizing both forms to consistent list representation.
Mixin flow refactoring and validation consolidation
tensorrt_llm/_torch/models/modeling_multimodal_mixin.py
Refactors prepare_multimodal_inputs and _get_or_encode_multimodal_embeddings to work directly with tensor outputs; consolidates validation logic via new _validate_embeddings method replacing prior separate validation routines.
Mistral model implementation of tensor contract
tensorrt_llm/_torch/models/modeling_mistral.py
Updates Mistral3VLM.encode_multimodal_inputs to return tensor directly and adds PreparedLlmInputs to multiline import.
Test helpers and tensor encoder path validation
tests/unittest/_torch/multimodal/test_multimodal_mixin.py
Introduces TensorEncoderMultimodalModel and make_raw_multimodal_param helpers; adds test case validating prepare_multimodal_inputs with tensor-returning encoder.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 77.78% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Description check ⚠️ Warning The PR description is incomplete. While the template structure is present with sections for Description and Test Coverage, both critical sections are empty with only placeholder comments. Fill in the Description section explaining what changes were made and why, and the Test Coverage section listing relevant tests that validate the changes. The PR introduces significant API changes (removing MultimodalEncoderOutput, changing return types) that need clear explanation.
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main change: simplifying the MultimodalModelMixin API by removing the MultimodalEncoderOutput wrapper and making encode_multimodal_inputs return torch.Tensor directly.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
tests/unittest/_torch/multimodal/test_multimodal_mixin.py (1)

69-70: 📐 Maintainability & Code Quality | ⚡ Quick win

Coverage misses the cacheable tensor path.

make_raw_multimodal_param() omits multimodal_runtime.total_embeds_in_request, so this test only exercises the early-return path in get_multimodal_embeddings() and the "skip validation" branch in MultimodalModelMixin._validate_embeddings(). Please add cases in tests/unittest/_torch/multimodal/test_multimodal_mixin.py that populate runtime metadata and assert both successful cache/gather of tensor-returning encoders and a ValueError on row-count mismatch.

As per coding guidelines, changes under tests/** should explicitly call out whether coverage is sufficient and identify concrete follow-up files.

Also applies to: 107-138

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unittest/_torch/multimodal/test_multimodal_mixin.py` around lines 69 -
70, The test helper make_raw_multimodal_param() currently omits
multimodal_runtime.total_embeds_in_request so tests only hit the early-return
branch of get_multimodal_embeddings() and the "skip validation" branch of
MultimodalModelMixin._validate_embeddings(); update the unit tests to (1) add
cases that populate MultimodalParams.multimodal_runtime.total_embeds_in_request
with the expected row count, (2) verify that tensor-returning encoders are
cached/gathered correctly (exercise the cacheable tensor path in
get_multimodal_embeddings and the gather logic), and (3) add a failing case
where total_embeds_in_request mismatches actual returned rows and assert a
ValueError from _validate_embeddings; target the tests around
make_raw_multimodal_param(), get_multimodal_embeddings, and
MultimodalModelMixin._validate_embeddings so coverage includes both successful
cache/gather and row-count mismatch branches.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tensorrt_llm/_torch/models/modeling_multimodal_mixin.py`:
- Around line 285-290: The current early-return accepts mixed runtime metadata
and causes inconsistent slicing; change the check around the existing "if
runtime is None or runtime.total_embeds_in_request is None:" so you instead
inspect all multimodal_params' runtime/total_embeds_in_request: if every param
lacks runtime.total_embeds_in_request, keep the debug log and return; if some
params have it and some don't, raise an error (ValueError) immediately to fail
fast; update the code path in the same function (referencing multimodal_params,
runtime, total_embeds_in_request and find_input_mm_embeds) to implement this
all-or-none validation.

---

Nitpick comments:
In `@tests/unittest/_torch/multimodal/test_multimodal_mixin.py`:
- Around line 69-70: The test helper make_raw_multimodal_param() currently omits
multimodal_runtime.total_embeds_in_request so tests only hit the early-return
branch of get_multimodal_embeddings() and the "skip validation" branch of
MultimodalModelMixin._validate_embeddings(); update the unit tests to (1) add
cases that populate MultimodalParams.multimodal_runtime.total_embeds_in_request
with the expected row count, (2) verify that tensor-returning encoders are
cached/gathered correctly (exercise the cacheable tensor path in
get_multimodal_embeddings and the gather logic), and (3) add a failing case
where total_embeds_in_request mismatches actual returned rows and assert a
ValueError from _validate_embeddings; target the tests around
make_raw_multimodal_param(), get_multimodal_embeddings, and
MultimodalModelMixin._validate_embeddings so coverage includes both successful
cache/gather and row-count mismatch branches.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 1f1f4bf9-dd79-43d1-87a6-8a1c8794089d

📥 Commits

Reviewing files that changed from the base of the PR and between 646464b and 733542c.

📒 Files selected for processing (4)
  • tensorrt_llm/_torch/models/modeling_mistral.py
  • tensorrt_llm/_torch/models/modeling_multimodal_mixin.py
  • tensorrt_llm/_torch/models/modeling_multimodal_utils.py
  • tests/unittest/_torch/multimodal/test_multimodal_mixin.py

Comment thread tensorrt_llm/_torch/models/modeling_multimodal_mixin.py Outdated
Comment thread tensorrt_llm/_torch/models/modeling_multimodal_mixin.py

@aswinvisva aswinvisva left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm!

@2ez4bz 2ez4bz force-pushed the dev-mixin-cleanups branch from 733542c to 1247439 Compare June 13, 2026 04:22
@2ez4bz

2ez4bz commented Jun 13, 2026

Copy link
Copy Markdown
Collaborator Author

/bot run

@2ez4bz 2ez4bz enabled auto-merge (squash) June 13, 2026 06:00
@tensorrt-cicd

Copy link
Copy Markdown
Collaborator

PR_Github #54019 [ run ] triggered by Bot. Commit: 1247439 Link to invocation

@tensorrt-cicd

Copy link
Copy Markdown
Collaborator

PR_Github #54019 [ run ] completed with state FAILURE. Commit: 1247439
/LLM/main/L0_MergeRequest_PR pipeline #43103 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

@2ez4bz

2ez4bz commented Jun 14, 2026

Copy link
Copy Markdown
Collaborator Author

/bot run

@tensorrt-cicd

Copy link
Copy Markdown
Collaborator

PR_Github #54090 [ run ] triggered by Bot. Commit: ebffba0 Link to invocation

@tensorrt-cicd

Copy link
Copy Markdown
Collaborator

PR_Github #54090 [ run ] completed with state SUCCESS. Commit: ebffba0
/LLM/main/L0_MergeRequest_PR pipeline #43174 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

@2ez4bz

2ez4bz commented Jun 16, 2026

Copy link
Copy Markdown
Collaborator Author

/bot run --disable-fail-fast

@github-actions

Copy link
Copy Markdown

👎 Promotion blocked, new vulnerability found

Vulnerability report

Component Vulnerability Description Severity
pytorch CVE-2025-3000 A vulnerability classified as critical has been found in PyTorch 2.6.0. This affects the function torch.jit.script. The manipulation leads to memory corruption. It is possible to launch the attack on the local host. The exploit has been disclosed to the public and may be used. MEDIUM

Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
@2ez4bz 2ez4bz force-pushed the dev-mixin-cleanups branch from f283e2a to 796222d Compare June 18, 2026 04:25
@2ez4bz

2ez4bz commented Jun 18, 2026

Copy link
Copy Markdown
Collaborator Author

/bot run --disable-fail-fast

@tensorrt-cicd

Copy link
Copy Markdown
Collaborator

PR_Github #54904 [ run ] triggered by Bot. Commit: 796222d Link to invocation

@tensorrt-cicd

Copy link
Copy Markdown
Collaborator

PR_Github #54904 [ run ] completed with state SUCCESS. Commit: 796222d
/LLM/main/L0_MergeRequest_PR pipeline #43908 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants