[TRTLLM-11353][feat] API to configure TeaCache coefficients by o-stoner · Pull Request #13170 · NVIDIA/TensorRT-LLM

o-stoner · 2026-04-17T23:39:44Z

Summary by CodeRabbit

New Features
- Added CLI options to override TeaCache polynomial coefficients for visual generation models.
- Enabled TeaCache support for LTX-2 with explicit coefficient configuration requirements.
- Enhanced dual-transformer TeaCache handling for Wan 2.2 models.
Documentation
- Clarified per-model TeaCache coefficient requirements in feature matrix and configuration guides.
- Updated examples with TeaCache configuration details and usage instructions.
Tests
- Expanded TeaCache unit test coverage for coefficient resolution, validation, and multi-backend scenarios.

Description

Extends TeaCache to accept user-supplied polynomial coefficients (rather than relying solely on the built-in checkpoint lookup table), unlocking two use cases:

- Enable TeaCache on previously unsupported models: Wan 2.2 (T2V A14B, I2V A14B, TI2V-5B) and LTX-2, which had no entries in the built-in coefficient table.
- Override defaults on already-supported models (Wan 2.1, FLUX.1, FLUX.2) to tune the rescale polynomial for a custom quality/latency trade-off.

When coefficients is omitted, the pipeline falls back to checkpoint-path matching against the built-in table (existing behavior preserved).
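
The resolution order described above (user-supplied coefficients first, then checkpoint-path matching) could look roughly like the sketch below. This is illustrative only: `BUILTIN_COEFFICIENTS`, its keys and values, and `resolve_coefficients` are hypothetical names and placeholder data, not the actual TensorRT-LLM table or API.

```python
from typing import Dict, List, Optional, Sequence

# Hypothetical stand-in for the built-in table: checkpoint-name substring -> coefficients.
# Keys and values are placeholders, not real TeaCache coefficients.
BUILTIN_COEFFICIENTS: Dict[str, List[float]] = {
    "example-model-a": [1.0, 0.0],
    "example-model-b": [2.0, -0.5, 0.1],
}


def resolve_coefficients(
    checkpoint_path: str,
    user_coefficients: Optional[Sequence[float]] = None,
) -> List[float]:
    """Return user-supplied coefficients if given, else fall back to the table."""
    if user_coefficients is not None:
        if len(user_coefficients) == 0:
            raise ValueError("coefficients must not be empty")
        return [float(c) for c in user_coefficients]

    # Fallback: match the checkpoint path against the built-in table.
    for key, coeffs in BUILTIN_COEFFICIENTS.items():
        if key in checkpoint_path:
            return list(coeffs)

    # No override and no table entry: the caller must supply coefficients.
    raise ValueError(
        f"No built-in TeaCache coefficients for '{checkpoint_path}'; "
        "supply them explicitly."
    )
```

Under this sketch, a model without a table entry (the situation described for LTX-2 and Wan 2.2) fails with an error unless coefficients are passed explicitly.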

How to supply coefficients:

1.) Via YAML (passed to trtllm-serve --extra_visual_gen_options or to offline example scripts via the same flag):

cache:
  cache_backend: teacache
  teacache_thresh: 0.2
  use_ret_steps: false
  coefficients: [c0, c1, ...] # optional override; REQUIRED for LTX-2 and Wan 2.2
  coefficients_2: [c0, c1, ...] # REQUIRED for Wan 2.2 dual-stage (T2V/I2V A14B)

coefficients defines the polynomial that maps the raw embedding distance to the rescaled embedding distance (evaluated via np.poly1d). coefficients_2 is the second-stage polynomial, used only by Wan 2.2 dual-transformer pipelines.
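
As a rough illustration of the rescaling step, the sketch below evaluates such a polynomial in plain Python using Horner's rule. It follows the ordering convention of np.poly1d, where the first coefficient belongs to the highest-degree term; whether the pipeline passes the YAML list to np.poly1d unchanged is an assumption here, and `rescale_distance` is a hypothetical helper name.

```python
from typing import Sequence


def rescale_distance(coefficients: Sequence[float], raw_distance: float) -> float:
    """Evaluate the rescale polynomial at raw_distance.

    Coefficients are ordered highest degree first, matching np.poly1d:
    [a, b, c] means a*x**2 + b*x + c.
    """
    result = 0.0
    for c in coefficients:
        # Horner's rule: fold in one coefficient per step.
        result = result * raw_distance + c
    return result


# Example with placeholder coefficients: x**2 + 2*x + 3 evaluated at x = 2.
value = rescale_distance([1.0, 2.0, 3.0], 2.0)
```

If the ordering assumption holds, the last list entry is the constant term, so a two-element list `[a, b]` describes the line `a*x + b`.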

2.) Via CLI (offline example scripts):

--enable_teacache \
--teacache_thresh 0.2 \
--teacache_coefficients <c0> <c1> ... \
--teacache_coefficients_2 <c0> <c1> ...   # Wan 2.2 dual-stage only
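
For readers wiring these flags into their own script, the sketch below shows one way the four flags above might be parsed with argparse and mapped onto the `cache` keys from the YAML form. The flag names come from this PR; `build_parser`, `cache_options_from_args`, and the 0.2 default are assumptions for illustration, and the real example scripts may be structured differently.

```python
import argparse
from typing import Any, Dict


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="TeaCache flag parsing sketch")
    parser.add_argument("--enable_teacache", action="store_true")
    # Default of 0.2 is an assumption taken from the example values above.
    parser.add_argument("--teacache_thresh", type=float, default=0.2)
    # nargs="+" collects one or more space-separated floats into a list.
    parser.add_argument("--teacache_coefficients", type=float, nargs="+", default=None)
    parser.add_argument("--teacache_coefficients_2", type=float, nargs="+", default=None)
    return parser


def cache_options_from_args(args: argparse.Namespace) -> Dict[str, Any]:
    """Translate parsed flags into a dict shaped like the YAML `cache` section."""
    if not args.enable_teacache:
        return {}
    cache: Dict[str, Any] = {
        "cache_backend": "teacache",
        "teacache_thresh": args.teacache_thresh,
    }
    # Only emit the override keys when the user actually passed them,
    # so the built-in table fallback still applies otherwise.
    if args.teacache_coefficients is not None:
        cache["coefficients"] = args.teacache_coefficients
    if args.teacache_coefficients_2 is not None:
        cache["coefficients_2"] = args.teacache_coefficients_2
    return {"cache": cache}
```

Leaving `coefficients` out of the dict when the flag is absent keeps the omitted-coefficients behavior identical between the CLI and YAML paths.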

Test Coverage

PR Checklist

Please review the following before submitting your PR:

PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
Test cases are provided for new code paths (see test instructions)
Any new dependencies have been scanned for license and vulnerabilities
CODEOWNERS updated if ownership changes
Documentation updated as needed
Update tava architecture diagram if there is a significant design change in PR.
The reviewers assigned automatically/manually are appropriate for the PR.
Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

o-stoner · 2026-04-18T00:02:22Z

/bot run --disable-fail-fast

tensorrt-cicd · 2026-04-18T00:08:40Z

PR_Github #44077 [ run ] triggered by Bot. Commit: 55cab64 Link to invocation

tensorrt-cicd · 2026-04-18T13:49:25Z

PR_Github #44077 [ run ] completed with state SUCCESS. Commit: 55cab64
/LLM/main/L0_MergeRequest_PR pipeline #34507 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

Please check the failed tests and fix your PR
If you cannot view the failures, ask the CI triggerer to share details
Once fixed, request an NVIDIA team member to trigger CI again

