[TRTLLM-11353][feat] API to configure TeaCache coefficients#13170
Description
Extends TeaCache to accept user-supplied polynomial coefficients (rather than relying solely on the built-in checkpoint lookup table), unlocking two use cases:
Enable TeaCache on previously unsupported models — Wan 2.2 (T2V A14B, I2V A14B, TI2V-5B) and LTX-2, which had no entries in the built-in coefficient table.
Override defaults on already-supported models (Wan 2.1, FLUX.1, FLUX.2) to tune the rescale polynomial for a custom quality/latency trade-off.
When coefficients is omitted, the pipeline falls back to checkpoint-path matching against the built-in table (existing behavior preserved).
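The resolution order described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: `BUILTIN_COEFFICIENTS`, its key and values, and `resolve_coefficients` are hypothetical names and placeholders, and the substring match is an assumption about how checkpoint-path matching might work.

```python
from typing import Dict, List, Optional, Sequence

# Hypothetical stand-in for the built-in lookup table.
# The key and the values are placeholders, not real TeaCache coefficients.
BUILTIN_COEFFICIENTS: Dict[str, List[float]] = {
    "example-model-a": [1.0, 0.0],
}


def resolve_coefficients(
    checkpoint_path: str,
    user_coefficients: Optional[Sequence[float]] = None,
    table: Dict[str, List[float]] = BUILTIN_COEFFICIENTS,
) -> List[float]:
    """Prefer user-supplied coefficients, else fall back to the table."""
    # 1. Explicit coefficients always win (override or unsupported model).
    if user_coefficients is not None:
        return list(user_coefficients)

    # 2. Otherwise match the checkpoint path against the table keys
    #    (assumed here to be a case-insensitive substring match).
    lowered = checkpoint_path.lower()
    for key, coeffs in table.items():
        if key in lowered:
            return list(coeffs)

    # 3. No user value and no table entry: the cache cannot be configured.
    raise ValueError(
        f"No coefficients supplied and no built-in entry matches {checkpoint_path!r}"
    )
```

With this shape, a model that has no table entry only works when coefficients are passed explicitly, which mirrors the first use case above.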
How to supply coefficients:
1.) Via YAML (passed to trtllm-serve --extra_visual_gen_options or to offline example scripts via the same flag):
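coefficients defines the polynomial that maps the raw embedding distance to the rescaled one. It is evaluated via np.poly1d, so the values are listed from the highest-degree term down to the constant. coefficients_2 is the second-stage polynomial and is used only by Wan 2.2 dual-transformer pipelines.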
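To make the coefficient ordering concrete, here is a minimal sketch of how a coefficient list turns a raw distance into a rescaled one with np.poly1d. The helper names and the numeric values are illustrative assumptions, not tuned coefficients for any model.

```python
import numpy as np


def rescale_distance(raw_distance: float, coefficients: list) -> float:
    """Evaluate the rescale polynomial at raw_distance.

    np.poly1d expects coefficients ordered from the highest-degree
    term to the constant term.
    """
    return float(np.poly1d(coefficients)(raw_distance))


def rescale_distance_horner(raw_distance: float, coefficients: list) -> float:
    """Same polynomial, evaluated by hand with Horner's rule."""
    result = 0.0
    for c in coefficients:
        result = result * raw_distance + c
    return result


# Illustrative values only: p(x) = 2x^2 - x + 0.5
example = [2.0, -1.0, 0.5]
rescaled = rescale_distance(0.5, example)  # 2*0.25 - 0.5 + 0.5 = 0.5
```

Reversing the list would describe a different polynomial, so a list copied from another source is worth checking against its ordering convention before use.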
2.) Via CLI (offline example scripts):
Test Coverage
PR Checklist
Please review the following before submitting your PR:
PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
PR follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
Test cases are provided for new code paths (see test instructions)
Any new dependencies have been scanned for license and vulnerabilities
CODEOWNERS updated if ownership changes
Documentation updated as needed
Update tava architecture diagram if there is a significant design change in PR.
The reviewers assigned automatically/manually are appropriate for the PR.
Please check this after reviewing the above items as appropriate for this PR.
GitHub Bot Help
To see a list of available CI bot commands, please comment /bot help.