[fix] Add logits_to_keep and shift_labels support for Qwen3-VL and Qwen3-VL-MoE #1181
Merged
Mecoli1219 merged 3 commits into linkedin:main on Apr 17, 2026
Contributor
Author
Test Environment
Long-Sequence Benchmark

Dense:

| Sequence Length | Full Logits Latency | Keep1 Latency | Latency Delta | Full Peak Mem | Keep1 Peak Mem | Memory Delta |
|---|---|---|---|---|---|---|
| 755 | 0.1111 s | 0.1051 s | -5.4% | 16.809 GB | 16.674 GB | -0.135 GB |
| 2795 | 0.3733 s | 0.3484 s | -6.7% | 17.979 GB | 17.469 GB | -0.510 GB |
| 5515 | 0.7592 s | 0.7097 s | -6.5% | 19.539 GB | 18.531 GB | -1.008 GB |
| 10955 | 1.6723 s | 1.5753 s | -5.8% | 22.660 GB | 20.655 GB | -2.005 GB |
MoE: Qwen3-VL-30B-A3B-Instruct
| Sequence Length | Full Logits Latency | Keep1 Latency | Latency Delta | Full Peak Mem | Keep1 Peak Mem | Memory Delta |
|---|---|---|---|---|---|---|
| 755 | 0.2218 s | 0.2245 s | +1.2% | 58.348 GB | 58.313 GB | -0.035 GB |
| 2795 | 0.3642 s | 0.3510 s | -3.6% | 59.511 GB | 59.365 GB | -0.146 GB |
| 5515 | 0.6716 s | 0.6461 s | -3.8% | 61.061 GB | 60.772 GB | -0.289 GB |
| 10955 | 1.4846 s | 1.4352 s | -3.3% | 64.161 GB | 63.585 GB | -0.576 GB |
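
To make the memory trend in the two tables easier to read, the sketch below normalizes the reported Memory Delta values to GB saved per 1,000 tokens. The numbers are copied from the tables above; the helper name `saved_gb_per_1k_tokens` is hypothetical and not part of the PR.

```python
# Memory Delta values copied from the two benchmark tables (GB, negative = saved).
DENSE = {755: -0.135, 2795: -0.510, 5515: -1.008, 10955: -2.005}
MOE = {755: -0.035, 2795: -0.146, 5515: -0.289, 10955: -0.576}


def saved_gb_per_1k_tokens(deltas):
    """Peak memory saved per 1,000 tokens for each sequence length.

    Hypothetical helper for illustration only.
    """
    return {seq_len: -delta / seq_len * 1000 for seq_len, delta in deltas.items()}


dense_rate = saved_gb_per_1k_tokens(DENSE)
moe_rate = saved_gb_per_1k_tokens(MOE)

for seq_len in DENSE:
    print(
        f"{seq_len:>6} tokens: dense {dense_rate[seq_len]:.3f} GB/1k, "
        f"MoE {moe_rate[seq_len]:.3f} GB/1k"
    )
```

In these measurements the rate stays close to 0.18 GB per 1,000 tokens for the dense model and close to 0.05 GB per 1,000 tokens for the MoE model, which suggests the saving grows roughly linearly with sequence length.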
Contributor
Author
@Mecoli1219 @Tcc0403 Could you please review this PR and approve the pending workflows when you have a chance? The implementation follows the same approach currently used for qwen3, qwen3-moe, and qwen3.5 in the repository. Thank you!
Mecoli1219
approved these changes
Apr 10, 2026
Collaborator
Mecoli1219
left a comment
Overall looks good to me. Thanks for the contribution!
Contributor
Author
@Mecoli1219 Since it already has approval, could this be merged if there are no further concerns?
Collaborator
Yes. Feel free to merge it!
Contributor
Author
If you have permission, could you please merge this PR when you have a moment? Thanks!
Summary
This PR adds `logits_to_keep` and `shift_labels` support for both `Qwen3-VL` and `Qwen3-VL-MoE` in the Liger-patched forward path. The change aligns the patched implementation with the expected Hugging Face interface and enables selective logits materialization for long-context inference.

Testing Done
- `make test`
  - `GRPO`, `fused_neighborhood_attention`, and `gemma3` monkey patch tests
- `make test-convergence`
  - `test/convergence/bf16/test_mini_models_multimodal.py::test_mini_model_multimodal[mini_llama4-...]`
- `make checkstyle`

Known limitation:
- The `make test` / `make test-convergence` cases above do not directly exercise the `Qwen3-VL` or `Qwen3-VL-MoE` `logits_to_keep` / `shift_labels` change in this PR.
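
For readers unfamiliar with the two arguments named in the Summary, here is a minimal, framework-free sketch of the idea. It is not the code in this PR: it uses plain Python lists instead of tensors, and the helper names `select_positions`, `lm_head`, and `make_shift_labels` are hypothetical. It assumes the common Hugging Face convention that an integer `logits_to_keep` keeps the last N positions (with 0 meaning all positions) and that `shift_labels` are labels already moved one step left and padded with an ignore index of -100.

```python
IGNORE_INDEX = -100  # assumed ignore index, following the usual PyTorch default


def select_positions(hidden_states, logits_to_keep=0):
    """Pick the token positions whose logits will be materialized.

    hidden_states: list of per-token hidden vectors for one sequence.
    logits_to_keep: int (keep the last N positions, 0 keeps all) or a
    sequence of explicit position indices.
    """
    if isinstance(logits_to_keep, int):
        # slice(-0, None) is slice(0, None), so 0 selects every position.
        return hidden_states[slice(-logits_to_keep, None)]
    return [hidden_states[i] for i in logits_to_keep]


def lm_head(hidden, weight):
    """Project hidden vectors to vocabulary logits (weight is vocab x hidden)."""
    return [[sum(h * w for h, w in zip(vec, row)) for row in weight] for vec in hidden]


def make_shift_labels(labels, ignore_index=IGNORE_INDEX):
    """Shift labels one step left so position t is scored against token t + 1."""
    return list(labels[1:]) + [ignore_index]


hidden = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # 3 tokens, hidden size 2
weight = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]   # vocabulary size 3

full_logits = lm_head(select_positions(hidden, 0), weight)
keep1_logits = lm_head(select_positions(hidden, 1), weight)
shifted = make_shift_labels([10, 11, 12])

print(len(full_logits), len(keep1_logits), shifted)
```

With `logits_to_keep=1` only the final position is projected through the vocabulary matrix, which is what the Keep1 columns in the benchmark measure against full logits.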