Pull requests: vllm-project/tpu-inference

Pull requests list

Add batched rpa e2e test using qwen coder (label: ready)
#2694 opened May 20, 2026 by kyuyeunk Collaborator
[Jax] add JaxLmHead (label: ready)
#2693 opened May 20, 2026 by lk-chen Collaborator
feat: add threadpool to write profiler batch composition stats asynch… (label: ready)
#2692 opened May 20, 2026 by junyanxu Collaborator
Add tuned block sizes for Mistral Small 4 on v7x
#2691 opened May 20, 2026 by karan Collaborator Draft
Gemma-4 E4B-it on JAX path (label: ready)
#2690 opened May 20, 2026 by QiliangCui2023 Collaborator
2 of 3 tasks
[testing]ds mla tuning buildkite
#2689 opened May 20, 2026 by patrickji2014 Collaborator
Qwen3-VL - Jax
#2685 opened May 20, 2026 by amanseervi Contributor
preallocate blocks
#2684 opened May 20, 2026 by pv97 Collaborator
Re-enable JAX 0.10.0 with MoE layout-constraint fix (label: ready)
#2683 opened May 20, 2026 by QiliangCui2023 Collaborator
3 tasks
QKVParallelLinear: derive tp size from sharding_config rather than parallel_config (label: ready)
#2682 opened May 20, 2026 by lenscloth Collaborator
[CI] Add validation for JSON benchmark cases (label: ready)
#2681 opened May 20, 2026 by meiyeh123 Collaborator Draft
Patrickji.add kernel test to cicd
#2677 opened May 19, 2026 by patrickji2014 Collaborator
Make Kernel Tuner Result Inspector CLI Support Filter and Print as Table Format (label: ready)
#2675 opened May 19, 2026 by patrickji2014 Collaborator
[Qwen3.5] Use onehot+matmul for permute and unpermute when batch size is small. (label: ready)
#2674 opened May 19, 2026 by wyzhang Collaborator
Add mla_kernel_tuner and unittest for kernel tuners (label: ready)
#2672 opened May 19, 2026 by patrickji2014 Collaborator
GDN Decode kernel optimization (label: ready)
#2671 opened May 19, 2026 by helloworld1 Collaborator
Migrate MOE E2E (label: ready)
#2668 opened May 19, 2026 by pv97 Collaborator
Add Deepseek benchmark case
#2667 opened May 19, 2026 by boe20211 Collaborator Draft
Add two daily gpt-oss benchmark cases to test the fix from failure to success. (label: ready)
#2666 opened May 19, 2026 by CienetStingLin Collaborator
[releases/v0.21.0][Gemma4][MoE] Fix accuracy regression under jax 0.10 without reverting jax (label: ready)
#2665 opened May 19, 2026 by QiliangCui2023 Collaborator
1 of 3 tasks
platform: override use_custom_op_collectives() to True
#2664 opened May 19, 2026 by colin2328
[CompressedTensors] Speed-up w4a8 sharding/requantization (label: ready)
#2659 opened May 18, 2026 by jrplatin Collaborator
[DpSched] Make DP_SCHED_BATCH_PREFILL_FLUSH_TIMEOUT_MS consistent with the default value to be 30,000ms (label: ready)
#2656 opened May 18, 2026 by wyzhang Collaborator
Batched RPA - Optimize KV reshape in prepare_inputs (label: ready)
#2653 opened May 18, 2026 by pritha90 Contributor