Pull requests: vllm-project/tpu-inference
Pull requests list
Add batched rpa e2e test using qwen coder
ready
ONLY add when PR is ready to merge/full CI is needed
#2694
opened May 20, 2026 by
kyuyeunk
Collaborator
[Jax] add JaxLmHead
ready
#2693
opened May 20, 2026 by
lk-chen
Collaborator
feat: add threadpool to write profiler batch composition stats asynch…
ready
#2692
opened May 20, 2026 by
junyanxu
Collaborator
Gemma-4 E4B-it on JAX path
ready
#2690
opened May 20, 2026 by
QiliangCui2023
Collaborator
2 of 3 tasks
Re-enable JAX 0.10.0 with MoE layout-constraint fix
ready
#2683
opened May 20, 2026 by
QiliangCui2023
Collaborator
3 tasks
QKVParallelLinear: derive tp size from sharding_config rather than parallel_config
ready
#2682
opened May 20, 2026 by
lenscloth
Collaborator
[CI] Add validation for JSON benchmark cases
ready
Optimize attn DP + MoE EP ReduceScatter using JAX psum_scatter and avoiding any padding
#2679
opened May 20, 2026 by
zhangamy-crypto
Make Kernel Tuner Result Inspector CLI Support Filter and Print as Table Format
ready
#2675
opened May 19, 2026 by
patrickji2014
Collaborator
[Qwen3.5] Use onehot+matmul for permute and unpermute when batch size is small.
ready
#2674
opened May 19, 2026 by
wyzhang
Collaborator
Add mla_kernel_tuner and unittest for kernel tuners
ready
#2672
opened May 19, 2026 by
patrickji2014
Collaborator
GDN Decode kernel optimization
ready
#2671
opened May 19, 2026 by
helloworld1
Collaborator
Migrate MOE E2E
ready
#2668
opened May 19, 2026 by
pv97
Collaborator
Add two daily gpt-oss benchmark cases to test the fix from failure to success.
ready
#2666
opened May 19, 2026 by
CienetStingLin
Collaborator
[releases/v0.21.0][Gemma4][MoE] Fix accuracy regression under jax 0.10 without reverting jax
ready
#2665
opened May 19, 2026 by
QiliangCui2023
Collaborator
1 of 3 tasks
platform: override use_custom_op_collectives() to True
#2664
opened May 19, 2026 by
colin2328
[CompressedTensors] Speed-up w4a8 sharding/requantization
ready
#2659
opened May 18, 2026 by
jrplatin
Collaborator
[DpSched] Make DP_SCHED_BATCH_PREFILL_FLUSH_TIMEOUT_MS consistent with the default value to be 30,000ms
ready
#2656
opened May 18, 2026 by
wyzhang
Collaborator
Batched RPA - Optimize KV reshape in prepare_inputs
ready
#2653
opened May 18, 2026 by
pritha90
Contributor