Pull requests: sgl-project/sglang

Pull requests list

[DSv4]Fix attention CP startup barrier before CUDA graph capture
#25581 opened May 18, 2026 by huangzhilin-hzl Contributor
3 of 5 tasks
fix(mxfp4): route AITER MXFP4+swiglu through FlyDSL gate_mode=INTERLEAVE
#25580 opened May 18, 2026 by bingxche Collaborator Draft
2 of 5 tasks
docs: fix v6e topology documentation Improvements or additions to documentation
#25578 opened May 18, 2026 by zhengkezhou1 Contributor
5 tasks
[Deps] Use cu13 extra for nvidia cutlass dsl dependencies Pull requests that update a dependency file run-ci
#25576 opened May 18, 2026 by mmangkad Contributor
Use triton_attn as default vision attention on B300 (SM103) Multi-modal multi-modal language model run-ci
#25570 opened May 18, 2026 by yhyang201 Collaborator
Add DeepSeekV4 fused MoE Triton autotune support
#25569 opened May 18, 2026 by xieminghe1 Contributor
5 tasks
Fix PCG silently bypassed for EAGLE speculative decoding
#25568 opened May 18, 2026 by narutolhy Contributor
3 of 5 tasks
[LoRA] Fix slot accounting for chunked prefill requests
#25567 opened May 18, 2026 by HuskyLYL
3 of 5 tasks
fix(disagg): unstuck decode aborts under prealloc pressure run-ci
#25561 opened May 18, 2026 by whybeyoung Collaborator
[AMD] TEST - disable preshuffle for nsa path deepseek
#25559 opened May 18, 2026 by yctseng0211 Collaborator Draft
5 tasks
Improve performance for mrope position compute
#25557 opened May 18, 2026 by zhaozx-cn Contributor
5 tasks
[AMD] Fix correctness for AITER MLA backend with --page-size > 1
#25556 opened May 18, 2026 by Duyi-Wang Contributor
[MXFP4 KV Cache] MXFP4 KV Cache: support blockfp4-hadamard-quant quant LLM Quantization
#25555 opened May 18, 2026 by DehuaTang
1 task
[WIP][Gemma4] Triton extend_attention tuning to improve TTFT and TPOT
#25550 opened May 18, 2026 by kpham-sgl Collaborator
5 tasks done
Respect user override for Gemma4 attention backend
#25547 opened May 17, 2026 by kpham-sgl Collaborator
4 tasks done
[Spec] Add trtllm_mha support for Gemma 4 MTP draft attention backend
#25545 opened May 17, 2026 by kpham-sgl Collaborator
5 tasks done
Codec: token-native binary transport for completions streaming
#25544 opened May 17, 2026 by wdunn001
4 tasks
[XPU] Expand Stage B with proven AMD/CUDA tests deepseek
#25543 opened May 17, 2026 by arathi-hlab
4 tasks