Pull requests: sgl-project/sglang
[DSv4]Fix attention CP startup barrier before CUDA graph capture
#25581
opened May 18, 2026 by
huangzhilin-hzl
Contributor
3 of 5 tasks
docs: fix v6e topology
documentation
Improvements or additions to documentation
#25578
opened May 18, 2026 by
zhengkezhou1
Contributor
5 tasks
[Quantization][Bugfix] Preserve 3-D output shape in Quark W4A4 MXFP4 apply_weights
#25577
opened May 18, 2026 by
spandantiwari
5 tasks done
[Deps] Use cu13 extra for nvidia cutlass dsl
dependencies
Pull requests that update a dependency file
run-ci
#25576
opened May 18, 2026 by
mmangkad
Contributor
fix(dsv4 topk_v2): honor cluster contract in fused kernel SMALL/TRIVIAL branches
jit-kernel
#25575
opened May 18, 2026 by
GavinZhu-GMI
Contributor
6 of 8 tasks
[sgl-kernel] Fix MUSA wheel METADATA version to match +musa<suffix> filename
mthreads
#25573
opened May 18, 2026 by
Kangyan-Zhou
Collaborator
•
Draft
[WIP][MIS] Fix scheduler crash when /v1/score and /generate run concurrently
#25572
opened May 18, 2026 by
kevin85421
Collaborator
•
Draft
5 tasks
[Benchmark] Add SGLANG_SIMULATE_UNIFORM_EXPERTS for balanced expert routing with dummy weights
run-ci
#25571
opened May 18, 2026 by
ByronHsu
Collaborator
4 tasks done
Use triton_attn as default vision attention on B300 (SM103)
Multi-modal
multi-modal language model
run-ci
#25570
opened May 18, 2026 by
yhyang201
Collaborator
Add DeepSeekV4 fused MoE Triton autotune support
#25569
opened May 18, 2026 by
xieminghe1
Contributor
5 tasks
Fix PCG silently bypassed for EAGLE speculative decoding
#25568
opened May 18, 2026 by
narutolhy
Contributor
3 of 5 tasks
[LoRA] Fix slot accounting for chunked prefill requests
#25567
opened May 18, 2026 by
HuskyLYL
3 of 5 tasks
[Spec] fold can_run_cuda_graph into EagleVerifyOutput; drop dead extend-after-decode check
#25566
opened May 18, 2026 by
hnyls2002
Collaborator
fix(disagg): unstuck decode aborts under prealloc pressure
run-ci
#25561
opened May 18, 2026 by
whybeyoung
Collaborator
[AMD] TEST - disable preshuffle for nsa path
deepseek
#25559
opened May 18, 2026 by
yctseng0211
Collaborator
•
Draft
5 tasks
Improve performance for mrope position compute
#25557
opened May 18, 2026 by
zhaozx-cn
Contributor
5 tasks
[AMD] Fix correctness for AITER MLA backend with --page-size > 1
#25556
opened May 18, 2026 by
Duyi-Wang
Contributor
[MXFP4 KV Cache] MXFP4 KV Cache: support blockfp4-hadamard-quant
quant
LLM Quantization
#25555
opened May 18, 2026 by
DehuaTang
1 task
[WIP][Gemma4] Triton extend_attention tuning to improve TTFT and TPOT
#25550
opened May 18, 2026 by
kpham-sgl
Collaborator
5 tasks done
Respect user override for Gemma4 attention backend
#25547
opened May 17, 2026 by
kpham-sgl
Collaborator
4 tasks done
[Core][Metrics] Expose scheduler queue pressure by waiting reason
#25546
opened May 17, 2026 by
mukeshbaphna
[Spec] Add trtllm_mha support for Gemma 4 MTP draft attention backend
#25545
opened May 17, 2026 by
kpham-sgl
Collaborator
5 tasks done
Codec: token-native binary transport for completions streaming
#25544
opened May 17, 2026 by
wdunn001
4 tasks
[XPU] Expand Stage B with proven AMD/CUDA tests
deepseek
#25543
opened May 17, 2026 by
arathi-hlab
4 tasks