Pull requests: ml-explore/mlx-lm
Add KV cache quantization to mlx_lm.server (#1043)
#1476
opened Jul 5, 2026 by
katlun-lgtm
2 of 3 tasks
Fix NewlineTokenizer registration with Transformers v5
#1474
opened Jul 5, 2026 by
kime541200
fix(vl): strip model.visual.* in qwen2_vl/qwen3_vl/qwen3_vl_moe sanitize
#1473
opened Jul 4, 2026 by
Jonathangadeaharder
fix: pin transformers<5.13.0 to avoid AutoTokenizer.register breakage
#1471
opened Jul 4, 2026 by
rggammon
Fix MTPHead: correct hidden-state/embedding concat order and use pre-norm hidden
#1469
opened Jul 4, 2026 by
h9q2cyxvgm-ui
Attach Qwen3.6 MTP head for speculative decoding + fix cache-rewind for hybrid recurrent models
#1468
opened Jul 4, 2026 by
h9q2cyxvgm-ui
Fix broadcast crash in quantized SDPA with GQA + batched padding mask (batch >= 2)
#1467
opened Jul 4, 2026 by
pinglin
Fix NewlineTokenizer registration for transformers >= 5.13
#1465
opened Jul 4, 2026 by
chandukona
GLM-5.2: full/shared indexer typing for glm_moe_dsa (DSA schedule + interleaved indexer rope)
#1463
opened Jul 4, 2026 by
machiabeli
Draft
Fix AutoTokenizer.register() for transformers 5.13.0+ compatibility
#1459
opened Jul 3, 2026 by
jonpspri
qwen3_5: load in-checkpoint MTP head + speculative rollback for hybrid (GDN) caches
#1456
opened Jul 3, 2026 by
pierre427
fix(mlx_lm.server): fail fast when --draft-model set with non-trimmable cache
#1455
opened Jul 2, 2026 by
tejkas
DeepSeek-V3.2/GLM DSA: fix silent >128k top-k corruption + sparse-gather prefill
#1454
opened Jul 2, 2026 by
aidiffuser
Fix DSA indexer LoRA-training crash: stop gradients through sparse-attention top-k indices
#1452
opened Jul 2, 2026 by
trevorgordon981
Fix Mistral tool parser dropping parallel/multiple tool calls
#1448
opened Jul 2, 2026 by
DavidObando
Fix dropped tool calls for models with empty tool_call_end (Mistral/Devstral)
#1447
opened Jul 1, 2026 by
DavidObando
Fix frozen PRNG in categorical_sampling under repeated sampling
#1444
opened Jun 30, 2026 by
utkarshtiwari-24
Fix qwen3.5-MoE garbage output: don't double-shift RMSNorm on MTP-retaining checkpoints
#1442
opened Jun 29, 2026 by
embwl0x
Make RotatingKVCache trimmable so prefix cache reuse works for sliding-window models
#1437
opened Jun 26, 2026 by
amirarsalan90
Fix: pythonic tool parser not auto-detected for LFM2.5 models
#1436
opened Jun 25, 2026 by
grumdahl
fix: use FiscalNote/billsum HF dataset path in test_datsets
#1434
opened Jun 25, 2026 by
ttxs69