Pull requests: ikawrakow/ik_llama.cpp
- #1788 fix: only inflate n_batch for GPU-offloaded mmproj, not CPU (opened May 12, 2026 by ubergarm, Contributor)
- #1787 server: reset cache tokens after pp stops halfway (opened May 12, 2026 by firecoperana, Collaborator)
- #1785 fix: Use MMQ for large-batch quantized matmuls on Volta (opened May 12, 2026 by jkyamog, Contributor)
- #1783 Gemma4: support public assistant GGUF schema (opened May 12, 2026 by joelfarthing, Contributor)
- #1782 MTP: reduce per-step qkv checkpoint rows (opened May 11, 2026 by joelfarthing, Contributor)
- #1770 Extend expiring logit bias to other sampling parameters (Draft, opened May 10, 2026 by dungquixote42, Contributor)
- #1764 Slightly expand the usage of VNNI256 (opened May 9, 2026 by XZiar, Contributor)
- #1738 runtime: add --run-time-repack auto mode for swap-bound MoE safety (opened May 4, 2026 by AndrewMoryakov, Contributor)
- #1727 Change signature of llama_set_draft_input_hidden_state (opened May 3, 2026 by ikawrakow, Owner)
- #1654 convert_hf_to_gguf: add Qwen3.5 / Qwen3.6 / Qwen3-Next support (Draft, opened Apr 18, 2026 by markaalonzo, Contributor)
- #1593 Mamba-2 + Nemotron-H MoE backport (Phase 3.x) (opened Apr 6, 2026 by AIdevsmartdata)
- #1531 Graph-based draft token loop for MTP (opened Mar 27, 2026 by SamuelOliveirads, Collaborator)