Pull requests: antirez/llama.cpp-deepseek-v4-flash
Pull requests list
Change k_cache and k_raw to use ggml_view_3d to fix np > 1 launch abort
Labels: model
#10 opened May 13, 2026 by kstjohn1
fix: two x86_64 Linux inference crashes (sched_reserve assert + CUDA concat dispatch)
Labels: ggml, Nvidia GPU, testing
#7 opened May 8, 2026 by randomsamples
fix: support non-F32 quantized types in CUDA concat op
Labels: ggml, Nvidia GPU
#4 opened May 4, 2026 by cdome94