Sync master with upstream release b9066 by jan-service-account · Pull Request #509 · janhq/llama.cpp

jan-service-account · 2026-05-08T01:04:29Z

Updates dev branch with latest release (b9066) from ggml-org/llama.cpp

* add mimo-v2.5 support
* mimo-v2.5: fix modify_tensors row split
* mimo-v2.5: forgot `add_attn_value_scale` plumbing
* mimo-v2.5: fix tp dequant to detect tp rows
* mimo-v2.5: fix TP iteration to be descending
* mimo-v2.5: fix comment
* mimo-v2.5: retain fused qkv
* mimo-v2.5: missed the attn_value scale during merge
* mimo-v2.5: fused QKV needs contiguous for scaling attention value
* mimo-v2.5: move `speech_embeddings.` to TextModel filter_tensors
* Update src/llama-hparams.h
  Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Update src/models/mimo2.cpp
  Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Update src/models/mimo2.cpp
  Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Update convert_hf_to_gguf.py
  Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Update convert_hf_to_gguf.py
  Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Update src/models/mimo2.cpp
  Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* mimo-v2.5: include MTP weights in gguf

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Write a readme on Multi-GPU usage in llama.cpp
* Apply suggestions from code review
  Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
* Address review comments
* Apply suggestions from code review
  Co-authored-by: Johannes Gäßler <johannesg@5d6.de>

---------

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>

…gml-org#22149)

* sycl: add FILL, CUMSUM, DIAG, SOLVE_TRI, SSM_SCAN, GATED_DELTA_NET
  Signed-off-by: Chun Tao <chun.tao@intel.com>
* Fix abort during test-backend-ops
  Signed-off-by: Todd Malsbary <todd.malsbary@intel.com>
* Regenerate ops.md
  Signed-off-by: Todd Malsbary <todd.malsbary@intel.com>
* Add scope_dbg_print to newly added SYCL ops. Also add scope_dbg_print to existing ssm_conv op.
  Signed-off-by: Todd Malsbary <todd.malsbary@intel.com>

---------

Signed-off-by: Chun Tao <chun.tao@intel.com>
Signed-off-by: Todd Malsbary <todd.malsbary@intel.com>
Co-authored-by: Chun Tao <chun.tao@intel.com>
Co-authored-by: Todd Malsbary <todd.malsbary@intel.com>

* webui: add LLM title generation option
* webui: use chat_template_kwargs for title gen + fix conversation check
* webui: capture firstUserMessage before async streamChatCompletion to fix race condition
* webui: extract LLM title generation into separate method
* webui: use constants and ChatService for LLM generated titles
* webui: rebuild static output
* webui: add LLM title generation setting to new settings location
* webui: use sendMessage in generateTitle
* webui: rebuild static output
* webui: fix formatting
* webui: configurable title prompt, remove think tag regexes, fix TS error
* webui: group title constants into TITLE object, use TruncatedText for CSS truncation and fix race condition
* webui: rebuild static output

…org#22651)

* CUDA: batch out_prod inner loop with cublasSgemmStridedBatched
* CUDA: batch out_prod inner loop with cublasSgemmStridedBatched
* CUDA: add cublasSgemmStridedBatched mapping for HIP and MUSA backends

angt and others added 17 commits May 7, 2026 08:24

llama : add missing call to ggml_backend_load_all() (ggml-org#22752)

3980e04

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

sycl : fix test script (ggml-org#22737)

cfff1fc

The error `./examples/sycl/test.sh: line 122: level_zero:${$GGML_SYCL_DEVICE}: bad substitution` was thrown whenever the user ran `./examples/sycl/test.sh -mg 0`. The fix is to remove the extra dollar sign.
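The failure can be reproduced in isolation. The sketch below is not the contents of `examples/sycl/test.sh`; it only assumes that `GGML_SYCL_DEVICE` holds a device index such as `0` and that the script builds a `level_zero:<index>` string from it, as the error message suggests. The names `broken_status` and `selector` are hypothetical and exist only for this illustration.

```shell
#!/usr/bin/env bash
# Assumed value: a device index like the one passed with -mg 0.
GGML_SYCL_DEVICE=0

# Broken form: "${$GGML_SYCL_DEVICE}" is not a valid parameter expansion.
# It runs in a child bash so the error does not abort this script.
if bash -c 'GGML_SYCL_DEVICE=0; echo "level_zero:${$GGML_SYCL_DEVICE}"' 2>/dev/null; then
    broken_status="ok"
else
    broken_status="bad substitution"
fi

# Fixed form: a single dollar sign inside the braces expands the variable.
selector="level_zero:${GGML_SYCL_DEVICE}"

echo "broken form: ${broken_status}"
echo "fixed form:  ${selector}"
```

Inside `${...}` the shell expects a parameter name, so the second `$` makes the expansion invalid instead of adding a level of indirection.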

webui: fix flicker issue on dismiss animation on overlay primitives (ggml-org#22773)

e358d75

* add fill-mode-forwards
* generated diffs

codeowners : add ZenDNN backend codeowner (ggml-org#22772)

97f06e9

* codeowners : add ZenDNN backend codeowner
* codeowners : fix zendnn owners to use individual github handles

webui: fix ?model= URL param race in router mode (ggml-org#22771)

f4b5a2e

* webui: fix ?model= URL param race in router mode
* chore: update webui build output

mtmd: fix whisper audio tail truncation by exposing padded buffer to FFT (ggml-org#22770)

cc97e45

ggml-cpu: Optimized risc-v cpu q1_0 dot

68380ae

llama : remove unnecessary seq_id check during state restore (ggml-org#22797)

803627f

tests: add long-sequence cases and fix inputs for gated_delta_net (ggml-org#22794)

deab41e

* tests : add long-seq + tail cases for gated_delta_net
* tests : realistic input ranges for gated_delta_net

common/chat : preserve media markers for typed-content templates (ggml-org#22634)

093be62

opencl: add opfilter regex for debugging (ggml-org#22782)

ceb7e14

llama : fix device state save/load (ggml-org#22805)

e43431b

jan-service-account merged commit 3bfd8b3 into dev May 8, 2026
9 checks passed

jan-service-account deleted the update-dev-from-master-2026-05-08-01-04 branch May 8, 2026 01:33

jan-service-account merged 17 commits into dev from update-dev-from-master-2026-05-08-01-04
