Skip to content

[pull] master from ggml-org:master#122

Merged
pull[bot] merged 3 commits into
CrazyForks:masterfrom
ggml-org:master
Jun 2, 2026
Merged

[pull] master from ggml-org:master#122
pull[bot] merged 3 commits into
CrazyForks:masterfrom
ggml-org:master

Conversation

@pull
Copy link
Copy Markdown

@pull pull Bot commented Jun 2, 2026

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

ggerganov and others added 3 commits June 2, 2026 13:17
Reduce the number of parallel jobs in server-self-hosted.yml by stacking
test configurations as sequential steps within a single job, following the
pattern from #23927.

- server-metal: 4 matrix jobs -> 1 job with 4 sequential test steps
- server-cuda: 2 matrix jobs -> 1 job with 2 sequential test steps
- server-kleidiai: removed unnecessary single-entry matrix
- removed unused Setup Node.js step from server-metal

Total: 7 parallel jobs -> 3 parallel jobs

Assisted-by: llama.cpp:local pi
* common : fix state save in common_prompt_batch_decode

This commit addresses a bug in common_prompt_batch_decode that affects
the session state store/restore in completion.cpp and
save-load-state.cpp.

The motivation for this is that currently the code is saving n-1 tokens
in both the session_tokens and in the KV cache. Then when loading the
session tokens, and if the prompt matches, it would replay the last
saved token (n-1) into the next position, effectively replaying the
same token in the wrong position.

The fix is to store all n tokens in session_tokens, while the memory
state only reflects n-1 processed tokens as the saving happens before
the last token is decoded in common_prompt_batch_decode.

I ran both completion.cpp and save-load-state.cpp with a transformer, a
recurrent, and a hybrid model.

Resolves: #23400

Co-authored-by: fairydreaming <166155368+fairydreaming@users.noreply.github.com>
@pull pull Bot locked and limited conversation to collaborators Jun 2, 2026
@pull pull Bot added the ⤵️ pull label Jun 2, 2026
@pull pull Bot merged commit 0b71540 into CrazyForks:master Jun 2, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants