Commit 982240c

Merge pull request #203 from bernardladenthin/claude/charming-gauss-9l007
Upgrade llama.cpp from b9437 to b9442
2 parents 42bb74f + ab1811e commit 982240c

4 files changed

Lines changed: 7 additions & 3 deletions

CLAUDE.md

Lines changed: 1 addition & 1 deletion
@@ -6,7 +6,7 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co

 Java bindings for [llama.cpp](https://github.com/ggerganov/llama.cpp) via JNI, providing a high-level API for LLM inference in Java. The Java layer communicates with a native C++ library through JNI.

-Current llama.cpp pinned version: **b9437**
+Current llama.cpp pinned version: **b9442**

 ## Upgrading CUDA Version

CMakeLists.txt

Lines changed: 1 addition & 1 deletion
@@ -114,7 +114,7 @@ set(LLAMA_BUILD_APP OFF CACHE BOOL "" FORCE)
 FetchContent_Declare(
 llama.cpp
 GIT_REPOSITORY https://github.com/ggerganov/llama.cpp.git
-GIT_TAG b9437
+GIT_TAG b9442
 )
 FetchContent_MakeAvailable(llama.cpp)

README.md

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@
 [![Lincheck](https://img.shields.io/badge/tested%20with-Lincheck-7F52FF)](https://github.com/JetBrains/lincheck)
 [![vmlens](https://img.shields.io/badge/tested%20with-vmlens-ff6f00)](https://vmlens.com)
 [![JMH](https://img.shields.io/badge/benchmarked%20with-JMH-25A162)](https://openjdk.org/projects/code-tools/jmh/)
-[![llama.cpp b9437](https://img.shields.io/badge/llama.cpp-%23b9437-informational)](https://github.com/ggml-org/llama.cpp/releases/tag/b9437)
+[![llama.cpp b9442](https://img.shields.io/badge/llama.cpp-%23b9442-informational)](https://github.com/ggml-org/llama.cpp/releases/tag/b9442)
 [![Publish](https://github.com/bernardladenthin/java-llama.cpp/actions/workflows/publish.yml/badge.svg)](https://github.com/bernardladenthin/java-llama.cpp/actions/workflows/publish.yml)
 [![CodeQL](https://github.com/bernardladenthin/java-llama.cpp/actions/workflows/codeql.yml/badge.svg)](https://github.com/bernardladenthin/java-llama.cpp/actions/workflows/codeql.yml)

docs/history/llama-cpp-breaking-changes.md

Lines changed: 4 additions & 0 deletions
@@ -279,3 +279,7 @@ Used during `llama.cpp` version bumps: when upgrading, scan this file from the r
 | ~b9354–b9437 | `vendor/cpp-httplib/` | Bumped to v0.46.0: adds `Client::set_no_proxy(std::vector<std::string>)` with full hostname-suffix and IPv4/IPv6 CIDR matching; `Server::ThreadPool` constructor is exception-safe (already in v0.45.0); `Client::set_proxy()` now disconnects the held socket immediately so a later proxy change cannot reuse the old TLS session. Compiled automatically, no project changes required |
 | ~b9354–b9437 | `common/arg.cpp` (additive flags) | New `--spec-draft-backend-sampling` / `--no-spec-draft-backend-sampling` (env `LLAMA_ARG_SPEC_DRAFT_BACKEND_SAMPLING`) and `--skip-download` (mapped to `common_params::skip_download`). Both default-on / default-off in a way that preserves current Java behaviour. Consider exposing as `ModelParameters.setSpecDraftBackendSampling(boolean)` and `setSkipDownload(boolean)` in a follow-up — tracked under Open TODOs |
 | ~b9354–b9437 | `ggml/src/ggml-cuda/common.cuh` | `GGML_CUDA_USE_PDL` gating tightened: for MSVC, now requires CTK ≥ 12.3 (was 11.8) due to a compiler bug in the older Windows CUDA toolchains. Project's only CUDA build is Linux (dockcross, CUDA 13.2) so the MSVC gate has no CI impact; Windows CI builds CPU-only |
+| ~b9437–b9442 | `src/llama-vocab.{h,cpp}` + `src/llama-arch.{h,cpp}` | New `LLAMA_VOCAB_PRE_TYPE_WHITESPACE = 53` and `llm_tokenizer_whitespace_session` (used by jina-v2-base-zh embeddings); new "whitespace" tokenizer_model routed as `LLAMA_VOCAB_TYPE_BPE`; new `LLM_KV_TOKENIZER_NORMALIZER_LOWERCASE` key (`tokenizer.ggml.normalizer.lowercase`) read into `llama_vocab::impl::normalizer_lowercase`; new public accessor `llama_vocab::get_normalizer_lowercase()`. All additive — existing tokenizers untouched; new whitespace + lowercase normalizer is consumed automatically when loading a GGUF that sets these vocabulary keys, no project source or Java API changes required |
+| ~b9437–b9442 | `src/llama.cpp` | `llama_prepare_model_devices()` iGPU collection now appends only the FIRST `GGML_BACKEND_DEVICE_TYPE_IGPU` device (prevents duplicate iGPU registration on multi-iGPU hosts). Behavioural fix, single-line caller in `jllama.cpp` unchanged, no project source changes required |
+| ~b9437–b9442 | `tools/ui/embed.cpp` + `tools/ui/src/...` (Svelte) | Webasset embedder tightened printf format specifiers (`%lu` &#x2192; `%zu` and `PRIx64`); UI settings split `custom` into `customJson` + `customCss`; runtime CSS injection via `<svelte:head>`. Project does not ship the upstream UI, no impact |
+| ~b9437–b9442 | `gguf-py/`, `conversion/` (Python) | New `_set_vocab_whitespace()` helper and `add_normalizer_lowercase()` GGUF writer for the new whitespace tokenizer + lowercase normalizer keys (mirrors the vocab additions above); jina-v2 Roberta-tokenizer path now branches to whitespace when `tokenizer.json` declares a `Whitespace` pre-tokenizer. Python-side only, no impact on the Java/JNI build |
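
Reviewer note: this commit moves the same tag in three places (`CLAUDE.md`, `CMakeLists.txt`, and the `README.md` badge), so a missed file would leave the pins inconsistent. The sketch below shows one way such a check could look. It is not part of this repository: the regular expressions, `extract_pins`, and `pins_agree` are hypothetical names written against the three changed lines shown in this diff, and it takes file contents as strings rather than reading from disk.

```python
import re

# Hypothetical patterns, one per file touched by the version bump.
# Each is modelled on the changed line shown in this commit's diff.
PIN_PATTERNS = {
    "CLAUDE.md": re.compile(r"pinned version: \*\*(b\d+)\*\*"),
    "CMakeLists.txt": re.compile(r"GIT_TAG\s+(b\d+)"),
    "README.md": re.compile(r"llama\.cpp/releases/tag/(b\d+)"),
}


def extract_pins(contents):
    """Map each file name to the set of tags its pattern finds in the text."""
    return {
        name: set(PIN_PATTERNS[name].findall(text))
        for name, text in contents.items()
    }


def pins_agree(contents):
    """True when every file pins exactly one tag and all files pin the same one."""
    pins = extract_pins(contents)
    all_tags = set().union(*pins.values())
    return all(len(tags) == 1 for tags in pins.values()) and len(all_tags) == 1


# The three post-merge lines from this commit, used as sample input.
sample = {
    "CLAUDE.md": "Current llama.cpp pinned version: **b9442**",
    "CMakeLists.txt": "GIT_TAG b9442",
    "README.md": "(https://github.com/ggml-org/llama.cpp/releases/tag/b9442)",
}

print(pins_agree(sample))  # prints True
```

Swapping any one sample entry back to `b9437` makes `pins_agree` return `False`, which is the case a CI step built on this idea would be meant to flag.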
