Commit 4185529

Merge pull request #216 from bernardladenthin/claude/amazing-noether-p7THl
Upgrade llama.cpp from b9549 to b9553
2 parents 89bf946 + 483bf83 commit 4185529

5 files changed

Lines changed: 66 additions & 3 deletions

CLAUDE.md

Lines changed: 9 additions & 1 deletion
@@ -6,7 +6,7 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
 
 Java bindings for [llama.cpp](https://github.com/ggerganov/llama.cpp) via JNI, providing a high-level API for LLM inference in Java. The Java layer communicates with a native C++ library through JNI.
 
-Current llama.cpp pinned version: **b9549**
+Current llama.cpp pinned version: **b9553**
 
 ## Upgrading CUDA Version
 
@@ -701,6 +701,14 @@ See [`../workspace/policies/jqwik-prompt-injection.md`](../workspace/policies/jq
 
 See [`../workspace/policies/lombok-config.md`](../workspace/policies/lombok-config.md).
 
+## JPMS Module Descriptor
+
+This repo ships a `module-info.java` compiled in a separate `release 9` execution. Javadoc
+currently runs in **classpath mode** (javadoc `<source>` is `1.8`), which is the *only* thing
+keeping it clear of the JPMS module-mode javadoc trap that bit BAF. **Before raising the Java /
+javadoc source level to ≥ 9, read**
+[`../workspace/policies/jpms-module-descriptor.md`](../workspace/policies/jpms-module-descriptor.md).
+
 ## Open TODOs
 
 Open TODOs for this repo live in [`TODO.md`](TODO.md). Cross-repo status

CMakeLists.txt

Lines changed: 1 addition & 1 deletion
@@ -114,7 +114,7 @@ set(LLAMA_BUILD_APP OFF CACHE BOOL "" FORCE)
 FetchContent_Declare(
     llama.cpp
     GIT_REPOSITORY https://github.com/ggerganov/llama.cpp.git
-    GIT_TAG b9549
+    GIT_TAG b9553
 )
 FetchContent_MakeAvailable(llama.cpp)
 

README.md

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 **Build:**
 ![Java 8+](https://img.shields.io/badge/Java-8%2B-informational)
 ![Platform](https://img.shields.io/badge/Platform-Linux%20%7C%20macOS%20%7C%20Windows%20%7C%20Android-lightgrey)
-[![llama.cpp b9549](https://img.shields.io/badge/llama.cpp-%23b9549-informational)](https://github.com/ggml-org/llama.cpp/releases/tag/b9549)
+[![llama.cpp b9553](https://img.shields.io/badge/llama.cpp-%23b9553-informational)](https://github.com/ggml-org/llama.cpp/releases/tag/b9553)
 [![JPMS](https://img.shields.io/badge/JPMS-modular%20JAR-25A162)](https://openjdk.org/projects/jigsaw/)
 ![JUnit](https://img.shields.io/badge/tested%20with-JUnit6-25A162)
 [![JSpecify](https://img.shields.io/badge/JSpecify-1.0.0%20%40NullMarked-25A162)](https://jspecify.dev)

docs/history/llama-cpp-breaking-changes.md

Lines changed: 4 additions & 0 deletions
@@ -321,3 +321,7 @@ Used during `llama.cpp` version bumps: when upgrading, scan this file from the r
 | ~b9543–b9549 | `.github/workflows/docker.yml` (upstream CI) | Upstream's `cuda13` Docker image bumped from CUDA `13.1.1` to `13.3.0`. Upstream's own CI only; this project ships its own `publish.yml` and pins CUDA 13.2 via `.github/build_cuda_linux.sh` (see CLAUDE.md "Upgrading CUDA Version"). No impact |
 | ~b9543–b9549 | project `CMakeLists.txt` (pre-existing latent bug, fixed in this bump) | **Not an upstream change** &mdash; surfaced while build-testing this bump locally. The OS/arch detection block invoked `net.ladenthin.llama.OSInfo`, but the class had moved to `net.ladenthin.llama.loader.OSInfo` in the earlier layered-package restructure, so `cmake -B build` failed with "Could not determine OS name" on any host that does not pass `-DOS_NAME`/`-DOS_ARCH` explicitly (CI does, which is why it went unnoticed). Fixed both `execute_process` invocations (`--os` and `--arch`) to the `loader.OSInfo` FQN. Same stale-FQN-after-restructure class as the earlier `spotbugs-exclude.xml` / PIT-`targetClasses` repairs &mdash; the standing reminder to re-validate every FQN-bearing config after a package move now also covers `CMakeLists.txt` |
 | ~b9543–b9549 | upstream build / verification | Local build with `GIT_TAG b9549` verified clean on Linux x86_64: `cmake -B build -DBUILD_TESTING=ON` configures cleanly (after the `loader.OSInfo` FQN fix above), `cmake --build build --config Release -j$(nproc)` links `libjllama.so` + `jllama_test` with zero warnings on any project translation unit (incl. the changed `server-context.cpp`), and `ctest --test-dir build --output-on-failure` reports 435/435 tests passing. All upstream breaking changes in this range are absorbed inside upstream-compiled translation units; no project C++ source edits were required for the version bump itself |
+| ~b9549&ndash;b9553 | `common/sampling.h` + `common/sampling.cpp` + `common/arg.cpp` + `common/common.cpp` + `tools/server/server-task.cpp` | `common_sampler_types_from_names()` **dropped its `bool allow_alt_names` parameter** &mdash; the signature is now `common_sampler_types_from_names(const std::vector<std::string> & names)`. The body was rewritten to (a) auto-generate kebab-case (`top-k`) and no-dash (`topk`) aliases from the canonical snake_case names, plus misc aliases (`nucleus`&#x2192;top_p, `temp`&#x2192;temperature, `typ`&#x2192;typical_p), and (b) lowercase the input so matching is **case-insensitive**; aliases are now *always* accepted (the old gate is gone). All three call sites were updated upstream (`arg.cpp` / `common.cpp` dropped the `, true` arg; `server-task.cpp` dropped the `, false` arg). **Project impact: none at the source level** &mdash; `grep -rn common_sampler_types_from_names src/main/cpp src/test/cpp` returns zero matches; the symbol is reached only through the upstream-compiled `server-task.cpp` linked into `jllama`. **New behaviour exposed for free:** because `server-task.cpp` previously passed `allow_alt_names=false`, the project's `InferenceParameters` `samplers` JSON array only matched canonical names like `top_k`; it now also accepts `top-k` / `topk` / `nucleus` / `temp` / `typ` and is case-insensitive (`TOP_K`, `Min-P`). Pinned by 5 new `ParamsFromJsonCmpl.Samplers_*` tests in `test_server.cpp` |
+| ~b9549&ndash;b9553 | `src/llama-kv-cache.cpp` + `src/llama-kv-cache.h` + `src/llama-kv-cells.h` | KV-cache shared-cells refactor (continues `TAG_KV_CACHE_SHARE_CELLS`, used by the Gemma4-assistant MTP head): the `v_cells` member changed from a by-value `std::vector<llama_kv_cells>` to a `std::shared_ptr<llama_kv_cells_vec> v_cells_impl` plus a `llama_kv_cells_vec & v_cells` reference, so a target cache now *views* the source cache's cells instead of copying them in `apply_ubatch()`; the constructor also clamps `kv_size` down to the shared source's size. New type alias `using llama_kv_cells_vec = std::vector<llama_kv_cells>;` in `llama-kv-cells.h`. All internal `src/` headers the JNI build does **not** include (the project pulls public `llama.h` / `llama-cpp.h`, never `llama-kv-cache.h` / `llama-kv-cells.h`) &mdash; verified via `grep -rn "llama_kv_cells\|llama-kv-cache" src/main/cpp src/test/cpp` &#x2192; zero matches. No project source changes required |
+| ~b9549&ndash;b9553 | `conversion/mistral.py` + `convert_hf_to_gguf.py` | Python conversion-script robustness only: `hparams["llama_4_scaling"]` and `"moe" in hparams` replaced with `hparams.get(...)` / `is not None` guards so a present-but-null key no longer crashes conversion. Python tooling, not part of the JNI build. No impact |
+| ~b9549&ndash;b9553 | upstream build / verification | Local build with `GIT_TAG b9553` verified clean on Linux x86_64: `cmake -B build -DBUILD_TESTING=ON` configures cleanly, `cmake --build build --config Release -j$(nproc)` links `libjllama.so` + `jllama_test` with zero warnings on any project translation unit, and `ctest --test-dir build --output-on-failure` reports **440/440 tests passing** (435 prior + 5 new `Samplers_*` tests). The sole breaking change in this range (the `common_sampler_types_from_names` signature) is absorbed inside upstream-compiled translation units; no project C++ source edits were required for the version bump itself |
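
Reviewer note: the alias matching described in the `common_sampler_types_from_names` row above can be sketched roughly as follows. This is a minimal illustration, not upstream's code: the `SamplerType` enum and the functions `build_alias_table` and `sampler_types_from_names` are hypothetical stand-ins, and only the behaviour stated in the changelog row and the new tests (generated kebab-case and no-dash aliases, the three misc aliases, lowercased input, unknown names skipped) is modelled.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for upstream's sampler type enum.
enum class SamplerType { TopK, TopP, MinP, TypicalP, Temperature };

// Build a lookup table from the canonical snake_case names, adding a
// kebab-case alias ("top-k") and a no-dash alias ("topk") for each one.
static std::map<std::string, SamplerType> build_alias_table() {
    const std::vector<std::pair<std::string, SamplerType>> canonical = {
        {"top_k", SamplerType::TopK},
        {"top_p", SamplerType::TopP},
        {"min_p", SamplerType::MinP},
        {"typical_p", SamplerType::TypicalP},
        {"temperature", SamplerType::Temperature},
    };

    std::map<std::string, SamplerType> table;
    for (const auto & entry : canonical) {
        table[entry.first] = entry.second;

        std::string kebab = entry.first;
        std::replace(kebab.begin(), kebab.end(), '_', '-');
        table[kebab] = entry.second;

        std::string nodash = entry.first;
        nodash.erase(std::remove(nodash.begin(), nodash.end(), '_'), nodash.end());
        table[nodash] = entry.second;
    }

    // Misc aliases named in the changelog row.
    table["nucleus"] = SamplerType::TopP;
    table["temp"]    = SamplerType::Temperature;
    table["typ"]     = SamplerType::TypicalP;
    return table;
}

// Resolve names case-insensitively; unknown names are skipped rather than
// treated as an error (the real function is described as warning first).
std::vector<SamplerType> sampler_types_from_names(const std::vector<std::string> & names) {
    static const std::map<std::string, SamplerType> table = build_alias_table();

    std::vector<SamplerType> out;
    for (std::string name : names) {
        std::transform(name.begin(), name.end(), name.begin(),
                       [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
        const auto it = table.find(name);
        if (it != table.end()) {
            out.push_back(it->second);
        }
    }
    return out;
}
```

Under these assumptions, `sampler_types_from_names({"TOP_K", "Min-P", "nucleus"})` resolves all three entries, while an unrecognised name is simply dropped from the result, which mirrors what the `Samplers_*` tests in `test_server.cpp` pin.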

src/test/cpp/test_server.cpp

Lines changed: 51 additions & 0 deletions
@@ -1681,6 +1681,57 @@ TEST(ParamsFromJsonCmpl, NCmpl_AliasedFromN) {
     EXPECT_EQ(p.n_cmpl, 1);
 }
 
+// ============================================================
+// params_from_json_cmpl — "samplers" name matching (llama.cpp b9553)
+// common_sampler_types_from_names dropped its allow_alt_names flag:
+// the server path (params_from_json_cmpl) now ALWAYS accepts aliases and
+// is case-insensitive. Before b9553 the server passed allow_alt_names=false,
+// so only the canonical snake_case names matched and "top-k" / "TOP_K" were
+// skipped. These tests pin the more lenient behaviour the project's
+// "samplers" JSON field now exposes for free.
+// ============================================================
+
+TEST(ParamsFromJsonCmpl, Samplers_CanonicalNames_Parsed) {
+    const auto p = parse_params({{"samplers", {"top_k", "top_p", "min_p", "temperature"}}});
+    ASSERT_EQ(p.sampling.samplers.size(), 4u);
+    EXPECT_EQ(p.sampling.samplers[0], COMMON_SAMPLER_TYPE_TOP_K);
+    EXPECT_EQ(p.sampling.samplers[1], COMMON_SAMPLER_TYPE_TOP_P);
+    EXPECT_EQ(p.sampling.samplers[2], COMMON_SAMPLER_TYPE_MIN_P);
+    EXPECT_EQ(p.sampling.samplers[3], COMMON_SAMPLER_TYPE_TEMPERATURE);
+}
+
+TEST(ParamsFromJsonCmpl, Samplers_KebabCaseAlias_NowAccepted) {
+    // "top-k" / "min-p" alt names were rejected by the server before b9553.
+    const auto p = parse_params({{"samplers", {"top-k", "min-p"}}});
+    ASSERT_EQ(p.sampling.samplers.size(), 2u);
+    EXPECT_EQ(p.sampling.samplers[0], COMMON_SAMPLER_TYPE_TOP_K);
+    EXPECT_EQ(p.sampling.samplers[1], COMMON_SAMPLER_TYPE_MIN_P);
+}
+
+TEST(ParamsFromJsonCmpl, Samplers_CaseInsensitive) {
+    const auto p = parse_params({{"samplers", {"TOP_K", "Temperature", "Min-P"}}});
+    ASSERT_EQ(p.sampling.samplers.size(), 3u);
+    EXPECT_EQ(p.sampling.samplers[0], COMMON_SAMPLER_TYPE_TOP_K);
+    EXPECT_EQ(p.sampling.samplers[1], COMMON_SAMPLER_TYPE_TEMPERATURE);
+    EXPECT_EQ(p.sampling.samplers[2], COMMON_SAMPLER_TYPE_MIN_P);
+}
+
+TEST(ParamsFromJsonCmpl, Samplers_MiscAliases_Parsed) {
+    // "nucleus" -> top_p, "temp" -> temperature, "typ" -> typical_p
+    const auto p = parse_params({{"samplers", {"nucleus", "temp", "typ"}}});
+    ASSERT_EQ(p.sampling.samplers.size(), 3u);
+    EXPECT_EQ(p.sampling.samplers[0], COMMON_SAMPLER_TYPE_TOP_P);
+    EXPECT_EQ(p.sampling.samplers[1], COMMON_SAMPLER_TYPE_TEMPERATURE);
+    EXPECT_EQ(p.sampling.samplers[2], COMMON_SAMPLER_TYPE_TYPICAL_P);
+}
+
+TEST(ParamsFromJsonCmpl, Samplers_UnknownName_SkippedNotError) {
+    // unknown names are warned and skipped, not a hard error.
+    const auto p = parse_params({{"samplers", {"top_k", "definitely_not_a_sampler"}}});
+    ASSERT_EQ(p.sampling.samplers.size(), 1u);
+    EXPECT_EQ(p.sampling.samplers[0], COMMON_SAMPLER_TYPE_TOP_K);
+}
+
 // ============================================================
 // params_from_json_cmpl — reasoning_budget_tokens
 // reasoning_budget_tokens defaults to -1 (disabled).