Commit d2a8aa8

Upgrade llama.cpp from b9151 to b9172
Also adds LLAMA_BUILD_WEBUI=OFF before FetchContent to prevent the new build-time WebUI asset download introduced in b9172 from running during CI/local builds. No JNI-level API changes were required. https://claude.ai/code/session_01DVizDEtXBVDaXciEoo9a8v
1 parent f867cb8 commit d2a8aa8
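
The ordering described in the commit message matters because CMake's `option()` leaves an existing cache entry untouched. The following is a minimal sketch of that mechanism only. It assumes the dependency declares `LLAMA_BUILD_WEBUI` through `option()`; the `ON` default, the help string, and the project name `option_order_sketch` are hypothetical stand-ins rather than upstream's actual declaration.

```cmake
cmake_minimum_required(VERSION 3.14)
project(option_order_sketch LANGUAGES NONE)

# Parent project: pin the value in the cache before the dependency is configured.
set(LLAMA_BUILD_WEBUI OFF CACHE BOOL "" FORCE)

# Stand-in for a declaration the dependency might make later
# (hypothetical default and help string, not copied from upstream).
option(LLAMA_BUILD_WEBUI "Build the WebUI assets" ON)

# option() does nothing when the cache entry already exists,
# so the value pinned by the parent is still in effect here.
message(STATUS "LLAMA_BUILD_WEBUI=${LLAMA_BUILD_WEBUI}")
```

Configuring this sketch with `cmake -S . -B build` should report `LLAMA_BUILD_WEBUI=OFF`. If the `set(...)` line were moved below the `option()` call, the `FORCE` would still overwrite the cache entry, but any dependency logic that ran in between would already have seen the default.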

3 files changed

Lines changed: 10 additions & 3 deletions

CLAUDE.md

Lines changed: 7 additions & 1 deletion
@@ -6,7 +6,7 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
 
 Java bindings for [llama.cpp](https://github.com/ggerganov/llama.cpp) via JNI, providing a high-level API for LLM inference in Java. The Java layer communicates with a native C++ library through JNI.
 
-Current llama.cpp pinned version: **b9151**
+Current llama.cpp pinned version: **b9172**
 
 ## Upgrading CUDA Version
 
@@ -275,6 +275,12 @@ Also review the project `CMakeLists.txt` for build-system-level breaks (e.g. ren
 | ~b9150–b9151 | `tools/server/server-common.h` | New `SLT_TRC` and `SRV_TRC` macros (emit at `LOG_TRC` level); additive, no project changes required |
 | ~b9150–b9151 | `tools/server/server-context.cpp` | New `server_slot::t_print_last` field + `print_timings_tg()` / `print_timings_pp()` methods: emit periodic in-flight token-generation and prompt-processing throughput to `SLT_INF` (throttled to ≥100 decoded tokens and ≥3 s interval); `server_context_impl` constructor now calls `mtmd_helper_log_set` unconditionally (was guarded by `!is_resume`); many `SLT_INF`/`SRV_WRN` downgraded to `SLT_TRC`/`SRV_INF`; compiled from upstream, no project JNI changes required |
 | ~b9150–b9151 | `tools/server/server-task.cpp` | Several `SRV_WRN` calls downgraded to `SRV_INF`; one `SRV_WRN` upgraded to `SRV_ERR` for failed state restore; compiled from upstream, no project changes required |
+| ~b9151–b9172 | `tools/mtmd/clip.h` | `clip_has_whisper_encoder()` removed from public API; not referenced by project — no changes required |
+| ~b9151–b9172 | `tools/server/CMakeLists.txt` + `scripts/webui-download.cmake` (new) | WebUI assets no longer committed (`tools/server/public/` gitignored); provisioned at build time via HF bucket (`LLAMA_USE_PREBUILT_WEBUI=ON` default) or built from source (`LLAMA_BUILD_WEBUI`); project sets `LLAMA_BUILD_WEBUI=OFF CACHE BOOL "" FORCE` before FetchContent to skip asset download |
+| ~b9151–b9172 | `common/common.h` | `common_params::webui` default made conditional on `LLAMA_WEBUI_DEFAULT_ENABLED` macro (falls back to `true` when undefined); compiled server sources unaffected |
+| ~b9151–b9172 | `common/reasoning-budget.cpp` | `common_reasoning_budget_clone` rewritten to use `llama_sampler_init` properly; pure bug fix, no API change, no project changes required |
+| ~b9151–b9172 | `ggml/src/ggml-cuda/fattn-mma-f16.cuh` + `mma.cuh` | AMD RDNA3 WMMA flash attention support; new `DATA_LAYOUT_I_MAJOR_SCRAMBLED`, `tile<16,16,half2,I_MAJOR_SCRAMBLED>`, extended config tables; internal CUDA backend, no project changes required |
+| ~b9151–b9172 | `tools/server/server-chat.cpp` | Non-function Responses API tools now silently skipped (`continue`) instead of throwing; server behavior fix, no Java API change required |
 
 ## Build Commands
 
CMakeLists.txt

Lines changed: 2 additions & 1 deletion
@@ -104,10 +104,11 @@ endif()
 set(GGML_FMA ON CACHE BOOL "" FORCE)
 set(GGML_F16C ON CACHE BOOL "" FORCE)
 set(GGML_AVX512 OFF CACHE BOOL "" FORCE)
+set(LLAMA_BUILD_WEBUI OFF CACHE BOOL "" FORCE)
 FetchContent_Declare(
     llama.cpp
     GIT_REPOSITORY https://github.com/ggerganov/llama.cpp.git
-    GIT_TAG b9151
+    GIT_TAG b9172
 )
 FetchContent_MakeAvailable(llama.cpp)
 
README.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,5 @@
11
![Java 8+](https://img.shields.io/badge/Java-8%2B-informational)
2-
[![llama.cpp b9151](https://img.shields.io/badge/llama.cpp-%23b9151-informational)](https://github.com/ggml-org/llama.cpp/releases/tag/b9151)
2+
[![llama.cpp b9172](https://img.shields.io/badge/llama.cpp-%23b9172-informational)](https://github.com/ggml-org/llama.cpp/releases/tag/b9172)
33
[![Maven Central](https://img.shields.io/maven-central/v/net.ladenthin/llama)](https://central.sonatype.com/artifact/net.ladenthin/llama)
44
[![Snapshot](https://img.shields.io/badge/snapshot-latest-informational)](https://central.sonatype.com/repository/maven-snapshots/net/ladenthin/llama/)
55
