Commit aa33468

Upgrade llama.cpp from b9150 to b9151
Key changes in b9151:
- New LOG_TRC macro (trace level between INFO=3 and DEBUG=5)
- New common_params_print_info() consolidates build/device/system info logging; replace the two-line LOG_INF pattern in jllama.cpp with a single call
- common_init() now defaults log prefix and timestamps to true (opt-out via --no-log-prefix / --no-log-timestamps)
- New SLT_TRC / SRV_TRC server macros; many verbose server messages demoted from INF to TRC (less noise at default verbosity)
- server_slot gains periodic in-flight throughput printing (print_timings_tg/pp)

All 417 C++ unit tests pass.

https://claude.ai/code/session_01FFt37e3FpbpFbT7oaPSbLB
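The effect of the new trace level on log noise can be sketched as follows. This is a minimal illustration, not llama.cpp code: the enum and `should_emit` are hypothetical names, and the filtering rule (a message is emitted when its level is at or below the active threshold) is an assumption. Only the numeric values INFO=3, TRACE=4, DEBUG=5 come from the commit message.

```cpp
#include <cassert>

// Hypothetical names for illustration; these are not llama.cpp identifiers.
// The numeric values mirror the ones quoted in the commit message.
enum LogLevel : int {
    LEVEL_INFO  = 3,
    LEVEL_TRACE = 4,
    LEVEL_DEBUG = 5,
};

// Assumed filtering rule: emit a message only when its level does not
// exceed the active verbosity threshold.
constexpr bool should_emit(LogLevel msg_level, int threshold) {
    return static_cast<int>(msg_level) <= threshold;
}

// Small self-check of the ordering described above.
inline void self_check() {
    assert(should_emit(LEVEL_INFO, 3));    // INFO is visible at threshold 3
    assert(!should_emit(LEVEL_TRACE, 3));  // demoted TRC messages are hidden
    assert(should_emit(LEVEL_TRACE, 4));   // raising the threshold shows them
    assert(!should_emit(LEVEL_DEBUG, 4));  // DEBUG still needs threshold 5
}
```

Under that assumption, a message demoted from INF to TRC disappears at a threshold of 3 and reappears once the threshold is raised to 4, without also enabling DEBUG output.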
1 parent f74fa9f commit aa33468

4 files changed

Lines changed: 11 additions & 5 deletions

CLAUDE.md

Lines changed: 8 additions & 1 deletion
@@ -6,7 +6,7 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
 
 Java bindings for [llama.cpp](https://github.com/ggerganov/llama.cpp) via JNI, providing a high-level API for LLM inference in Java. The Java layer communicates with a native C++ library through JNI.
 
-Current llama.cpp pinned version: **b9150**
+Current llama.cpp pinned version: **b9151**
 
 ## Upgrading CUDA Version
 
@@ -268,6 +268,13 @@ Also review the project `CMakeLists.txt` for build-system-level breaks (e.g. ren
 | ~b9145–b9150 | `ggml/src/ggml-vulkan/ggml-vulkan.cpp` | Bug fix: `mul_mat_l_int[i]` / `mul_mat_m_int[i]` / `mul_mat_s_int[i]` / `mul_mat_id_l_int[i]` / `mul_mat_id_m_int[i]` / `mul_mat_id_s_int[i]` were unconditionally set to `true` instead of mirroring the actual device pipeline capabilities from `mul_mat_l[i]` etc.; now properly initialized; internal Vulkan backend bug fix, no project changes required |
 | ~b9145–b9150 | `src/unicode.cpp` | New `unicode_regex_split_custom_qwen35()` function registered for the Qwen 3.5 tokenizer regex pattern; uses `[\p{L}\p{M}]+` letter-plus-combining-mark runs vs. Qwen2's `\p{L}+`; additive internal tokenizer change, no project changes required |
 | ~b9145–b9150 | `ggml/src/ggml-cpu/ggml-cpu-riscv64-spacemit/` | SpaceMIT RISC-V IME backend major refactor: IME2 kernels, expanded quantization (Q2_K, Q3_K, Q6_K, Q8_0, Q5_0, Q5_1, Q5_K, MXFP4), TCM (Tightly Coupled Memory) pool; new source files `ime2_kernels.cpp`, `ime_env.cpp`, `repack.cpp`, `rvv_kernels.cpp`, `spine_mem_pool.cpp`; guarded by `GGML_CPU_RISCV64_SPACEMIT` build flag; no project changes required |
+| ~b9150–b9151 | `common/log.h` | New `LOG_TRC` macro added at `LOG_LEVEL_TRACE = 4` (between INFO=3 and DEBUG=5); `LOG_LEVEL_DEBUG` bumped from 4 to 5; new `LOG_TRCV` verbosity variant; additive, no project changes required |
+| ~b9150–b9151 | `common/common.h` + `common/common.cpp` | New `common_params_print_info(const common_params &)` function: prints verbosity level, per-device memory (name, total, free), and system info at `LOG_INF` level; replaces the two-line pattern `LOG_INF("build_info: %s\n", llama_build_info()); LOG_INF("%s\n", common_params_get_system_info(params).c_str());`; updated in `jllama.cpp` |
+| ~b9150–b9151 | `common/common.cpp` | `common_init()` now unconditionally calls `common_log_set_prefix(…, true)` and `common_log_set_timestamps(…, true)` before setting the log callback; log output will always include prefix and timestamps unless explicitly disabled with `--no-log-prefix` / `--no-log-timestamps` |
+| ~b9150–b9151 | `common/arg.cpp` | `--log-prefix` and `--log-timestamps` now also accept negated forms `--no-log-prefix` / `--no-log-timestamps` (lambda receives a `bool value`); backing env vars renamed `LLAMA_LOG_PREFIX` → `LLAMA_ARG_LOG_PREFIX` and `LLAMA_LOG_TIMESTAMPS` → `LLAMA_ARG_LOG_TIMESTAMPS`; Java layer does not expose these, so no project changes required |
+| ~b9150–b9151 | `tools/server/server-common.h` | New `SLT_TRC` and `SRV_TRC` macros (emit at `LOG_TRC` level); additive, no project changes required |
+| ~b9150–b9151 | `tools/server/server-context.cpp` | New `server_slot::t_print_last` field + `print_timings_tg()` / `print_timings_pp()` methods: emit periodic in-flight token-generation and prompt-processing throughput to `SLT_INF` (throttled to ≥100 decoded tokens and ≥3 s interval); `server_context_impl` constructor now calls `mtmd_helper_log_set` unconditionally (was guarded by `!is_resume`); many `SLT_INF`/`SRV_WRN` downgraded to `SLT_TRC`/`SRV_INF`; compiled from upstream, no project JNI changes required |
+| ~b9150–b9151 | `tools/server/server-task.cpp` | Several `SRV_WRN` calls downgraded to `SRV_INF`; one `SRV_WRN` upgraded to `SRV_ERR` for failed state restore; compiled from upstream, no project changes required |
 
 ## Build Commands
 
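The throttling noted for `print_timings_tg()` / `print_timings_pp()` (≥100 decoded tokens and ≥3 s interval) could look roughly like the sketch below. This is not the upstream implementation: `ThroughputThrottle` and its members are hypothetical, and it assumes both conditions must hold, measured since the last print.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper for illustration; not part of llama.cpp.
struct ThroughputThrottle {
    int64_t t_last_us = 0;  // timestamp of the last print, in microseconds
    int     n_last    = 0;  // decoded-token count at the last print

    // Returns true, and records the new baseline, only when at least
    // 100 tokens were decoded AND at least 3 seconds elapsed since the
    // previous print (assumed reading of the changelog entry).
    bool should_print(int64_t now_us, int n_decoded) {
        const bool enough_tokens = (n_decoded - n_last) >= 100;
        const bool enough_time   = (now_us - t_last_us) >= 3000000;
        if (!(enough_tokens && enough_time)) {
            return false;
        }
        t_last_us = now_us;
        n_last    = n_decoded;
        return true;
    }
};

// Small self-check walking through a few calls.
inline void throttle_self_check() {
    ThroughputThrottle t;
    assert(!t.should_print(1000000, 500));  // enough tokens, only 1 s elapsed
    assert(!t.should_print(4000000, 50));   // enough time, only 50 tokens
    assert(t.should_print(4000000, 150));   // both thresholds met
    assert(!t.should_print(5000000, 400));  // only 1 s since the last print
}
```

Requiring both thresholds keeps short or slow generations from printing at all, while long generations print at most once every three seconds.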
CMakeLists.txt

Lines changed: 1 addition & 1 deletion
@@ -107,7 +107,7 @@ set(GGML_AVX512 OFF CACHE BOOL "" FORCE)
 FetchContent_Declare(
 llama.cpp
 GIT_REPOSITORY https://github.com/ggerganov/llama.cpp.git
-GIT_TAG b9150
+GIT_TAG b9151
 )
 FetchContent_MakeAvailable(llama.cpp)
 
README.md

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 ![Java 8+](https://img.shields.io/badge/Java-8%2B-informational)
-[![llama.cpp b9150](https://img.shields.io/badge/llama.cpp-%23b9150-informational)](https://github.com/ggml-org/llama.cpp/releases/tag/b9150)
+[![llama.cpp b9151](https://img.shields.io/badge/llama.cpp-%23b9151-informational)](https://github.com/ggml-org/llama.cpp/releases/tag/b9151)
 [![Maven Central](https://img.shields.io/maven-central/v/net.ladenthin/llama)](https://central.sonatype.com/artifact/net.ladenthin/llama)
 [![Snapshot](https://img.shields.io/badge/snapshot-latest-informational)](https://central.sonatype.com/repository/maven-snapshots/net/ladenthin/llama/)
 
src/main/cpp/jllama.cpp

Lines changed: 1 addition & 2 deletions
@@ -666,8 +666,7 @@ JNIEXPORT void JNICALL Java_net_ladenthin_llama_LlamaModel_loadModel(JNIEnv *env
 
 llama_numa_init(params.numa);
 
-LOG_INF("build_info: %s\n", llama_build_info());
-LOG_INF("%s\n", common_params_get_system_info(params).c_str());
+common_params_print_info(params);
 
 // Resolve the auto sentinel before loading the model.
 if (params.n_parallel <= N_PARALLEL_AUTO) {
