Commit 62274ca

Merge pull request #125 from bernardladenthin/claude/update-b9106-compatibility-SPPw1
Upgrade llama.cpp to b9106
2 parents 8c75c30 + 5c94572 commit 62274ca

3 files changed

Lines changed: 6 additions & 3 deletions

CLAUDE.md

Lines changed: 4 additions & 1 deletion
@@ -6,7 +6,7 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co

 Java bindings for [llama.cpp](https://github.com/ggerganov/llama.cpp) via JNI, providing a high-level API for LLM inference in Java. The Java layer communicates with a native C++ library through JNI.

-Current llama.cpp pinned version: **b9103**
+Current llama.cpp pinned version: **b9106**

 ## Upgrading CUDA Version

@@ -250,6 +250,9 @@ Also review the project `CMakeLists.txt` for build-system-level breaks (e.g. ren
 | ~b9094–b9102 | `src/llama-model.cpp` | `ggml/src/ggml-virtgpu/ggml-backend-device.cpp` gains `#include <mutex>` for `std::once_flag`; internal backend fix, no project changes required |
 | ~b9094–b9102 | `vendor/cpp-httplib/httplib.cpp` + `httplib.h` | Security fix: chunk-size parsing replaced `strtoul` with manual hex-digit scanning to prevent overflow and reject invalid chunk extensions; version bumped to 0.43.4; compiled automatically, no project changes required |
 | ~b9102–b9103 | `vendor/cpp-httplib/httplib.cpp` + `httplib.h` | cpp-httplib bumped to v0.44.0: (1) RFC 9110 §5.5 compliance — header field values are no longer percent-decoded by the recipient in `parse_header`; `Location`/`Referer` special-casing removed; callers that need URI-component decoding must call `decode_uri_component()` explicitly; (2) `ThreadPool` constructor is now exception-safe — if thread creation fails partway through, already-started workers are signalled to exit and joined before rethrowing, preventing `std::terminate` from joinable threads in the destructor; compiled automatically, no project changes required |
+| ~b9103–b9106 | `ggml/src/ggml-vulkan/ggml-vulkan.cpp` + Vulkan shaders | Vulkan flash attention refactored: `pipeline_flash_attn_f32_f16` changed from a per-type array of maps to a single map; mixed K/V quant types (e.g. Q4_0 K + F16 V) now supported on all Vulkan FA paths (scalar, cm1, cm2) rather than coopmat2 only; per-type SPIR-V variants replaced by two generic modules (`flash_attn_f32_f16` and `flash_attn_f32_f16_int8`) that select K/V type at runtime via `FaTypeK`/`FaTypeV` spec constants; new `flash_attn_dequant.glsl` contains aliased SSBO views and an uber `dequantize4()` switch; the K/V type mismatch guard removed from `ggml_backend_vk_device_supports_op`; internal Vulkan backend refactor, no project changes required |
+| ~b9103–b9106 | `ggml/src/ggml-cuda/argsort.cu` | Added `#include <cuda/iterator>` for CCCL ≥ 3.1 strided-iterator path; internal CUDA backend, no project changes required |
+| ~b9103–b9106 | `convert_hf_to_gguf.py` | Mistral Medium 3.5 mmproj support: `n_embd_text` now reads `"dim"` key instead of `"hidden_dim"`; negative `img_break_tok_id` placeholders resolved from `tekken.json` or `tokenizer.json`; conversion tool only, no project changes required |

 ## Build Commands
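
The b9094–b9102 cpp-httplib row in the table above mentions replacing `strtoul` with manual hex-digit scanning for chunk sizes. As a rough illustration of that idea only, and not cpp-httplib's actual code, the Python sketch below scans one digit at a time, refuses values above a chosen limit, and accepts only a `;`-introduced extension after the digits. The name `parse_chunk_size` and the `max_size` parameter are hypothetical.

```python
HEX_DIGITS = "0123456789abcdefABCDEF"


def parse_chunk_size(line: str, max_size: int = 2**64 - 1) -> int:
    """Parse the hex size at the start of an HTTP/1.1 chunk header line.

    Hypothetical helper for illustration; it does not mirror cpp-httplib.
    """
    size = 0
    i = 0
    while i < len(line) and line[i] in HEX_DIGITS:
        digit = int(line[i], 16)
        # Check the bound before multiplying so size never exceeds max_size.
        if size > (max_size - digit) // 16:
            raise ValueError("chunk size overflow")
        size = size * 16 + digit
        i += 1
    if i == 0:
        raise ValueError("missing chunk size")
    # After the digits, allow only end of line or a ';'-introduced extension.
    rest = line[i:].strip()
    if rest and not rest.startswith(";"):
        raise ValueError("invalid chunk extension")
    return size
```

For example, `parse_chunk_size("1a")` yields 26, while an input such as `"12xyz"` raises `ValueError` because the trailing text is not a `;` extension.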

CMakeLists.txt

Lines changed: 1 addition & 1 deletion
@@ -97,7 +97,7 @@ set(GGML_AVX512 OFF CACHE BOOL "" FORCE)
 FetchContent_Declare(
 llama.cpp
 GIT_REPOSITORY https://github.com/ggerganov/llama.cpp.git
-GIT_TAG b9103
+GIT_TAG b9106
 )
 FetchContent_MakeAvailable(llama.cpp)

README.md

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 ![Java 8+](https://img.shields.io/badge/Java-8%2B-informational)
-[![llama.cpp b9103](https://img.shields.io/badge/llama.cpp-%23b9103-informational)](https://github.com/ggml-org/llama.cpp/releases/tag/b9103)
+[![llama.cpp b9106](https://img.shields.io/badge/llama.cpp-%23b9106-informational)](https://github.com/ggml-org/llama.cpp/releases/tag/b9106)
 [![Snapshot](https://img.shields.io/badge/snapshot-latest-informational)](https://github.com/bernardladenthin/java-llama.cpp/releases/tag/snapshot)

 # Java Bindings for [llama.cpp](https://github.com/ggerganov/llama.cpp)
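
This commit changes the same tag in three places: the pinned-version line in `CLAUDE.md`, `GIT_TAG` in `CMakeLists.txt`, and the badge link in `README.md`. A small check along the following lines could catch a bump that misses one of them. It is a sketch under assumptions: the helper names `pinned_tags` and `is_consistent` are hypothetical, and the regular expressions assume the exact line shapes shown in this diff.

```python
import re


def pinned_tags(claude_md: str, cmake_lists: str, readme_md: str) -> dict:
    """Extract the llama.cpp tag from the text of each of the three files."""
    sources = {
        "CLAUDE.md": (claude_md, r"pinned version: \*\*(b\d+)\*\*"),
        "CMakeLists.txt": (cmake_lists, r"GIT_TAG\s+(b\d+)"),
        "README.md": (readme_md, r"releases/tag/(b\d+)"),
    }
    found = {}
    for name, (text, pattern) in sources.items():
        match = re.search(pattern, text)
        # None marks a file where the expected line shape was not found.
        found[name] = match.group(1) if match else None
    return found


def is_consistent(tags: dict) -> bool:
    """True when every file yielded a tag and all tags are identical."""
    values = set(tags.values())
    return len(values) == 1 and None not in values
```

In practice the three arguments would be the file contents read from the repository root; passing strings keeps the sketch independent of any particular checkout layout.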
