Commit 91d9799

Merge pull request #239 from bernardladenthin/claude/intelligent-cray-9tfnxv
Upgrade llama.cpp to b9682 and improve CI test diagnostics
2 parents 55bc182 + e0ee1f6 commit 91d9799

6 files changed

Lines changed: 21 additions & 12 deletions

.github/workflows/publish.yml

Lines changed: 5 additions & 5 deletions
@@ -422,7 +422,7 @@ jobs:
 echo "${{ github.workspace }}/core.%e.%p" | sudo tee /proc/sys/kernel/core_pattern
 - name: Run tests
 run: |
-mvn --no-transfer-progress -P jcstress test \
+mvn -e --no-transfer-progress -P jcstress test \
 -Dnet.ladenthin.llama.nomic.path=models/${NOMIC_EMBED_MODEL_NAME} \
 -Dnet.ladenthin.llama.vision.model=models/${VISION_MODEL_NAME} \
 -Dnet.ladenthin.llama.vision.mmproj=models/${VISION_MMPROJ_NAME} \
@@ -539,7 +539,7 @@ jobs:
 run: ulimit -c unlimited
 - name: Run tests
 run: |
-mvn --no-transfer-progress -Dnet.ladenthin.llama.test.ngl=0 test \
+mvn -e --no-transfer-progress -Dnet.ladenthin.llama.test.ngl=0 test \
 -Dnet.ladenthin.llama.vision.model=models/${VISION_MODEL_NAME} \
 -Dnet.ladenthin.llama.vision.mmproj=models/${VISION_MMPROJ_NAME} \
 -Dnet.ladenthin.llama.vision.image=${VISION_IMAGE_PATH}
@@ -603,7 +603,7 @@ jobs:
 run: ulimit -c unlimited
 - name: Run tests
 run: |
-mvn --no-transfer-progress test \
+mvn -e --no-transfer-progress test \
 -Dnet.ladenthin.llama.vision.model=models/${VISION_MODEL_NAME} \
 -Dnet.ladenthin.llama.vision.mmproj=models/${VISION_MMPROJ_NAME} \
 -Dnet.ladenthin.llama.vision.image=${VISION_IMAGE_PATH}
@@ -667,7 +667,7 @@ jobs:
 run: ulimit -c unlimited
 - name: Run tests
 run: |
-mvn --no-transfer-progress test \
+mvn -e --no-transfer-progress test \
 -Dnet.ladenthin.llama.vision.model=models/${VISION_MODEL_NAME} \
 -Dnet.ladenthin.llama.vision.mmproj=models/${VISION_MMPROJ_NAME} \
 -Dnet.ladenthin.llama.vision.image=${VISION_IMAGE_PATH}
@@ -750,7 +750,7 @@ jobs:
 Get-ItemProperty -Path $key | Format-List
 - name: Run tests
 run: |
-mvn --no-transfer-progress test `
+mvn -e --no-transfer-progress test `
 "-Dnet.ladenthin.llama.vision.model=models/$env:VISION_MODEL_NAME" `
 "-Dnet.ladenthin.llama.vision.mmproj=models/$env:VISION_MMPROJ_NAME" `
 "-Dnet.ladenthin.llama.vision.image=$env:VISION_IMAGE_PATH"

CLAUDE.md

Lines changed: 6 additions & 2 deletions
@@ -6,7 +6,7 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co

 Java bindings for [llama.cpp](https://github.com/ggerganov/llama.cpp) via JNI, providing a high-level API for LLM inference in Java. The Java layer communicates with a native C++ library through JNI.

-Current llama.cpp pinned version: **b9642**
+Current llama.cpp pinned version: **b9682**

 ## Upgrading CUDA Version

@@ -590,7 +590,7 @@ ctest --test-dir build --output-on-failure -R "ResultsToJson"

 #### Upstream source location (in CMake build tree)

-llama.cpp is fetched via CMake FetchContent, pinned to `GIT_TAG b9642`.
+llama.cpp is fetched via CMake FetchContent, pinned to `GIT_TAG b9682`.

 ```
 build/_deps/llama.cpp-src/tools/server/ ← server-task.h, server-common.h, etc.
@@ -763,6 +763,10 @@ See [`../workspace/policies/jqwik-prompt-injection.md`](../workspace/policies/jq

 See [`../workspace/policies/lombok-config.md`](../workspace/policies/lombok-config.md).

+## CI Test Diagnostics
+
+See [`../workspace/policies/ci-test-diagnostics.md`](../workspace/policies/ci-test-diagnostics.md).
+
 ## JPMS Module Descriptor

 This repo ships a `module-info.java` compiled in a separate `release 9` execution. Javadoc

CMakeLists.txt

Lines changed: 1 addition & 1 deletion
@@ -139,7 +139,7 @@ set(LLAMA_BUILD_APP OFF CACHE BOOL "" FORCE)
 FetchContent_Declare(
 llama.cpp
 GIT_REPOSITORY https://github.com/ggerganov/llama.cpp.git
-GIT_TAG b9642
+GIT_TAG b9682
 )
 FetchContent_MakeAvailable(llama.cpp)

README.md

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 **Build:**
 ![Java 8+](https://img.shields.io/badge/Java-8%2B-informational)
 ![Platform](https://img.shields.io/badge/Platform-Linux%20%7C%20macOS%20%7C%20Windows%20%7C%20Android-lightgrey)
-[![llama.cpp b9642](https://img.shields.io/badge/llama.cpp-%23b9642-informational)](https://github.com/ggml-org/llama.cpp/releases/tag/b9642)
+[![llama.cpp b9682](https://img.shields.io/badge/llama.cpp-%23b9682-informational)](https://github.com/ggml-org/llama.cpp/releases/tag/b9682)
 [![JPMS](https://img.shields.io/badge/JPMS-modular%20JAR-25A162)](https://openjdk.org/projects/jigsaw/)
 ![JUnit](https://img.shields.io/badge/tested%20with-JUnit6-25A162)
 [![JSpecify](https://img.shields.io/badge/JSpecify-1.0.0%20%40NullMarked-25A162)](https://jspecify.dev)
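
Note on the version bump: the tag change from b9642 to b9682 is repeated in CMakeLists.txt (`GIT_TAG`), CLAUDE.md (two pinned-version lines), and the README badge. A minimal sketch of a consistency check follows, assuming tags always take the `b` plus digits form seen in this commit. `PinnedTagCheck` and its methods are hypothetical names used for illustration and are not part of the repository.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not in the repository): checks that text snippets agree on one llama.cpp tag. */
final class PinnedTagCheck {
    // Assumption: a tag is "b" followed by four or more digits, delimited by word boundaries.
    private static final Pattern TAG = Pattern.compile("\\bb(\\d{4,})\\b");

    /** Collects every distinct tag found across the given snippets, in first-seen order. */
    static Set<String> distinctTags(String... snippets) {
        Set<String> tags = new LinkedHashSet<>();
        for (String snippet : snippets) {
            Matcher m = TAG.matcher(snippet);
            while (m.find()) {
                tags.add(m.group());
            }
        }
        return tags;
    }

    /** True only when exactly one distinct tag appears across all snippets. */
    static boolean isConsistent(String... snippets) {
        return distinctTags(snippets).size() == 1;
    }

    public static void main(String[] args) {
        // Lines copied from the three files touched by this commit.
        String cmake = "GIT_TAG b9682";
        String claude = "Current llama.cpp pinned version: **b9682**";
        String readme = "https://github.com/ggml-org/llama.cpp/releases/tag/b9682";
        System.out.println(isConsistent(cmake, claude, readme)); // prints "true"
    }
}
```

In practice the snippets would be read from the files themselves; the sketch takes strings so that it stays self-contained.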

docs/history/llama-cpp-breaking-changes.md

Lines changed: 5 additions & 0 deletions
@@ -356,3 +356,8 @@ Used during `llama.cpp` version bumps: when upgrading, scan this file from the r
 | b9637–b9642 | `ggml/src/ggml-webgpu/wgsl-shaders/mul_mat_decls.tmpl` | WebGPU matmul shared-memory dequant templates rewritten: legacy/k-quant `#elif` chains converted to independent `#if defined(...)` blocks, and the i-quant (super-block 256) IQ1/IQ2/IQ3/IQ4 paths reworked to process `NQ` quants per thread with vectorized `store_shmem_iquants`/`create_iq_gw4` helpers. Internal WebGPU backend — the project builds CPU/CUDA/Metal/OpenCL only, never WebGPU. No project changes required |
 | b9637–b9642 | `tools/ui/`, `tools/ui/src/lib/utils/heic-to-jpeg.ts` (new) | WebUI gains a "render thinking as Markdown" display setting and client-side HEIC/HEIF image upload support (lazy CDN-loaded `heic-to` decoder → JPEG). The project compiles `server-context/queue/task/models` but not `tools/ui`, so the WebUI is absent from `jllama`. No project changes required |
 | b9637–b9642 | `convert_lora_to_gguf.py`, `tests/test-backend-ops.cpp` | LoRA converter now resolves the base-model architecture via `get_model_architecture(hparams, ModelType.TEXT)` instead of hand-reading `text_config`/`architectures`; a `GGML_TYPE_BF16` `test_repeat` case was added to the backend-ops test. Python tooling and an upstream test — neither is compiled into `jllama`. No project changes required |
+| b9642–b9682 | `tools/mtmd/mtmd-helper.h` + `tools/mtmd/mtmd-helper.cpp` | `mtmd_helper_decode_image_chunk` gained two parameters — a post-decode callback plus its `user_data` — so callers can hook each decoded multimodal chunk; the standalone `process_chunk` helper was removed and folded into `mtmd_helper_eval_chunk_single`. Consumed only inside the upstream-compiled `mtmd-helper.cpp` / `server-context.cpp`; the project's hand-written C++ references no `mtmd_*`/`process_chunk` symbol (zero matches in `src/main/cpp`). No project source changes required. **New feature:** the post-decode callback enables multimodal speculative-draft decoding — exposable later as a vision + draft-model Java path |
+| b9642–b9682 | `common/common.cpp` (`build_lora_mm_id`) | The LoRA multimodal id-embedding builder gained a `w_s` scale-weight argument for per-adapter scaling. Internal to the upstream-compiled `common` library; the project never calls it. No project source changes required |
+| b9642–b9682 | `common/speculative.{h,cpp}` | Speculative decoding now accumulates per-draft-position acceptance statistics and adds an Eagle3 backend-sampling path (the draft model samples on the compute backend). `common_speculative_*` is compiled into `common` and reached only through the upstream server's speculative slot; the project's C++ references no `speculative`/`draft` symbol. No project source changes required. **New feature:** per-position draft-acceptance metrics — could surface as speculative-decoding telemetry in a future Java API |
+| b9642–b9682 | `tools/server/server-context.cpp` | Server slot refactored so an `mtmd` (multimodal) prompt can feed a speculative draft model: image/media chunks are routed through the new `mtmd_helper_decode_image_chunk` callback before drafting. Compiled directly into `jllama` (the project builds `server-context/queue/task/models`), but the change is internal to the slot state machine and binds no new/renamed symbol; verified that `jllama.cpp` and the `*_helpers.hpp` headers call none of the touched functions. No project source changes required |
+| b9642–b9682 | `ggml/src/ggml-*` backends, `tools/` (incl. `llama-bench --offline`), conda-forge packaging, `docs/`, `.github/` | Routine backend kernel updates and tooling/docs/CI tweaks (a new `llama-bench --offline` flag, conda-forge recipe notes). None are compiled into `jllama` beyond the already-built CPU/CUDA/Metal/OpenCL backends, and none change a symbol the project binds. No project changes required |
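
Note on the `common/speculative.{h,cpp}` row: it mentions per-draft-position acceptance statistics that could surface as telemetry in a future Java API. The sketch below shows one way such counters might be accumulated on the Java side. Everything in it is an assumption: `DraftAcceptanceStats`, `record`, and `acceptanceRate` are hypothetical names, and the accounting rule (the target model accepts a prefix of the drafted tokens) is the usual speculative-decoding convention, not a description of the upstream implementation.

```java
/** Hypothetical telemetry holder (not an existing jllama or llama.cpp API). */
final class DraftAcceptanceStats {
    private final long[] proposed; // how often a draft token was proposed at each position
    private final long[] accepted; // how often the token at each position was accepted

    DraftAcceptanceStats(int maxDraftLen) {
        if (maxDraftLen <= 0) {
            throw new IllegalArgumentException("maxDraftLen must be positive");
        }
        this.proposed = new long[maxDraftLen];
        this.accepted = new long[maxDraftLen];
    }

    /**
     * Records one speculative step: draftLen tokens were drafted and the
     * first acceptedLen of them were accepted by the target model.
     */
    void record(int draftLen, int acceptedLen) {
        if (draftLen < 0 || draftLen > proposed.length
                || acceptedLen < 0 || acceptedLen > draftLen) {
            throw new IllegalArgumentException("invalid draft/accepted lengths");
        }
        for (int i = 0; i < draftLen; i++) {
            proposed[i]++;
            if (i < acceptedLen) {
                accepted[i]++;
            }
        }
    }

    /** Fraction of proposals accepted at the given position, or 0.0 if none were proposed. */
    double acceptanceRate(int position) {
        long n = proposed[position];
        return n == 0 ? 0.0 : (double) accepted[position] / n;
    }

    public static void main(String[] args) {
        DraftAcceptanceStats stats = new DraftAcceptanceStats(4);
        stats.record(4, 2); // positions 0 and 1 accepted
        stats.record(4, 4); // all four positions accepted
        System.out.println(stats.acceptanceRate(0)); // prints "1.0"
        System.out.println(stats.acceptanceRate(3)); // prints "0.5"
    }
}
```

Later draft positions typically show lower rates because a rejection at position i discards every position after it.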

pom.xml

Lines changed: 3 additions & 3 deletions
@@ -80,7 +80,7 @@ SPDX-License-Identifier: MIT
 <spotbugs.version>4.10.2.0</spotbugs.version>
 <fb-contrib.version>7.7.4</fb-contrib.version>
 <findsecbugs.version>1.14.0</findsecbugs.version>
-<spotless.version>3.6.0</spotless.version>
+<spotless.version>3.7.0</spotless.version>
 <palantir-java-format.version>2.92.0</palantir-java-format.version>
 <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
 <project.build.outputTimestamp>${git.commit.time}</project.build.outputTimestamp>
@@ -330,7 +330,7 @@ SPDX-License-Identifier: MIT
 <plugin>
 <groupId>org.sonatype.central</groupId>
 <artifactId>central-publishing-maven-plugin</artifactId>
-<version>0.10.0</version>
+<version>0.11.0</version>
 </plugin>
 </plugins>
 </pluginManagement>
@@ -587,7 +587,7 @@ SPDX-License-Identifier: MIT
 <groupId>org.apache.maven.plugins</groupId>
 <artifactId>maven-surefire-plugin</artifactId>
 <configuration>
-<argLine>@{argLine} -XX:ErrorFile=hs_err_pid%p.log -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=.</argLine>
+<argLine>@{argLine} -Xmx2g -XX:ErrorFile=hs_err_pid%p.log -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=.</argLine>
 <!--
 Capture each test class's stdout/stderr into
 target/surefire-reports/<class>-output.txt. When a native crash
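
Note on the `argLine` change: the added `-Xmx2g` sets the maximum Java heap of the JVM that Surefire forks for tests to 2 GiB. It does not limit native memory allocated through JNI. A small probe, assuming it runs inside such a forked JVM, can report the effective cap. `HeapCapProbe` is a hypothetical helper that does not exist in the repository.

```java
/** Hypothetical probe (not in the repository): reports the JVM's maximum heap size. */
final class HeapCapProbe {
    /** Maximum heap the JVM will attempt to use, in MiB, as reported by the runtime. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Under -Xmx2g this is expected to be about 2048; some collectors
        // report a slightly smaller value, so no exact number is asserted here.
        System.out.println("max heap (MiB): " + maxHeapMiB());
    }
}
```

Running it once with and once without `-Xmx2g` shows whether the flag reached the test JVM.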
