Upgrade llama.cpp to b9682 and improve CI test diagnostics#239

Merged: bernardladenthin merged 3 commits into main from claude/intelligent-cray-9tfnxv on Jun 17, 2026
@bernardladenthin

Owner

Summary

  • Upgrade llama.cpp from b9642 to b9682, incorporating multimodal speculative-draft decoding infrastructure and routine backend updates
  • Add -e flag to all Maven test invocations in CI workflows for enhanced error diagnostics
  • Increase the JVM heap limit to 2 GB (-Xmx2g) for Surefire tests to prevent OOM failures
  • Update Spotless to 3.7.0 and central-publishing-maven-plugin to 0.11.0
  • Document the CI test diagnostics policy and the llama.cpp breaking changes for the b9642–b9682 range

Test plan

  • Affected unit / integration tests pass locally
  • CI is green on this branch
  • Docs / CHANGELOG updated where applicable

Related issues / PRs

None

Checklist

  • I have read CONTRIBUTING.md and CODE_OF_CONDUCT.md
  • My commits follow Conventional Commits
  • No security-sensitive changes (if there are, I have notified the maintainer privately per SECURITY.md)

Details

llama.cpp upgrade (b9642 → b9682):
The upstream range adds multimodal speculative-draft decoding capabilities: post-decode callbacks in mtmd_helper_decode_image_chunk, per-position draft-acceptance statistics in speculative decoding, and Eagle3 backend sampling. These are internal to the upstream-compiled libraries and the server slot state machine, so no changes to jllama's C++ bindings or Java API are required. All changes are documented in docs/history/llama-cpp-breaking-changes.md.

CI improvements:

  • Added -e flag to Maven invocations across all test jobs (jcstress, Linux, macOS, Windows) to capture full error output and stack traces, improving diagnostics for test failures
  • Set the heap limit in the Surefire argLine to -Xmx2g (previously the JVM default) to prevent out-of-memory errors during test execution
  • Added a reference to the new CI test diagnostics policy in CLAUDE.md
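
As a rough illustration of the -e change, the sketch below wraps a Maven test run so that -e is always passed. The function name run_java_tests and the MVN override variable are hypothetical and are not taken from the repository's workflows; only the -e flag itself comes from this PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the repository): run the Maven test phase
# with -e so build errors are reported with full stack traces.
# MVN may be overridden, e.g. MVN=./mvnw; it defaults to mvn on PATH.
run_java_tests() {
  local mvn_cmd="${MVN:-mvn}"
  # Extra arguments (profiles, -D properties) are passed through unchanged.
  "$mvn_cmd" -e "$@" test
}
```

A call such as `run_java_tests -Dtest=SomeTest` would then expand to `mvn -e -Dtest=SomeTest test`; the test name here is a placeholder.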

Dependency updates:

  • Spotless 3.6.0 → 3.7.0
  • central-publishing-maven-plugin 0.10.0 → 0.11.0

https://claude.ai/code/session_01JBzF5wtCjRu5t4FMzphryM

claude added 3 commits June 17, 2026 12:42
…stics sync)

Insert -Xmx2g into the surefire argLine (repo already had the -XX crash/heap-dump flags and memory before/after CI steps); add -e to the Java test mvn invocations. Implements workspace/policies/ci-test-diagnostics.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01JBzF5wtCjRu5t4FMzphryM
Clean version bump: no project C++ source changes required. The upstream
changes in this range are all internal to upstream-compiled translation
units, and src/main/cpp references none of the touched symbols (grep for
mtmd/speculative/draft/process_chunk/build_lora_mm_id returns zero matches):

- process_chunk removed, folded into mtmd_helper_eval_chunk_single
- mtmd_helper_decode_image_chunk gained a post-decode callback + user_data
- build_lora_mm_id gained a w_s scale-weight argument
- speculative decoding: per-position acceptance stats + Eagle3 backend sampling
- server-context refactor lets an mtmd prompt feed a speculative draft model

Verified: mvn compile (JNI headers) and cmake configure against b9682 both
succeed. Documented in docs/history/llama-cpp-breaking-changes.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01JBzF5wtCjRu5t4FMzphryM
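
The grep check described in the commit message above could be reproduced roughly as follows. This is a sketch under assumptions: the function name is hypothetical, and the exact grep options used by the author are not stated in the commit. The pattern list and the src/main/cpp directory are taken from the message.

```shell
#!/usr/bin/env bash
# Hypothetical helper: search a source directory for any of the upstream
# symbols touched in the b9642..b9682 range.
# Exit status follows grep: 0 if a match is found, 1 if there are none.
references_touched_symbols() {
  local src_dir="$1"
  # -r: recurse into the directory, -n: print line numbers,
  # -E: extended regex so the alternation with | works.
  grep -rnE 'mtmd|speculative|draft|process_chunk|build_lora_mm_id' "$src_dir"
}
```

Running `references_touched_symbols src/main/cpp` from the repository root and getting no output with exit status 1 would correspond to the "zero matches" result reported in the commit.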
….11.0

Mirrors the streambuffer Dependabot updates (#108, #109) on the
java-llama.cpp branch. Both target versions are the current latest
releases on Maven Central.

Verified:
- spotless:check passes with 3.7.0 (no reformatting of existing sources;
  palantir-java-format stays pinned at 2.92.0)
- central-publishing 0.11.0 resolves from Maven Central (used only by the
  release/deploy profile)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01JBzF5wtCjRu5t4FMzphryM
@bernardladenthin bernardladenthin merged commit 91d9799 into main Jun 17, 2026
8 of 36 checks passed
@bernardladenthin bernardladenthin deleted the claude/intelligent-cray-9tfnxv branch June 17, 2026 14:02
@sonarqubecloud

bernardladenthin pushed a commit that referenced this pull request Jun 18, 2026
The Win32 (x86) C++ test job intermittently failed at build-time
gtest_discover_tests. llama/ggml/mtmd are linked statically into one
large jllama_test binary; on 32-bit Windows its startup plus
--gtest_list_tests enumeration sits near the default 5s discovery
timeout on shared CI runners. The same b9682 binary discovered within
5s in the #239 merge run but was killed at the 5s timeout in this run
(process still alive, empty output — a timeout, not a crash); the b9682
upgrade and 5 newly added tests nudged a marginal case over the limit.
x64, Linux and macOS finish well under the default and are unaffected.

Raise DISCOVERY_TIMEOUT to 120s (a maximum, so fast platforms still
return immediately), which keeps full C++ test coverage on x86 rather
than skipping the binary there. Verified locally: 445/445 C++ tests
still pass.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_014L2dLbAtwdq7C6a2gFRsQQ
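
To check locally how close a test binary sits to a discovery limit, the listing step can be run by hand under a time limit. The sketch below is an approximation of what gtest_discover_tests does at build time, not its actual implementation; the function name is hypothetical, and it assumes the `timeout` command from GNU coreutils is available.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command under a time limit and report when the
# limit, rather than the command itself, ended the run.
# Assumes GNU coreutils `timeout`, which exits with 124 when the limit hits.
list_tests_with_limit() {
  local limit_seconds="$1"
  shift
  local rc=0
  timeout "$limit_seconds" "$@" || rc=$?
  if [ "$rc" -eq 124 ]; then
    echo "listing exceeded ${limit_seconds}s (timeout, not a crash)" >&2
  fi
  return "$rc"
}
```

For example, `list_tests_with_limit 5 ./jllama_test --gtest_list_tests` would mimic the default 5s limit, and replacing 5 with 120 would mimic the raised one. The binary path is an assumption and depends on the build directory layout.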