feat: llama-langchain4j — rename, CI build/test + Central publish, model-backed tests, upfront model cache by bernardladenthin · Pull Request #285 · bernardladenthin/java-llama.cpp

bernardladenthin · 2026-07-01T17:53:39Z

Summary

Rename langchain4j-jllama → llama-langchain4j so the adapter artifact groups with the core net.ladenthin:llama family (Java package net.ladenthin.llama.langchain4j unchanged). The core dependency is pinned to ${project.version} (drops the drift-prone jllama.version property); a CI version-lockstep guard fails the build if the module version ever diverges from the core version.
Wire the module into CI and Maven Central. New test-java-llama-langchain4j job (install core Java jar → lockstep guard → mvn verify, which also builds the javadoc jar so a release-time break is caught in PR CI); publish-snapshot/publish-release now needs: it and deploy net.ladenthin:llama-langchain4j at the same version as the core (own sources/javadoc/GPG mirroring the core release profile).
Close the adapter test gap + reuse existing models. Added model-backed integration tests for the embedding and scoring adapters (previously only chat had one); a new test-java-llama-langchain4j-integration job exercises chat/embedding/scoring against the models the pipeline already caches (Qwen3-0.6B, nomic-embed, jina-reranker) — no new model, no duplicate download.
DRY the model cache. Introduced a single upfront download-models job (the only place the ~5 GB set + validate-models.sh live); all 6 Java test jobs + the integration job now needs: it and restore-only the shared gguf-models-v1 cache. Removes the per-job download duplication and the cold-start save race (publish.yml −91 lines net).

Test plan

Affected unit / integration tests pass locally — module builds green: 7 model-free mapping tests pass, 4 model-backed integration tests self-skip without a GGUF; javadoc/sources jars build under doclint=all.
CI is green on this branch — the model-backed integration run, the download-models cold-cache run, and the Central deploy can only be exercised in the pipeline (need the native lib + warm cache + secrets, unavailable locally).
Docs updated — module README.md (per-adapter model properties + cache reuse) and CLAUDE.md (new module section, why it is a separate artifact not a classifier, the download-models policy, and the integration-job wiring).

Related issues / PRs

Follows up on Add langchain4j-jllama module: in-process LangChain4j adapters #284 (the merged langchain4j-jllama module), which this renames, wires into CI, and makes releasable.

Checklist

My commits follow Conventional Commits
No security-sensitive changes

🤖 Generated with Claude Code

Generated by Claude Code

…Central publish Cleans up the integration of the merged langchain4j adapters (PR #284) so the module is built, gated, version-locked and releasable — without touching the native build/release pipeline. - Rename artifact + directory langchain4j-jllama -> llama-langchain4j so it groups with the core net.ladenthin:llama family (Java package unchanged). - Pin the core dependency to ${project.version} (drops the drift-prone jllama.version property); a CI guard fails the build if the module version ever diverges from the core version (standalone module can't inherit it from a reactor). - Add per-artifact release plumbing (sources + javadoc + gpg + Central Publishing) mirroring the core release profile, so the module can deploy to Maven Central at the same version. - publish.yml: new test-java-llama-langchain4j job (install core Java jar, version-lockstep guard, mvn verify — builds the javadoc jar so a release-time javadoc break is caught in PR CI). publish-snapshot/publish-release now depend on it and deploy the module alongside the core. - REUSE.toml + README updated to the new name; CLAUDE.md documents the module, why it is a separate artifact (not a classifier), and the CI/publish wiring. Verified locally: core Java jar installs, module builds green (7 mapping tests pass, 2 model-backed integration tests self-skip), and the main/sources/javadoc jars all build under doclint=all. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Rt1paYztGJ2AKUuBuAGDXE

…GGUF cache Adds end-to-end coverage for the embedding and scoring adapters (previously only chat had an integration test) and wires a CI job that exercises all four adapters against the models the pipeline already caches — no new model, no duplicate download logic. - New JllamaEmbeddingModelIntegrationTest / JllamaScoringModelIntegrationTest, self-skipping via -Dnet.ladenthin.llama.langchain4j.embedding.model and .rerank.model (mirrors the existing chat integration test). Module now: 7 mapping tests + 4 model-backed integration tests (self-skip without a GGUF). - New test-java-llama-langchain4j-integration job reuses the existing shared cache (gguf-models-v1, restore-only) and the Linux-x86_64-libraries native artifact. It runs after test-java-linux-x86_64 (which populates the cache), installs the core jar with the bundled native lib, and points the adapters at the already-cached chat (Qwen3-0.6B), nomic-embedding and jina-reranker models. Validation-only (not a release gate); a cold cache degrades to a self-skip. - README + CLAUDE.md document the per-adapter model properties and the cache reuse. Verified locally: module builds green, 7 mapping tests pass, 4 integration tests self-skip without models. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Rt1paYztGJ2AKUuBuAGDXE

…he pipeline) Replaces the per-job model download logic (duplicated across all 6 Java test jobs) with a single upfront download-models job; every test job now only restores the shared cache. - New download-models job (needs: startgate) is the single source of the download logic: the ten curl steps + validate-models.sh live only there. It restores the shared cache (key gguf-models-v1) or, on a cold cache, downloads the full ~5 GB set, validates, and saves it. - All 6 Java test jobs (Linux, 3x macOS, 2x Windows) drop their duplicated download blocks and now needs: download-models + restore-only the cache, keeping validate-models.{sh,bat} as a per-job integrity guard. Removes the cold-start save race; net -91 lines in publish.yml. - The llama-langchain4j integration job now needs [crosscompile-linux-x86_64, download-models] instead of chaining behind test-java-linux-x86_64, so it runs in parallel off the guaranteed-populated cache. - CLAUDE.md updated: the model-cache section documents the upfront download-models job (replacing the old "save race" description) and the integration job wiring. YAML validated; structural counts verified (10 curl steps only in download-models; 7 jobs restore the cache; 7 validate steps). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Rt1paYztGJ2AKUuBuAGDXE

claude · 2026-07-01T17:55:39Z

Review of PR 285: llama-langchain4j module integration and model cache refactoring

This PR successfully accomplishes its stated goals with well-structured changes across three commits.

STRENGTHS:

Version lockstep guard in publish.yml:167-176 catches divergence early with explicit error messaging
Module POM is clean: proper version pinning, mirrors core release profile, no unexpected dependencies
Integration tests follow established pattern with proper Assumptions.assumeTrue, clear property names, good documentation
Workflow refactoring is comprehensive: all 6 Java test jobs correctly reference download-models and use restore-only cache
7 mapping tests + 4 model-backed integration tests (self-skipping) provides solid coverage
CI exercises all three adapters (chat/embedding/scoring) on actual models

ARCHITECTURE:

Cache key immutability (gguf-models-v1) is intentional to avoid partial hits
Integration test correctly reuses Linux library and cached models from core pipeline
DAG is clean: download-models runs once upfront, all test jobs depend on it
Removes ~91 lines of duplication and eliminates the old save-race problem

RELEASE READINESS:

Sources and javadoc jars built by mvn verify (caught in PR CI)
GPG signing and Central Publishing plugin wired in release profile
publish-snapshot/publish-release jobs depend on module test job

RECOMMENDATION: Approve. PR is well-executed, properly tested, and ready for CI.

claude · 2026-07-01T17:55:55Z

+      # one entry (key gguf-models-v1). validate-models is kept as an integrity guard so a
+      # partial/absent restore fails loudly instead of silently self-skipping required tests.
+      - name: Restore shared GGUF model cache (populated by download-models; no re-download)
        uses: actions/cache@v6


The cache action here will attempt to save on successful runs by default, which could interfere with the centralized download-models job's cache save (the key gguf-models-v1 is immutable, so a duplicate save from a test job fails silently). To ensure restore-only behavior and avoid the save-race mentioned in the PR description, add save-always: false to explicitly disable saving:

Suggested change

uses: actions/cache@v6

- name: Restore shared GGUF model cache (populated by download-models; no re-download)

uses: actions/cache@v6

with:

path: models/

key: gguf-models-v1

save-always: false

This applies to all 6 Java test jobs (lines 1064, 1181, 1245, 1309, 1375) and the langchain4j integration job (line 210)."

These steps are already restore-only in effect, so no change is needed here.

actions/cache saves at post-job only on a cache miss (plus job success) — on a cache hit it skips the save entirely. Because download-models populates the immutable key gguf-models-v1 before any test job runs, and every test job needs: download-models, these steps always get a cache hit → no post-job save. So there is no duplicate-save / save-race here.

save-always: false also wouldn't be the right mechanism: it is the default already, and in current actions/cache the save-always input is deprecated — it governs save-on-failure, not save-on-hit, so it doesn't express "restore-only." The idiomatic explicit form would be actions/cache/restore (no save path at all); I'm happy to switch the seven steps to that in a follow-up if explicit intent is preferred, but it isn't required for correctness.

Generated by Claude Code

claude · 2026-07-01T17:56:27Z

+`validate-models.sh` gate live only there). Every `test-java-*` job — and the langchain4j
+integration job — `needs: download-models` and then only **restores** that cache (no per-job
+download, no cold-start save race), keeping `validate-models.{sh,bat}` as a per-job integrity
+guard. GGUF is platform-independent, so the one ubuntu `download-models` cache is reused by the


The PR description states that "CLAUDE.md documents the module, why it is a separate artifact (not a classifier), and the CI/publish wiring," but the diff shows the entire ## LangChain4j integration (llama-langchain4j sibling module) section (previously ~56 lines) has been removed.

This section documented:

Why llama-langchain4j is a separate artifact (Java 17 requirement, avoid forcing it on core users)

How the module is wired into CI (test job, version-lockstep guard, release profile)

The new download-models job pattern

Consider restoring this documentation section, or updating the PR description to reflect that this documentation was intentionally removed in favor of the module's own README.md."

This section was added in this PR, not removed — the diff is being misread.

git diff origin/main -- CLAUDE.md is +62 / −6: the whole ## LangChain4j integration (llama-langchain4j sibling module) block is new (it exists on this branch, grep -c = 1, but not on main, grep -c = 0). The −6 lines are only the reworded "GGUF cache" paragraph (the old "save race" wording replaced by the download-models description). So there's no documentation regression and the PR description is accurate — no action needed.

Generated by Claude Code

sonarqubecloud · 2026-07-01T17:56:33Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

claude · 2026-07-01T17:56:33Z

Additional Findings

Found 2 issues during detailed review:

1. Cache save behavior in test jobs (CRITICAL) — The test jobs use 'actions/cache' without 'save-always: false', which means they will attempt to save on every run. This conflicts with the DRY model cache refactoring goal. The immutable cache key 'gguf-models-v1' means the save-race should have been eliminated by the download-models job being the only writer, but test jobs attempting to save could interfere.

Recommendation: Add 'save-always: false' to all 6 Java test jobs + the langchain4j integration job to explicitly disable saving and prevent any save-race regression.

2. Documentation regression in CLAUDE.md (INFORMATIONAL) — The diff shows that the entire '## LangChain4j integration (llama-langchain4j sibling module)' section (~56 lines) was removed from CLAUDE.md. This section documented the module architecture, why it's a separate artifact, the version-lockstep guard, and the CI wiring. The PR description claims this documentation was added/updated, but the diff shows it was actually deleted.

The module's README.md does cover these topics now, but CLAUDE.md is the canonical project documentation. Consider restoring the section or updating the PR description to clarify the intent.

claude · 2026-07-01T17:56:44Z

Review Complete

The PR is well-structured with good code quality and architecture. However, it has two issues to address before merge:

Must Fix:

Cache save behavior in test jobs — add 'save-always: false' to prevent save-race regression (details in inline comment on publish.yml:1064)

Should Fix:

CLAUDE.md documentation regression — the comprehensive LangChain4j module section was deleted; consider restoring or clarifying intent in commit message (details in inline comment on CLAUDE.md:980)

Once those are addressed, this PR is ready to merge.

claude added 3 commits July 1, 2026 11:14

bernardladenthin temporarily deployed to startgate July 1, 2026 17:53 — with GitHub Actions Inactive

claude Bot reviewed Jul 1, 2026

View reviewed changes

bernardladenthin merged commit 0b4b58e into main Jul 1, 2026
49 of 50 checks passed

bernardladenthin deleted the claude/pr-284-explanation-me9aym branch July 1, 2026 19:19

bernardladenthin mentioned this pull request Jul 1, 2026

refactor: move core into llama/ submodule under a Maven reactor (parent + llama + llama-langchain4j) #288

Draft

5 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: llama-langchain4j — rename, CI build/test + Central publish, model-backed tests, upfront model cache#285

feat: llama-langchain4j — rename, CI build/test + Central publish, model-backed tests, upfront model cache#285
bernardladenthin merged 3 commits into
mainfrom
claude/pr-284-explanation-me9aym

bernardladenthin commented Jul 1, 2026

Uh oh!

claude Bot commented Jul 1, 2026

Uh oh!

claude Bot Jul 1, 2026

Uh oh!

bernardladenthin Jul 1, 2026

Uh oh!

claude Bot Jul 1, 2026

Uh oh!

bernardladenthin Jul 1, 2026

Uh oh!

sonarqubecloud Bot commented Jul 1, 2026

Uh oh!

claude Bot commented Jul 1, 2026

Uh oh!

claude Bot commented Jul 1, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

bernardladenthin commented Jul 1, 2026

Summary

Test plan

Related issues / PRs

Checklist

Uh oh!

claude Bot commented Jul 1, 2026

Uh oh!

claude Bot Jul 1, 2026

Choose a reason for hiding this comment

Uh oh!

bernardladenthin Jul 1, 2026

Choose a reason for hiding this comment

Uh oh!

claude Bot Jul 1, 2026

Choose a reason for hiding this comment

Uh oh!

bernardladenthin Jul 1, 2026

Choose a reason for hiding this comment

Uh oh!

sonarqubecloud Bot commented Jul 1, 2026

Quality Gate passed

Uh oh!

claude Bot commented Jul 1, 2026

Additional Findings

Uh oh!

claude Bot commented Jul 1, 2026

Review Complete

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants