feat: llama-langchain4j - rename, CI build/test + Central publish, model-backed tests, upfront model cache #285
…Central publish
Cleans up the integration of the merged langchain4j adapters (PR #284) so the module is built, gated, version-locked and releasable, without touching the native build/release pipeline.
- Rename artifact + directory langchain4j-jllama -> llama-langchain4j so it groups with the core net.ladenthin:llama family (Java package unchanged).
- Pin the core dependency to ${project.version} (drops the drift-prone jllama.version property); a CI guard fails the build if the module version ever diverges from the core version (standalone module can't inherit it from a reactor).
- Add per-artifact release plumbing (sources + javadoc + gpg + Central Publishing) mirroring the core release profile, so the module can deploy to Maven Central at the same version.
- publish.yml: new test-java-llama-langchain4j job (install core Java jar, version-lockstep guard, mvn verify, which builds the javadoc jar so a release-time javadoc break is caught in PR CI). publish-snapshot/publish-release now depend on it and deploy the module alongside the core.
- REUSE.toml + README updated to the new name; CLAUDE.md documents the module, why it is a separate artifact (not a classifier), and the CI/publish wiring.
Verified locally: core Java jar installs, module builds green (7 mapping tests pass, 2 model-backed integration tests self-skip), and the main/sources/javadoc jars all build under doclint=all.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Rt1paYztGJ2AKUuBuAGDXE
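For illustration, the version-lockstep guard described in this commit could be sketched as a workflow step like the one below. This is not the step from `publish.yml`: the step name, the assumption that the core `pom.xml` sits at the repository root, and the `llama-langchain4j/pom.xml` path are all hypothetical. It relies only on the Maven Help Plugin's `help:evaluate` goal with `-DforceStdout`.

```yaml
# Sketch only: step name and pom locations are assumptions, not taken from publish.yml.
- name: Guard llama-langchain4j version lockstep with core
  shell: bash
  run: |
    set -euo pipefail
    # Resolve the effective project.version of the core build and of the standalone module.
    core_version="$(mvn -q -DforceStdout help:evaluate -Dexpression=project.version)"
    module_version="$(mvn -q -DforceStdout -f llama-langchain4j/pom.xml help:evaluate -Dexpression=project.version)"
    # Fail the job if the two versions have drifted apart.
    if [ "$core_version" != "$module_version" ]; then
      echo "::error::llama-langchain4j version ($module_version) differs from core version ($core_version)"
      exit 1
    fi
```

Because the module is standalone rather than part of a reactor, a check of this kind has to compare the two resolved versions explicitly instead of relying on inheritance.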
…GGUF cache
Adds end-to-end coverage for the embedding and scoring adapters (previously only chat had an integration test) and wires a CI job that exercises all four adapters against the models the pipeline already caches: no new model, no duplicate download logic.
- New JllamaEmbeddingModelIntegrationTest / JllamaScoringModelIntegrationTest, self-skipping via -Dnet.ladenthin.llama.langchain4j.embedding.model and .rerank.model (mirrors the existing chat integration test). Module now: 7 mapping tests + 4 model-backed integration tests (self-skip without a GGUF).
- New test-java-llama-langchain4j-integration job reuses the existing shared cache (gguf-models-v1, restore-only) and the Linux-x86_64-libraries native artifact. It runs after test-java-linux-x86_64 (which populates the cache), installs the core jar with the bundled native lib, and points the adapters at the already-cached chat (Qwen3-0.6B), nomic-embedding and jina-reranker models. Validation-only (not a release gate); a cold cache degrades to a self-skip.
- README + CLAUDE.md document the per-adapter model properties and the cache reuse.
Verified locally: module builds green, 7 mapping tests pass, 4 integration tests self-skip without models.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Rt1paYztGJ2AKUuBuAGDXE
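As a rough sketch of how the integration job might point the adapters at cached models, the step below passes the two system properties named in this commit to `mvn verify`. The model file names and the `llama-langchain4j/pom.xml` path are placeholders, not values from the pipeline, and the chat adapter's property is omitted because its name is not given here.

```yaml
# Sketch only: model file names and the pom path are hypothetical placeholders.
- name: Run model-backed adapter tests (each test self-skips if its property is unset)
  shell: bash
  run: >
    mvn -B -f llama-langchain4j/pom.xml verify
    -Dnet.ladenthin.llama.langchain4j.embedding.model="$GITHUB_WORKSPACE/models/embedding-example.gguf"
    -Dnet.ladenthin.llama.langchain4j.rerank.model="$GITHUB_WORKSPACE/models/reranker-example.gguf"
```

Leaving a property out is what turns the corresponding integration test into a self-skip, which is how a cold cache degrades gracefully in a validation-only job.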
…he pipeline)
Replaces the per-job model download logic (duplicated across all 6 Java test
jobs) with a single upfront download-models job; every test job now only
restores the shared cache.
- New download-models job (needs: startgate) is the single source of the
download logic: the ten curl steps + validate-models.sh live only there. It
restores the shared cache (key gguf-models-v1) or, on a cold cache, downloads
the full ~5 GB set, validates, and saves it.
- All 6 Java test jobs (Linux, 3x macOS, 2x Windows) drop their duplicated
download blocks and now needs: download-models + restore-only the cache,
keeping validate-models.{sh,bat} as a per-job integrity guard. Removes the
cold-start save race; net -91 lines in publish.yml.
- The llama-langchain4j integration job now needs [crosscompile-linux-x86_64,
download-models] instead of chaining behind test-java-linux-x86_64, so it runs
in parallel off the guaranteed-populated cache.
- CLAUDE.md updated: the model-cache section documents the upfront download-models
job (replacing the old "save race" description) and the integration job wiring.
YAML validated; structural counts verified (10 curl steps only in download-models;
7 jobs restore the cache; 7 validate steps).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Rt1paYztGJ2AKUuBuAGDXE
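The single-writer cache pattern this commit describes can be sketched roughly as follows. It is a simplified illustration, not the contents of `publish.yml`: the runner labels, the `actions/checkout` version, the placeholder URL and file name, the `./validate-models.sh` path and the `some-java-test-job` consumer are assumptions, and the `startgate` job is taken as defined elsewhere in the workflow.

```yaml
# Sketch only: everything except the job name download-models, needs: startgate,
# path models/ and key gguf-models-v1 is an assumption made for illustration.
jobs:
  download-models:
    needs: startgate
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Restore shared GGUF model cache, or save it after a cold run
        id: gguf-cache
        uses: actions/cache@v6
        with:
          path: models/
          key: gguf-models-v1
      - name: Download models (cold cache only)
        if: steps.gguf-cache.outputs.cache-hit != 'true'
        run: |
          mkdir -p models
          # Stands in for the ten curl steps; URL and file name are placeholders.
          curl -fL --retry 3 -o models/example.gguf "https://example.invalid/example.gguf"
      - name: Validate the model set
        run: ./validate-models.sh

  some-java-test-job:   # hypothetical consumer; the real jobs are the six Java test jobs
    needs: download-models
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Restore shared GGUF model cache (populated by download-models)
        uses: actions/cache@v6
        with:
          path: models/
          key: gguf-models-v1
      - name: Validate the restored model set (per-job integrity guard)
        run: ./validate-models.sh
```

With `needs: download-models` on every consumer, the cache key already exists by the time a test job starts, so consumers hit the cache and only the upfront job ever writes it.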
Review of PR 285: llama-langchain4j module integration and model cache refactoring
This PR successfully accomplishes its stated goals with well-structured changes across three commits.
STRENGTHS:
ARCHITECTURE:
RELEASE READINESS:
RECOMMENDATION: Approve. PR is well-executed, properly tested, and ready for CI.
# one entry (key gguf-models-v1). validate-models is kept as an integrity guard so a
# partial/absent restore fails loudly instead of silently self-skipping required tests.
- name: Restore shared GGUF model cache (populated by download-models; no re-download)
  uses: actions/cache@v6
The cache action here will attempt to save on successful runs by default, which could interfere with the centralized download-models job's cache save (the key gguf-models-v1 is immutable, so a duplicate save from a test job fails silently). To ensure restore-only behavior and avoid the save-race mentioned in the PR description, add save-always: false to explicitly disable saving:
uses: actions/cache@v6
- name: Restore shared GGUF model cache (populated by download-models; no re-download)
  uses: actions/cache@v6
  with:
    path: models/
    key: gguf-models-v1
    save-always: false
This applies to all 6 Java test jobs (lines 1064, 1181, 1245, 1309, 1375) and the langchain4j integration job (line 210).
These steps are already restore-only in effect, so no change is needed here.
actions/cache saves at post-job only on a cache miss (plus job success) — on a cache hit it skips the save entirely. Because download-models populates the immutable key gguf-models-v1 before any test job runs, and every test job needs: download-models, these steps always get a cache hit → no post-job save. So there is no duplicate-save / save-race here.
save-always: false also wouldn't be the right mechanism: it is the default already, and in current actions/cache the save-always input is deprecated — it governs save-on-failure, not save-on-hit, so it doesn't express "restore-only." The idiomatic explicit form would be actions/cache/restore (no save path at all); I'm happy to switch the seven steps to that in a follow-up if explicit intent is preferred, but it isn't required for correctness.
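For reference, the explicit restore-only form mentioned above could look roughly like this. It is a sketch, not a proposed diff: the `@v6` tag is assumed to match the `actions/cache@v6` pin used in the workflow, and `fail-on-cache-miss` is an optional input shown only as one way to make a missing cache fail early rather than surface later in `validate-models`.

```yaml
# Sketch only: the version tag is assumed to mirror the existing actions/cache pin.
- name: Restore shared GGUF model cache (restore-only; no post-job save step exists)
  uses: actions/cache/restore@v6
  with:
    path: models/
    key: gguf-models-v1
    # Optional: fail the step immediately if the key is absent.
    fail-on-cache-miss: true
```

The `restore` sub-action has no post-job save phase at all, so it states the intent directly, whereas the combined action reaches the same behavior here only because every consumer gets a cache hit. A job that is meant to self-skip on a cold cache would leave `fail-on-cache-miss` out.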
Generated by Claude Code
`validate-models.sh` gate live only there). Every `test-java-*` job, and the langchain4j
integration job, `needs: download-models` and then only **restores** that cache (no per-job
download, no cold-start save race), keeping `validate-models.{sh,bat}` as a per-job integrity
guard. GGUF is platform-independent, so the one ubuntu `download-models` cache is reused by the
The PR description states that "CLAUDE.md documents the module, why it is a separate artifact (not a classifier), and the CI/publish wiring," but the diff shows the entire ## LangChain4j integration (llama-langchain4j sibling module) section (previously ~56 lines) has been removed.
This section documented:
- Why llama-langchain4j is a separate artifact (Java 17 requirement, avoid forcing it on core users)
- How the module is wired into CI (test job, version-lockstep guard, release profile)
- The new download-models job pattern
Consider restoring this documentation section, or updating the PR description to reflect that this documentation was intentionally removed in favor of the module's own README.md.
This section was added in this PR, not removed — the diff is being misread.
git diff origin/main -- CLAUDE.md is +62 / −6: the whole ## LangChain4j integration (llama-langchain4j sibling module) block is new (it exists on this branch, grep -c = 1, but not on main, grep -c = 0). The −6 lines are only the reworded "GGUF cache" paragraph (the old "save race" wording replaced by the download-models description). So there's no documentation regression and the PR description is accurate — no action needed.
Generated by Claude Code
Additional Findings
Found 2 issues during detailed review:
1. Cache save behavior in test jobs (CRITICAL): The test jobs use 'actions/cache' without 'save-always: false', which means they will attempt to save on every run. This conflicts with the DRY model cache refactoring goal. The immutable cache key 'gguf-models-v1' means the save-race should have been eliminated by the download-models job being the only writer, but test jobs attempting to save could interfere. Recommendation: Add 'save-always: false' to all 6 Java test jobs + the langchain4j integration job to explicitly disable saving and prevent any save-race regression.
2. Documentation regression in CLAUDE.md (INFORMATIONAL): The diff shows that the entire '## LangChain4j integration (llama-langchain4j sibling module)' section (~56 lines) was removed from CLAUDE.md. This section documented the module architecture, why it's a separate artifact, the version-lockstep guard, and the CI wiring. The PR description claims this documentation was added/updated, but the diff shows it was actually deleted. The module's README.md does cover these topics now, but CLAUDE.md is the canonical project documentation. Consider restoring the section or updating the PR description to clarify the intent.
Review Complete
The PR is well-structured with good code quality and architecture. However, it has two issues to address before merge:
Must Fix:
Should Fix:
Once those are addressed, this PR is ready to merge.



Summary
- `langchain4j-jllama` → `llama-langchain4j` so the adapter artifact groups with the core `net.ladenthin:llama` family (Java package `net.ladenthin.llama.langchain4j` unchanged). The core dependency is pinned to `${project.version}` (drops the drift-prone `jllama.version` property); a CI version-lockstep guard fails the build if the module version ever diverges from the core version.
- `test-java-llama-langchain4j` job (install core Java jar → lockstep guard → `mvn verify`, which also builds the javadoc jar so a release-time break is caught in PR CI); `publish-snapshot`/`publish-release` now `needs:` it and deploy `net.ladenthin:llama-langchain4j` at the same version as the core (own sources/javadoc/GPG mirroring the core `release` profile).
- `test-java-llama-langchain4j-integration` job exercises chat/embedding/scoring against the models the pipeline already caches (Qwen3-0.6B, nomic-embed, jina-reranker): no new model, no duplicate download.
- `download-models` job (the only place the ~5 GB set + `validate-models.sh` live); all 6 Java test jobs + the integration job now `needs:` it and restore-only the shared `gguf-models-v1` cache. Removes the per-job download duplication and the cold-start save race (`publish.yml` −91 lines net).

Test plan
- … `doclint=all`.
- … `download-models` cold-cache run, and the Central deploy can only be exercised in the pipeline (need the native lib + warm cache + secrets, unavailable locally).
- … `README.md` (per-adapter model properties + cache reuse) and `CLAUDE.md` (new module section, why it is a separate artifact not a classifier, the `download-models` policy, and the integration-job wiring).

Related issues / PRs
- … `langchain4j-jllama` module), which this renames, wires into CI, and makes releasable.

Checklist
🤖 Generated with Claude Code