Commit 0b4b58e

Merge pull request bernardladenthin#285 from bernardladenthin/claude/pr-284-explanation-me9aym
feat: llama-langchain4j — rename, CI build/test + Central publish, model-backed tests, upfront model cache
2 parents 93f48c0 + c2ed8c8 commit 0b4b58e

16 files changed

Lines changed: 585 additions & 288 deletions

.github/workflows/publish.yml

Lines changed: 197 additions & 182 deletions
Large diffs are not rendered by default.

CLAUDE.md

Lines changed: 62 additions & 6 deletions
@@ -971,12 +971,18 @@ properties set, so `LlamaEmbeddingsTest`, `MultimodalIntegrationTest`, and `TtsI
 these as **required** (a missing model hard-fails the job before tests run, so a download
 regression can never silently downgrade to a skip). The only model still self-skipping is the
 audio-input model (`AudioInputIntegrationTest`) — it has no committed clip and no CI download.
-The shared GGUF cache (`actions/cache`, key `gguf-models-v1`, path `models/`) holds the full set;
-since every test job downloads the full set before the cache can save, whichever job wins the
-save race caches everything. Because the cache key is immutable, changing the model set means the
-**existing cache entry must be deleted** (not bumped to `v2`) so the next run rebuilds it complete
-— locally the model tests still self-skip when a GGUF is absent (`Assume.assumeTrue`), so a
-partial local checkout is fine.
+The shared GGUF cache (`actions/cache`, key `gguf-models-v1`, path `models/`) holds the full set
+and is populated **once, upfront** by a dedicated **`download-models`** job (`needs: startgate`):
+it is the single place the ~5 GB set is fetched from HuggingFace (the ten `curl` steps + the
+`validate-models.sh` gate live only there). Every `test-java-*` job — and the langchain4j
+integration job — `needs: download-models` and then only **restores** that cache (no per-job
+download, no cold-start save race), keeping `validate-models.{sh,bat}` as a per-job integrity
+guard. GGUF is platform-independent, so the one ubuntu `download-models` cache is reused by the
+macOS and Windows jobs too. `validate-models.{sh,bat}` treats the models as **required** (a
+missing model hard-fails the job before tests run). Because the cache key is immutable, changing
+the model set means the **existing cache entry must be deleted** (not bumped to `v2`) so
+`download-models` rebuilds it complete — locally the model tests still self-skip when a GGUF is
+absent (`Assume.assumeTrue`), so a partial local checkout is fine.

 Set the model path via system property or environment variable (see test files for exact property names).

@@ -1219,6 +1225,56 @@ keeping it clear of the JPMS module-mode javadoc trap that bit BAF. **Before rai
 javadoc source level to ≥ 9, read**
 [`../workspace/policies/jpms-module-descriptor.md`](../workspace/policies/jpms-module-descriptor.md).

+## LangChain4j integration (`llama-langchain4j` sibling module)
+
+`llama-langchain4j/` adapts a `LlamaModel` to LangChain4j's `ChatModel`,
+`StreamingChatModel`, `EmbeddingModel` and `ScoringModel` interfaces **in-process over
+JNI** (no HTTP hop). It is a **standalone sibling module**, deliberately *not* in the root
+reactor, so the native build/release pipeline is untouched.
+
+Why it is a **separate artifact** and not a classifier of the core: langchain4j 1.x
+requires **Java 17** (the core stays Java 8), and classifiers share the core's single POM —
+adding `langchain4j-core` there would force it (and the Java 17 floor) on every plain
+`net.ladenthin:llama` consumer. A separate `artifactId` with its own POM is the only way to
+keep that dependency (and Java floor) off the core. It is pure Java with **no per-classifier
+matrix**: it compiles against the core's Java API, which is identical across every native
+classifier; the backend (CPU/CUDA/OpenCL/Vulkan) is a runtime classpath choice for the
+consumer.
+
+Wiring:
+
+1. **`llama-langchain4j/pom.xml`** — `net.ladenthin:llama-langchain4j`, `release 17`,
+depends on `net.ladenthin:llama:${project.version}` (so the core dep always matches the
+module's own version) and `dev.langchain4j:langchain4j-core`. Carries its own
+sources/javadoc/gpg + `release` profile (Central requires per-artifact signing; the module
+has no parent to inherit them from — plugin versions are pinned in lockstep with the root
+`pom.xml`). Java package stays `net.ladenthin.llama.langchain4j` (package name need not track
+the artifactId).
+2. **`.github/workflows/publish.yml`** — the `test-java-llama-langchain4j` job installs the
+core Java jar, runs a **version-lockstep guard** (module version must equal core version,
+else the build fails — the standalone module can't inherit `${project.version}` from a
+reactor), then `mvn -f llama-langchain4j/pom.xml verify` (7 model-free mapping unit tests
+run; the 4 model-backed integration tests self-skip without a GGUF; `verify` also builds the
+javadoc jar so a release-time javadoc break is caught in PR CI). The
+`publish-snapshot`/`publish-release` jobs `needs:` this job and, after the core `deploy`
+(which installs the core jar locally), run a second `deploy` of the module at the same
+version. A separate **`test-java-llama-langchain4j-integration`** job runs the model-backed
+tests (chat/streaming/embedding/scoring adapters) by **reusing** the shared GGUF cache
+(`gguf-models-v1`, restore-only — no extra download) and the `Linux-x86_64-libraries` native
+artifact: it `needs: [crosscompile-linux-x86_64, download-models]` (so the cache is already
+populated and it runs in parallel), installs the core jar with the downloaded native lib
+bundled, and passes the already-cached chat
+(`REASONING_MODEL_NAME`), nomic-embedding and jina-reranker model paths via the module's
+`-Dnet.ladenthin.llama.langchain4j.{embedding,rerank}.model` / `net.ladenthin.llama.model.path`
+properties. It is validation-only (not a release gate); a cold cache degrades to a self-skip.
+3. **Version bumps** — when the root `pom.xml` `<version>` changes, bump
+`llama-langchain4j/pom.xml` `<version>` to match in the same commit, or the lockstep guard
+reds CI.
+
+**Open follow-ups** (documented in `llama-langchain4j/README.md`): tool calling
+(`ToolSpecification` ↔ jllama `ToolDefinition`), `response_format`/JSON mode, and multimodal
+user input (currently flattened to text).
+
 ## Open TODOs

 Open TODOs for this repo live in [`TODO.md`](TODO.md). Cross-repo status
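
The version-lockstep guard named in the Wiring list of the CLAUDE.md diff could look roughly like the sketch below. It is an illustration only: the real guard lives in `.github/workflows/publish.yml`, whose diff is not rendered on this page, and the `check_lockstep` helper name is hypothetical. The commented `mvn help:evaluate` calls are one common way to read a POM version and are an assumption, not necessarily what the workflow does.

```bash
#!/usr/bin/env bash

# Hypothetical helper (not taken from publish.yml): succeeds when the two
# version strings are identical, fails with a message on stderr otherwise.
check_lockstep() {
  local core_version="$1" module_version="$2"
  if [ "$core_version" != "$module_version" ]; then
    echo "version mismatch: core=$core_version module=$module_version" >&2
    return 1
  fi
  echo "versions in lockstep: $core_version"
}

# Assumed CI usage, reading both versions with maven-help-plugin:
#   core=$(mvn -q -DforceStdout help:evaluate -Dexpression=project.version)
#   module=$(mvn -q -DforceStdout -f llama-langchain4j/pom.xml \
#              help:evaluate -Dexpression=project.version)
#   check_lockstep "$core" "$module"
```

Because the helper returns a non-zero status on a mismatch, a CI step that calls it would fail before `mvn verify` runs, which is the behavior the Wiring list describes.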

REUSE.toml

Lines changed: 1 addition & 1 deletion
@@ -24,7 +24,7 @@ path = [
 ".github/ISSUE_TEMPLATE/bug_report.md",
 ".github/ISSUE_TEMPLATE/feature_request.md",
 ".claude/commands/find-cpp-duplication.md",
-"langchain4j-jllama/README.md",
+"llama-langchain4j/README.md",
 ]
 SPDX-FileCopyrightText = [
 "2023-2025 Konstantin Herud",

langchain4j-jllama/pom.xml

Lines changed: 0 additions & 94 deletions
This file was deleted.

llama-langchain4j/README.md (renamed from langchain4j-jllama/README.md)

Lines changed: 17 additions & 5 deletions
@@ -1,4 +1,4 @@
-# langchain4j-jllama
+# llama-langchain4j

 [LangChain4j](https://github.com/langchain4j/langchain4j) adapters backed by an **in-process**
 [java-llama.cpp](https://github.com/bernardladenthin/java-llama.cpp) model over JNI — no HTTP server,
@@ -61,7 +61,7 @@ ScoringModel reranker = new JllamaScoringModel(rerankLlama);
 ```xml
 <dependency>
 <groupId>net.ladenthin</groupId>
-<artifactId>langchain4j-jllama</artifactId>
+<artifactId>llama-langchain4j</artifactId>
 <version>5.0.4-SNAPSHOT</version>
 </dependency>
 ```
@@ -79,16 +79,28 @@ build here:
 mvn -DskipTests install

 # then build/test this module
-cd langchain4j-jllama
+cd llama-langchain4j
 mvn test
 ```

-The end-to-end test (`JllamaChatModelIntegrationTest`) self-skips unless you pass a model:
+The model-backed integration tests self-skip unless you point them at a GGUF. Each adapter has
+its own property so you can run them independently (a chat/instruct model, an embedding-mode model,
+and a reranking-mode model respectively):

 ```bash
-mvn test -Dnet.ladenthin.llama.model.path=/abs/path/to/model.gguf
+# chat + streaming (JllamaChatModelIntegrationTest)
+mvn test -Dnet.ladenthin.llama.model.path=/abs/path/to/chat.gguf
+# embeddings (JllamaEmbeddingModelIntegrationTest)
+mvn test -Dnet.ladenthin.llama.langchain4j.embedding.model=/abs/path/to/embedding.gguf
+# re-ranking / scoring (JllamaScoringModelIntegrationTest)
+mvn test -Dnet.ladenthin.llama.langchain4j.rerank.model=/abs/path/to/reranker.gguf
 ```

+In CI these reuse the project's existing shared GGUF cache (the chat, nomic-embedding and
+jina-reranker models the core test jobs already download) — the
+`test-java-llama-langchain4j-integration` job restores that cache and the
+`Linux-x86_64` native library artifact, so no extra model is downloaded.
+
 ## Not mapped yet

 - **Tool calling.** `ChatRequest.toolSpecifications()` are not forwarded, so the chat adapters return
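
The self-skip behavior described in the README diff can be mirrored on the command line by passing each property only when its GGUF is present. The following is a minimal sketch, assuming a local `models/` directory and hypothetical file names (`chat.gguf`, `embedding.gguf`, `reranker.gguf`). The three property names come from the README diff; `model_flag` and `MODELS_DIR` are made-up names for illustration.

```bash
#!/usr/bin/env bash

# Hypothetical helper: prints a -D flag for the property only when the GGUF
# file exists; prints nothing otherwise, leaving that test to self-skip.
model_flag() {
  local property="$1" path="$2"
  if [ -f "$path" ]; then
    printf '%s\n' "-D$property=$path"
  fi
}

# Assumed model directory and file names; adjust to the local checkout.
MODELS_DIR="${MODELS_DIR:-models}"

flags=""
for pair in \
  "net.ladenthin.llama.model.path=$MODELS_DIR/chat.gguf" \
  "net.ladenthin.llama.langchain4j.embedding.model=$MODELS_DIR/embedding.gguf" \
  "net.ladenthin.llama.langchain4j.rerank.model=$MODELS_DIR/reranker.gguf"
do
  # Split "property=path" at the first '=' and test the path.
  flag=$(model_flag "${pair%%=*}" "${pair#*=}")
  if [ -n "$flag" ]; then
    flags="$flags $flag"
  fi
done

# Print the command instead of running it, so the sketch has no side effects.
echo "mvn test$flags"
```

The sketch builds the flags as a plain string, so it assumes model paths without spaces.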
