Commit cba693c

Add regression tests for issues #80, #95, #98, #102 (#185)
* test: add JUnit regressions for kherud open issues #80, #95, #98, #102

  Adds four small JUnit tests proposed in the verification plan section of docs/history/49be664_open_issues.md to upgrade the corresponding upstream issues from LIKELY FIXED to FIXED:

  - MemoryManagementTest#testOpenCloseLoopDoesNotLeak (#102) - 20-iteration open/close loop; on Linux asserts VmRSS delta < 200 MB. Degenerates to a no-crash smoke test on non-Linux hosts where /proc/self/status is absent.
  - MemoryManagementTest#testOpenCloseWithoutGeneration (#80) - 20 open + immediate close without any generation, exercises the half-initialised worker race closed by the double server.terminate() in jllama.cpp.
  - LlamaModelTest#testIteratorTerminatesOnRepetitivePrompt (#95) - asserts the iterator terminates within nPredict+1 steps on a deliberately repetitive prompt.
  - LlamaEmbeddingsTest#testNomicEmbedLoads (#98) - gated on system property net.ladenthin.llama.nomic.path; reproduces the reporter's batch/ubatch config plus the fix (enableEmbedding()), and asserts a 768-dim vector for nomic-embed-text-v1.5.

  Wires up the optional nomic GGUF download in the linux-x86_64 Java test job in .github/workflows/publish.yml. Other test jobs cleanly self-skip via Assume because the system property is unset.

  Documents the local native-build workflow in CLAUDE.md - per-host output paths, mvn-cmake handoff, optional model handling, and the restricted-network caveat for environments that block huggingface.co.

  https://claude.ai/code/session_01LR7Gw1pyKS7wvxXfZjnxNW

* docs: record #80/#95/#98/#102 regression tests added in 713d426

  Updates docs/history/49be664_open_issues.md to reflect that the four JUnit regression tests called for in the verification plan have been added on this branch:

  - Deep-dive verdict guide now lists each test name and self-skip behaviour next to its issue bullet
  - Per-issue Status blocks for #80, #95, #98, #102 annotated as "LIKELY FIXED -> FIXED on CI green" with the covering test
  - Status overview table rows for the same four issues updated
  - "What the original issues actually contain" feasibility table marks all four as DONE with the commit reference
  - "Concrete test plan" gains a status callout noting the as-shipped implementation matches the sketches
  - "Recommended sequencing" step 1 marked DONE and enumerates what shipped; remaining steps (#86 docs, #103/#34 typed image API, Android emulator CI) carried forward as the next deliverables

  No code or behaviour change, documentation only.

  https://claude.ai/code/session_01LR7Gw1pyKS7wvxXfZjnxNW

---------

Co-authored-by: Claude <noreply@anthropic.com>
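
The #102 test is described as asserting a VmRSS delta on Linux and degrading to a smoke test where `/proc/self/status` is absent. The following is a minimal sketch of how such a probe could look, not the code shipped in `MemoryManagementTest`; the class name `RssProbe` and all of its methods are hypothetical, and only the 200 MB limit and the `/proc/self/status` path come from the commit message.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.OptionalLong;

/** Hypothetical helper for illustration; not part of the repository. */
final class RssProbe {
    private RssProbe() {}

    /** Parses the VmRSS value (in kB) from /proc/self/status-style lines. */
    static OptionalLong parseVmRssKb(List<String> statusLines) {
        for (String line : statusLines) {
            if (line.startsWith("VmRSS:")) {
                // Typical form: "VmRSS:     123456 kB"
                String[] parts = line.substring("VmRSS:".length()).trim().split("\\s+");
                try {
                    return OptionalLong.of(Long.parseLong(parts[0]));
                } catch (NumberFormatException e) {
                    return OptionalLong.empty();
                }
            }
        }
        return OptionalLong.empty();
    }

    /** Returns the current resident set size in kB, or empty on hosts without /proc. */
    static OptionalLong currentRssKb() {
        Path status = Paths.get("/proc/self/status");
        if (!Files.isReadable(status)) {
            return OptionalLong.empty();
        }
        try {
            return parseVmRssKb(Files.readAllLines(status));
        } catch (IOException e) {
            return OptionalLong.empty();
        }
    }

    /** True when RSS growth stays under the limit, or when RSS is unavailable. */
    static boolean withinBudget(OptionalLong beforeKb, OptionalLong afterKb, long limitMb) {
        if (!beforeKb.isPresent() || !afterKb.isPresent()) {
            return true; // smoke-test mode: nothing to compare
        }
        long deltaKb = afterKb.getAsLong() - beforeKb.getAsLong();
        return deltaKb < limitMb * 1024L;
    }
}
```

A test along these lines would call `RssProbe.currentRssKb()` before and after the open/close loop and assert `withinBudget(before, after, 200)`; on non-Linux hosts both readings are empty, so the assertion passes and only the absence of a crash is verified.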
1 parent 676ffaf commit cba693c

7 files changed

Lines changed: 387 additions & 37 deletions

.github/workflows/publish.yml

Lines changed: 5 additions & 1 deletion
@@ -20,6 +20,8 @@ env:
   DRAFT_MODEL_NAME: "AMD-Llama-135m-code.Q2_K.gguf"
   REASONING_MODEL_URL: "https://huggingface.co/unsloth/Qwen3-0.6B-GGUF/resolve/main/Qwen3-0.6B-Q4_K_M.gguf"
   REASONING_MODEL_NAME: "Qwen3-0.6B-Q4_K_M.gguf"
+  NOMIC_EMBED_MODEL_URL: "https://huggingface.co/nomic-ai/nomic-embed-text-v1.5-GGUF/resolve/main/nomic-embed-text-v1.5.f16.gguf"
+  NOMIC_EMBED_MODEL_NAME: "nomic-embed-text-v1.5.f16.gguf"
 permissions:
   contents: read
 jobs:
@@ -389,6 +391,8 @@ jobs:
         run: curl -L --fail ${DRAFT_MODEL_URL} --create-dirs -o models/${DRAFT_MODEL_NAME}
       - name: Download reasoning model
         run: curl -L --fail ${REASONING_MODEL_URL} --create-dirs -o models/${REASONING_MODEL_NAME}
+      - name: Download nomic embedding model (issue #98 regression)
+        run: curl -L --fail ${NOMIC_EMBED_MODEL_URL} --create-dirs -o models/${NOMIC_EMBED_MODEL_NAME}
       - name: List files in models directory
         run: ls -l models/
       - name: Validate model files
@@ -404,7 +408,7 @@ jobs:
           ulimit -c unlimited
           echo "${{ github.workspace }}/core.%e.%p" | sudo tee /proc/sys/kernel/core_pattern
       - name: Run tests
-        run: mvn --no-transfer-progress test
+        run: mvn --no-transfer-progress test -Dnet.ladenthin.llama.nomic.path=models/${NOMIC_EMBED_MODEL_NAME}
       - uses: actions/upload-artifact@v7
         if: success()
         with:

CLAUDE.md

Lines changed: 78 additions & 0 deletions
@@ -431,6 +431,84 @@ cmake -B build -DLLAMA_CURL=ON
 
 Built libraries are placed in `src/main/resources/net/ladenthin/llama/{OS}/{ARCH}/`.
 
+### Building the native library for local Java tests
+
+`mvn test` does **not** build the native library — Maven only compiles Java
+and runs surefire. The shared library must already exist on disk under the
+platform-specific resource path that `LlamaLoader` resolves at runtime.
+Without it the JVM throws `UnsatisfiedLinkError` and every Java test fails
+immediately (it does not auto-skip).
+
+The output path is derived by `CMakeLists.txt` from `OS_NAME` and `OS_ARCH`
+detected by the helper script `.github/dockcross/dockcross-resolve-host`
+(falls back to `uname` on hosts where the script is absent). The mapping
+mirrors `OSInfo.translateOSNameToFolderName` on the Java side, so the same
+folder name is produced on both ends.
+
+| Host | Library file | Resource path produced by `cmake --build` |
+|------|--------------|-------------------------------------------|
+| Linux x86_64 | `libjllama.so` | `src/main/resources/net/ladenthin/llama/Linux/x86_64/` |
+| Linux aarch64 | `libjllama.so` | `src/main/resources/net/ladenthin/llama/Linux/aarch64/` |
+| macOS Apple Silicon | `libjllama.dylib` | `src/main/resources/net/ladenthin/llama/Mac/aarch64/` |
+| macOS Intel | `libjllama.dylib` | `src/main/resources/net/ladenthin/llama/Mac/x86_64/` |
+| Windows x86_64 | `jllama.dll` (+ `llama.dll`, `ggml.dll`) | `src/main/resources/net/ladenthin/llama/Windows/x86_64/` |
+
+The Windows `RUNTIME_OUTPUT_DIRECTORY_*` properties (`CMakeLists.txt:266-269`)
+deposit `jllama.dll` alongside the upstream `llama.dll` / `ggml.dll`; all
+three must remain co-located so the loader can resolve transitive imports.
460+
End-to-end local workflow for running Java tests:
461+
462+
```bash
463+
# 1. Generate JNI headers (one-time per Java API change)
464+
mvn -q compile
465+
466+
# 2. Configure + build the native library for the current host
467+
cmake -B build
468+
cmake --build build --config Release -j$(nproc)
469+
# The shared lib lands directly in src/main/resources/.../{OS}/{ARCH}/ —
470+
# no separate install step is needed.
471+
472+
# 3. Ensure model files referenced by tests are present under models/.
473+
# The default test models (downloaded by CI in publish.yml) are:
474+
curl -L --fail "$MODEL_URL" --create-dirs -o models/codellama-7b.Q2_K.gguf
475+
curl -L --fail "$RERANKING_MODEL_URL" --create-dirs -o models/jina-reranker-v1-tiny-en-Q4_0.gguf
476+
curl -L --fail "$DRAFT_MODEL_URL" --create-dirs -o models/AMD-Llama-135m-code.Q2_K.gguf
477+
curl -L --fail "$REASONING_MODEL_URL" --create-dirs -o models/Qwen3-0.6B-Q4_K_M.gguf
478+
479+
# 4. Run tests. Tests that need a model file self-skip via Assume.assumeTrue()
480+
# when their GGUF is absent, so partial model availability is OK.
481+
mvn test
482+
# CPU-only host (no GPU): pin GPU layers to 0
483+
mvn test -Dnet.ladenthin.llama.test.ngl=0
484+
# Run a single test class or method
485+
mvn test -Dtest=MemoryManagementTest
486+
mvn test -Dtest=LlamaModelTest#testGenerateAnswer
487+
```
488+
489+
**Optional models** referenced by individual tests are gated on a system
490+
property so CI can skip them cleanly when the GGUF is not downloaded:
491+
492+
| Property | Default test that uses it | Model |
493+
|----------|---------------------------|-------|
494+
| `net.ladenthin.llama.nomic.path` | `LlamaEmbeddingsTest#testNomicEmbedLoads` | `nomic-embed-text-v1.5.f16.gguf` (issue #98 regression) |
495+
496+
Run those tests by setting the property:
497+
```bash
498+
mvn test -Dtest=LlamaEmbeddingsTest#testNomicEmbedLoads \
499+
-Dnet.ladenthin.llama.nomic.path=models/nomic-embed-text-v1.5.f16.gguf
500+
```
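
The property gating described above could be implemented roughly as follows. This is a sketch under stated assumptions, not the code in `LlamaEmbeddingsTest`: the helper class `OptionalModel` and its method are hypothetical, and only the property name `net.ladenthin.llama.nomic.path` and the use of JUnit's `Assume` come from the commit.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

/** Hypothetical helper for illustration; not part of the repository. */
final class OptionalModel {
    private OptionalModel() {}

    /**
     * Resolves a model path from a system property.
     * Returns empty when the property is unset, blank, or does not point
     * at an existing regular file, so callers can skip instead of failing.
     */
    static Optional<Path> fromProperty(String propertyName) {
        String value = System.getProperty(propertyName);
        if (value == null || value.trim().isEmpty()) {
            return Optional.empty();
        }
        Path path = Paths.get(value.trim());
        return Files.isRegularFile(path) ? Optional.of(path) : Optional.empty();
    }
}

// In a JUnit 4 test the result would typically feed an assumption, e.g.:
//   Optional<Path> model = OptionalModel.fromProperty("net.ladenthin.llama.nomic.path");
//   org.junit.Assume.assumeTrue("nomic GGUF not available", model.isPresent());
// A failed assumption marks the test as skipped rather than failed.
```

Checking for a regular file as well as for the property means a stale or mistyped path also results in a skip; whether the shipped test does the same is not stated in the commit.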
+
+**Restricted-network environments.** Some hosts (e.g. ephemeral remote
+execution sandboxes) block outbound traffic to `huggingface.co`. In that
+case downloading models for the Java tests is not possible from the host
+itself; the native library can still be built and the C++ test suite
+(`ctest --test-dir build`) still runs because it depends only on the
+upstream sources fetched at CMake configure time. Java tests should then
+be exercised either in CI (via `.github/workflows/publish.yml`) or on a
+developer machine with HF access; pre-staged models can also be uploaded
+into `models/` out-of-band.
+
 ### Code Formatting
 ```bash
 clang-format -i src/main/cpp/*.cpp src/main/cpp/*.hpp # Format C++ code
