Commit 14df14a

Add Linux Vulkan classifiers + Windows arm64 CPU to the build matrix
Extends the artifact matrix toward upstream llama.cpp's release set with three new native builds (all wired for CI; the user runs CI to validate on the GPU/arm runners this environment lacks):

1. vulkan-linux-x86-64 — Linux x86_64 Vulkan classifier JAR
2. vulkan-linux-aarch64 — Linux aarch64 Vulkan classifier JAR
3. Windows arm64 CPU — folded into the DEFAULT JAR (no classifier)

Linux Vulkan (vendor-neutral GPU jar, no CUDA toolkit) — the intersection of the existing Vulkan-Windows and CUDA-Linux wiring:

- CMakeLists: the elseif(GGML_VULKAN) branch is now OS-aware like GGML_CUDA (Windows -> resources_windows_vulkan, else resources_linux_vulkan/.../Linux/${OS_ARCH}); one tree holds both arches.
- pom.xml: profiles vulkan-linux / vulkan-linux-aarch64, both reading the shared resources_linux_vulkan tree with an arch-scoped resource-copy <includes> (Linux/x86_64 vs Linux/aarch64), so each classifier JAR carries only its arch. Verified locally with staged dummy natives: each jar contains exactly one libjllama.so for its arch.
- publish.yml: build-linux-x86_64-vulkan (native ubuntu-latest) + build-linux-aarch64-vulkan (ubuntu-24.04-arm, GCC 14); both apt-install the Vulkan SDK, build -DGGML_VULKAN=ON -DGGML_NATIVE=OFF, build-only (GPU-less runners). Artifacts merge into one resources_linux_vulkan tree in package/publish; profiles added to the three -P lists.
- .gitignore: ignore resources_linux_vulkan (also fixed the pre-existing resources_cuda_linux -> resources_linux_cuda typo).

Windows arm64 CPU (default JAR):

- build-windows-arm64 on the free windows-11-arm runner (msvc-dev-cmd arch:arm64, Ninja Multi-Config, -DOS_ARCH=aarch64, build + ctest), emitting to the canonical resources/.../Windows/aarch64 and uploading Windows-aarch64-libraries, which the *-libraries glob merges into the default tree. No Java change: OSInfo already maps a Windows-on-ARM JVM (os.arch=aarch64) to Windows/aarch64. Matches the existing Windows CPU jobs (committed jllama.h + bundled JNI headers, so no mvn compile / setup-java needed).

All three added to package.needs. Runtime GPU libs are never bundled (driver supplies libvulkan.so.1) — same policy as every GPU classifier.

Local verification (CI does the real GPU/arm builds): CMake configures clean and the CPU branch still routes correctly; pom.xml is well-formed and both new profiles are recognized and activate; the per-arch classifier split was proven by packaging staged dummy natives; the workflow YAML parses (40 jobs) with all needs resolving and all -P lists updated.

README classifier table + snippets and CLAUDE.md document the additions.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01HL7d4uQ3cKR5HwYFPvZvv7
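
The "staged dummy natives" check mentioned in the message could look roughly like the sketch below. It is only an illustration, not the script the author ran: the in-memory tree, the helper names `package_classifier` and `native_entries`, and the prefix filter standing in for the Maven `<includes>` are all assumptions made for this example. Only the `Linux/<arch>/libjllama.so` layout comes from the commit.

```python
import io
import zipfile

# Layout taken from the commit: net/ladenthin/llama/Linux/<arch>/libjllama.so
ROOT = "net/ladenthin/llama/Linux/"
NATIVE = "libjllama.so"

# Hypothetical stand-in for the shared resources_linux_vulkan tree,
# staged with one dummy native per architecture.
SHARED_TREE = {
    ROOT + "x86_64/" + NATIVE: b"dummy-x86_64",
    ROOT + "aarch64/" + NATIVE: b"dummy-aarch64",
}


def package_classifier(tree, arch):
    """Zip only the entries under Linux/<arch>/ (imitates an arch-scoped include)."""
    prefix = ROOT + arch + "/"
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as jar:
        for path, data in sorted(tree.items()):
            if path.startswith(prefix):
                jar.writestr(path, data)
    return buf.getvalue()


def native_entries(jar_bytes):
    """List every libjllama.so entry found in the packaged archive."""
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        return [name for name in jar.namelist() if name.endswith("/" + NATIVE)]


if __name__ == "__main__":
    for arch in ("x86_64", "aarch64"):
        print(arch, native_entries(package_classifier(SHARED_TREE, arch)))
```

Against real build output, the same `native_entries` helper could be pointed at the bytes of each classifier JAR to confirm that exactly one native is present.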
1 parent e8abfc1 commit 14df14a

6 files changed

Lines changed: 392 additions & 10 deletions

.github/workflows/publish.yml

Lines changed: 152 additions & 3 deletions
@@ -437,6 +437,90 @@ jobs:
           name: Linux-aarch64-libraries
           path: ${{ github.workspace }}/llama/src/main/resources/net/ladenthin/llama/
 
+  build-linux-x86_64-vulkan:
+    name: Build Linux x86_64 Vulkan
+    needs: [startgate, build-webui]
+    # Native ubuntu build (NOT dockcross) — the Vulkan SDK is trivial to apt-install here, and
+    # upstream llama.cpp builds its ubuntu-vulkan artifact the same way. GPU runtime libvulkan.so.1
+    # is supplied by the consumer's driver (nothing bundled). GitHub runners have NO GPU, so this
+    # is a BUILD-ONLY job (no -DBUILD_TESTING/ctest: a Vulkan-linked jllama_test errors enumerating
+    # devices on a GPU-less runner — same rationale as the Windows GPU jobs). GGML_NATIVE=OFF keeps
+    # the artifact portable across x86_64 CPU generations. Trade-off vs the manylinux CPU jar: the
+    # glibc floor rises to the ubuntu-latest baseline (same as the native aarch64 job). build.sh
+    # self-fetches sccache; the probe guards it (a miss just builds uncached).
+    runs-on: ubuntu-latest
+    env:
+      USE_CACHE: ${{ github.event_name != 'workflow_dispatch' || inputs.use_cache }}
+      SCCACHE_WEBDAV_ENDPOINT: https://cache.depot.dev
+      SCCACHE_WEBDAV_TOKEN: ${{ secrets.DEPOT_TOKEN }}
+    steps:
+      - uses: actions/checkout@v7
+      - name: Download shared WebUI assets
+        uses: actions/download-artifact@v8
+        with:
+          name: webui-generated
+          path: ${{ github.workspace }}/llama/webui-generated/
+      - uses: actions/setup-java@v5
+        with:
+          distribution: 'temurin'
+          java-version: ${{ env.JAVA_VERSION }}
+      - name: Install Vulkan SDK (headers + loader + glslc shader compiler)
+        run: |
+          sudo apt-get update
+          sudo apt-get install -y libvulkan-dev glslc glslang-tools
+      - name: Build libraries
+        shell: bash
+        run: |
+          mvn --no-transfer-progress -f llama/pom.xml compile
+          .github/build.sh "-DGGML_VULKAN=ON -DGGML_NATIVE=OFF -DOS_NAME=Linux -DOS_ARCH=x86_64"
+      - name: Upload artifacts
+        uses: actions/upload-artifact@v7
+        with:
+          name: Linux-x86_64-vulkan
+          path: ${{ github.workspace }}/llama/src/main/resources_linux_vulkan/net/ladenthin/llama/
+          if-no-files-found: error
+
+  build-linux-aarch64-vulkan:
+    name: Build Linux aarch64 Vulkan
+    needs: [startgate, build-webui]
+    # Native ARM64 Vulkan build on GitHub's free arm64 runner (same runner as the aarch64 CPU job).
+    # Build-only (GPU-less runner); GGML_NATIVE=OFF for portability across ARMv8 generations; GCC 14
+    # to match the aarch64 CPU job. Reuses the resources_linux_vulkan tree (arch subdir Linux/aarch64);
+    # the vulkan-linux-aarch64 Maven profile packages only that subtree.
+    runs-on: ubuntu-24.04-arm
+    env:
+      USE_CACHE: ${{ github.event_name != 'workflow_dispatch' || inputs.use_cache }}
+      SCCACHE_WEBDAV_ENDPOINT: https://cache.depot.dev
+      SCCACHE_WEBDAV_TOKEN: ${{ secrets.DEPOT_TOKEN }}
+    steps:
+      - uses: actions/checkout@v7
+      - name: Download shared WebUI assets
+        uses: actions/download-artifact@v8
+        with:
+          name: webui-generated
+          path: ${{ github.workspace }}/llama/webui-generated/
+      - uses: actions/setup-java@v5
+        with:
+          distribution: 'temurin'
+          java-version: ${{ env.JAVA_VERSION }}
+      - name: Install toolchain (GCC 14) + Vulkan SDK
+        run: |
+          sudo apt-get update
+          sudo apt-get install -y gcc-14 g++-14 libvulkan-dev glslc glslang-tools
+          echo "CC=gcc-14" >> "$GITHUB_ENV"
+          echo "CXX=g++-14" >> "$GITHUB_ENV"
+      - name: Build libraries
+        shell: bash
+        run: |
+          mvn --no-transfer-progress -f llama/pom.xml compile
+          .github/build.sh "-DGGML_VULKAN=ON -DGGML_NATIVE=OFF -DOS_NAME=Linux -DOS_ARCH=aarch64"
+      - name: Upload artifacts
+        uses: actions/upload-artifact@v7
+        with:
+          name: Linux-aarch64-vulkan
+          path: ${{ github.workspace }}/llama/src/main/resources_linux_vulkan/net/ladenthin/llama/
+          if-no-files-found: error
+
   crosscompile-android-aarch64:
     name: Cross-Compile Android aarch64
     needs: [startgate, build-webui]
@@ -788,6 +872,42 @@ jobs:
           name: Windows-x86-libraries
           path: ${{ github.workspace }}/llama/src/main/resources/net/ladenthin/llama/
 
+  build-windows-arm64:
+    name: Build and Test Windows 11 arm64 (Ninja Multi-Config, default)
+    needs: [startgate, build-webui]
+    # Native arm64 build on GitHub's free windows-11-arm runner. Goes into the DEFAULT JAR (no
+    # classifier): OSInfo maps a Windows-on-ARM JVM (os.arch=aarch64) to Windows/aarch64, the same
+    # path CMake emits here, and the `*-libraries` glob in the package/publish jobs merges it into
+    # src/main/resources. sccache is intentionally omitted (the existing install step pulls the
+    # x86_64 sccache zip; an arm64 build would need the aarch64 release — not worth the extra path
+    # for one CPU job, so build.bat just builds uncached when sccache is absent).
+    runs-on: windows-11-arm
+    steps:
+      - uses: actions/checkout@v7
+      - name: Download shared WebUI assets
+        uses: actions/download-artifact@v8
+        with:
+          name: webui-generated
+          path: ${{ github.workspace }}/llama/webui-generated/
+      - name: Set up MSVC developer environment (arm64)
+        uses: ilammy/msvc-dev-cmd@v1
+        with:
+          arch: arm64
+      - name: Build libraries
+        shell: cmd
+        # No mvn compile needed: the JNI header (jllama.h) is committed and the native build
+        # uses the bundled JNI headers in .github/include, and OS_NAME/OS_ARCH are passed
+        # explicitly (so the OSInfo-class OS-detection path is skipped) — same as the x86_64 job.
+        run: |
+          .github\build.bat -G "Ninja Multi-Config" -DOS_NAME=Windows -DOS_ARCH=aarch64 -DBUILD_TESTING=ON
+      - name: Run C++ unit tests
+        run: ctest --test-dir llama/build --output-on-failure
+      - name: Upload artifacts
+        uses: actions/upload-artifact@v7
+        with:
+          name: Windows-aarch64-libraries
+          path: ${{ github.workspace }}/llama/src/main/resources/net/ladenthin/llama/
+
   # ---------------------------------------------------------------------------
   # Windows GPU classifiers (x86_64 only) — CUDA, Vulkan, OpenCL.
   # All three use the same Ninja Multi-Config + MSVC + sccache toolchain as the
@@ -1521,10 +1641,13 @@ jobs:
     needs:
       - crosscompile-linux-x86_64-cuda
       - crosscompile-linux-aarch64
+      - build-linux-x86_64-vulkan
+      - build-linux-aarch64-vulkan
       - crosscompile-android-aarch64
       - crosscompile-android-aarch64-opencl
       - build-windows-x86_64
       - build-windows-x86
+      - build-windows-arm64
       - build-windows-x86_64-msvc
       - build-windows-x86-msvc
       - build-windows-x86_64-cuda
@@ -1550,6 +1673,16 @@ jobs:
         with:
           name: linux-libraries-cuda
           path: ${{ github.workspace }}/llama/src/main/resources_linux_cuda/net/ladenthin/llama/
+      # Linux Vulkan classifiers (x86_64 + aarch64) share one tree; the two Maven profiles
+      # split it by arch subdir into one single-arch classifier JAR each.
+      - uses: actions/download-artifact@v8
+        with:
+          name: Linux-x86_64-vulkan
+          path: ${{ github.workspace }}/llama/src/main/resources_linux_vulkan/net/ladenthin/llama/
+      - uses: actions/download-artifact@v8
+        with:
+          name: Linux-aarch64-vulkan
+          path: ${{ github.workspace }}/llama/src/main/resources_linux_vulkan/net/ladenthin/llama/
       - uses: actions/download-artifact@v8
         with:
           name: android-libraries-opencl
@@ -1590,7 +1723,7 @@ jobs:
         # Windows classifier JARs: `windows-msvc` (MSVC-built CPU natives) plus the GPU
         # backends `cuda-windows` / `vulkan-windows` / `opencl-windows`. The default JAR's
         # Windows natives are the Ninja `*-libraries` merged into src/main/resources/ above.
-        run: mvn --batch-mode --no-transfer-progress -P release,cuda,opencl-android,windows-msvc,cuda-windows,vulkan-windows,opencl-windows,assembly -Dmaven.test.skip=true -Dgpg.skip=true package
+        run: mvn --batch-mode --no-transfer-progress -P release,cuda,vulkan-linux,vulkan-linux-aarch64,opencl-android,windows-msvc,cuda-windows,vulkan-windows,opencl-windows,assembly -Dmaven.test.skip=true -Dgpg.skip=true package
       - name: Upload JARs
         uses: actions/upload-artifact@v7
         with:
@@ -1664,6 +1797,14 @@ jobs:
         with:
           name: linux-libraries-cuda
           path: ${{ github.workspace }}/llama/src/main/resources_linux_cuda/net/ladenthin/llama/
+      - uses: actions/download-artifact@v8
+        with:
+          name: Linux-x86_64-vulkan
+          path: ${{ github.workspace }}/llama/src/main/resources_linux_vulkan/net/ladenthin/llama/
+      - uses: actions/download-artifact@v8
+        with:
+          name: Linux-aarch64-vulkan
+          path: ${{ github.workspace }}/llama/src/main/resources_linux_vulkan/net/ladenthin/llama/
       - uses: actions/download-artifact@v8
         with:
           name: android-libraries-opencl
@@ -1712,7 +1853,7 @@ jobs:
       # :llama-langchain4j. The `release` profile (GPG + Central Publishing) is inherited
       # from the parent, so every module — including the parent pom — is signed.
       - name: Publish snapshot (reactor - parent + llama + llama-langchain4j)
-        run: mvn --batch-mode --no-transfer-progress -P release,cuda,opencl-android,windows-msvc,cuda-windows,vulkan-windows,opencl-windows -Dmaven.test.skip=true deploy
+        run: mvn --batch-mode --no-transfer-progress -P release,cuda,vulkan-linux,vulkan-linux-aarch64,opencl-android,windows-msvc,cuda-windows,vulkan-windows,opencl-windows -Dmaven.test.skip=true deploy
         env:
           MAVEN_USERNAME: ${{ secrets.CENTRAL_USERNAME }}
           MAVEN_PASSWORD: ${{ secrets.CENTRAL_TOKEN }}
@@ -1774,6 +1915,14 @@ jobs:
         with:
           name: linux-libraries-cuda
           path: ${{ github.workspace }}/llama/src/main/resources_linux_cuda/net/ladenthin/llama/
+      - uses: actions/download-artifact@v8
+        with:
+          name: Linux-x86_64-vulkan
+          path: ${{ github.workspace }}/llama/src/main/resources_linux_vulkan/net/ladenthin/llama/
+      - uses: actions/download-artifact@v8
+        with:
+          name: Linux-aarch64-vulkan
+          path: ${{ github.workspace }}/llama/src/main/resources_linux_vulkan/net/ladenthin/llama/
       - uses: actions/download-artifact@v8
         with:
           name: android-libraries-opencl
@@ -1813,7 +1962,7 @@ jobs:
       # :llama-langchain4j. The `release` profile (GPG + Central Publishing) is inherited
       # from the parent, so every module — including the parent pom — is signed.
       - name: Publish release (reactor - parent + llama + llama-langchain4j)
-        run: mvn --batch-mode --no-transfer-progress -P release,cuda,opencl-android,windows-msvc,cuda-windows,vulkan-windows,opencl-windows -Dmaven.test.skip=true deploy
+        run: mvn --batch-mode --no-transfer-progress -P release,cuda,vulkan-linux,vulkan-linux-aarch64,opencl-android,windows-msvc,cuda-windows,vulkan-windows,opencl-windows -Dmaven.test.skip=true deploy
         env:
           MAVEN_USERNAME: ${{ secrets.CENTRAL_USERNAME }}
           MAVEN_PASSWORD: ${{ secrets.CENTRAL_TOKEN }}
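
The commit touches three separate `-P` lists (package, snapshot publish, release publish), so a missed one would silently skip a classifier. A small check along the following lines could guard against that. This is a hedged sketch: the helper names `profiles` and `missing_profiles` are invented here, and the regex assumes the `-P` value is a single comma-separated token, as in the commands shown in this diff.

```python
import re

# The two Maven profiles this commit adds to every -P list.
REQUIRED = {"vulkan-linux", "vulkan-linux-aarch64"}


def profiles(mvn_command):
    """Return the profile ids passed via -P in a single mvn command line."""
    match = re.search(r"(?:^|\s)-P\s*(\S+)", mvn_command)
    return match.group(1).split(",") if match else []


def missing_profiles(mvn_command):
    """Return the required profiles that the command does not activate."""
    return REQUIRED - set(profiles(mvn_command))


if __name__ == "__main__":
    cmd = ("mvn --batch-mode --no-transfer-progress "
           "-P release,cuda,vulkan-linux,vulkan-linux-aarch64,opencl-android "
           "-Dmaven.test.skip=true deploy")
    print(sorted(missing_profiles(cmd)))
```

Running `missing_profiles` over every `run:` line that starts with `mvn` would flag any list that was not updated.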

.gitignore

Lines changed: 2 additions & 1 deletion
@@ -39,9 +39,10 @@ replay_pid*
 
 models/*.gguf
 llama/src/main/cpp/net_ladenthin_llama_*.h
-llama/src/main/resources_cuda_linux/
+llama/src/main/resources_linux_cuda/
 # Per-classifier native trees, staged by CI before the matching Maven profile runs,
 # never committed (same policy as the default-tree native libs below).
+llama/src/main/resources_linux_vulkan/
 llama/src/main/resources_windows_msvc/
 llama/src/main/resources_windows_cuda/
 llama/src/main/resources_windows_vulkan/
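
The `resources_cuda_linux` typo fixed here is the kind of mismatch a cross-check between the workflow and `.gitignore` could catch. The sketch below is one possible way to do that, not something in the repository: the function names are made up for illustration, and the regexes assume per-classifier trees always appear as `src/main/resources_<suffix>/` in the workflow and as `llama/src/main/resources_<suffix>/` in `.gitignore`.

```python
import re


def referenced_trees(workflow_text):
    """Collect every per-classifier tree name that appears in a workflow path."""
    return set(re.findall(r"src/main/(resources_\w+)/", workflow_text))


def ignored_trees(gitignore_text):
    """Collect every per-classifier tree name listed in .gitignore."""
    trees = set()
    for line in gitignore_text.splitlines():
        match = re.fullmatch(r"llama/src/main/(resources_\w+)/", line.strip())
        if match:
            trees.add(match.group(1))
    return trees


def unignored(workflow_text, gitignore_text):
    """Trees the workflow stages that .gitignore would not exclude."""
    return referenced_trees(workflow_text) - ignored_trees(gitignore_text)


if __name__ == "__main__":
    workflow = "path: x/llama/src/main/resources_linux_cuda/net/ladenthin/llama/"
    print(unignored(workflow, "llama/src/main/resources_cuda_linux/"))
```

With the pre-fix `.gitignore` entry, the example reports `resources_linux_cuda` as unignored, which is the mismatch the commit corrects.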

CLAUDE.md

Lines changed: 45 additions & 1 deletion
@@ -198,7 +198,8 @@ Wiring (mirrors the CUDA-Linux / OpenCL-Android classifier pattern):
 
 1. **`llama/CMakeLists.txt`** — the `if(GGML_CUDA) … elseif(GGML_VULKAN) … elseif(GGML_OPENCL) … else()`
 chain is **OS-aware**: CUDA → `resources_windows_cuda` on Windows (else `resources_linux_cuda`),
-Vulkan → `resources_windows_vulkan`, OpenCL → `resources_windows_opencl` on Windows (else
+Vulkan → `resources_windows_vulkan` on Windows (else `resources_linux_vulkan` — see "Linux Vulkan
+classifiers" above), OpenCL → `resources_windows_opencl` on Windows (else
 `resources_android_opencl`). The default CPU build (both generators) still emits to the canonical
 `src/main/resources/.../Windows/{x86_64,x86}/`, so the Ninja-vs-MSVC split is purely a
 CI-artifact-name + pom-profile concern (no CMake change for it).
@@ -253,6 +254,49 @@ ctest --test-dir build --output-on-failure
 .github\build_opencl_windows.bat -G "Ninja Multi-Config" -DGGML_OPENCL=ON -DGGML_OPENCL_EMBED_KERNELS=ON -DOS_NAME=Windows -DOS_ARCH=x86_64
 ```
 
+## Linux Vulkan classifiers + Windows arm64 CPU
+
+Three additional artifacts extend the matrix toward upstream llama.cpp's release set. They follow
+the same classifier/resource-tree pattern as CUDA-Linux and Vulkan-Windows.
+
+**Linux Vulkan (`vulkan-linux-x86-64` + `vulkan-linux-aarch64`).** A vendor-neutral GPU jar for
+Linux (NVIDIA / AMD / Intel) with no CUDA toolkit — the intersection of the existing Vulkan-Windows
+and CUDA-Linux wiring. Four places:
+
+1. **`llama/CMakeLists.txt`** — the `elseif(GGML_VULKAN)` branch is now **OS-aware** (mirrors
+`GGML_CUDA`): Windows → `resources_windows_vulkan`, else → `resources_linux_vulkan`
+(`.../Linux/${OS_ARCH}/`). One tree holds both arches under `Linux/{x86_64,aarch64}`.
+2. **`.github/workflows/publish.yml`**`build-linux-x86_64-vulkan` (native `ubuntu-latest`, **not**
+dockcross — the Vulkan SDK is a trivial apt install and upstream builds ubuntu-vulkan the same way)
+and `build-linux-aarch64-vulkan` (`ubuntu-24.04-arm` + GCC 14). Both `apt-get install libvulkan-dev
+glslc glslang-tools`, build `-DGGML_VULKAN=ON -DGGML_NATIVE=OFF`, and are **build-only** (no
+`ctest`: a Vulkan-linked `jllama_test` errors enumerating devices on a GPU-less runner — same as the
+Windows GPU jobs). Artifacts `Linux-{x86_64,aarch64}-vulkan` → both downloaded into the **one**
+`resources_linux_vulkan/` tree by `package`/`publish-*`. Glibc floor rises to the ubuntu baseline
+(like the aarch64 CPU jar); acceptable for a GPU artifact.
+3. **`llama/pom.xml`** — profiles `vulkan-linux` (classifier `vulkan-linux-x86-64`) and
+`vulkan-linux-aarch64` (classifier `vulkan-linux-aarch64`). Both read the shared
+`resources_linux_vulkan` tree but the resource-copy `<includes>` is **arch-scoped**
+(`net/ladenthin/llama/Linux/{x86_64,aarch64}/**`), so each classifier JAR carries only its own
+arch (verified: each jar contains exactly one `libjllama.so`). Separate output dirs
+`_linux_vulkan` / `_linux_vulkan_aarch64` avoid collision. Activated in CI via
+`-P …,vulkan-linux,vulkan-linux-aarch64,…`.
+4. **`README.md`** — classifier table + dependency snippets.
+
+`src/main/resources_linux_vulkan/` is git-ignored (staged by CI, never committed). GPU runtime
+`libvulkan.so.1` is supplied by the consumer's driver — nothing is bundled (same policy as every GPU
+classifier).
+
+**Windows arm64 CPU (default JAR, no classifier).** `build-windows-arm64` runs natively on GitHub's
+free `windows-11-arm` runner (`ilammy/msvc-dev-cmd` `arch: arm64`, Ninja Multi-Config, `-DOS_ARCH=aarch64`,
+build + `ctest`). It emits to the **canonical** `resources/.../Windows/aarch64/` and uploads
+`Windows-aarch64-libraries`, which the `package`/`publish-*` `*-libraries` glob merges into the default
+tree — so it ships in the **default** JAR alongside Windows x86-64 / x86 (like those, it is not a
+classifier). No Java change was needed: `OSInfo` already maps a Windows-on-ARM JVM (`os.arch=aarch64`)
+to `Windows/aarch64` (it isn't in `archMapping`, so it falls through `translateArchNameToFolderName`).
+sccache is intentionally omitted (the shared install step pulls the x86_64 sccache zip; not worth an
+arm64 path for one CPU job — `build.bat` just builds uncached).
+
 ## WebUI (llama.cpp Svelte UI) embedding
 
 The llama.cpp WebUI is **built once in CI and shared to every native build**, then
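
The OS-aware output routing that the CLAUDE.md changes describe for `llama/CMakeLists.txt` can be summarized as a small decision table. The Python model below is a sketch of that description only; it is not the CMake code, and the names `resource_tree` and `output_location` are hypothetical. The elided middle of the path (`.../`) is deliberately left out rather than guessed.

```python
def resource_tree(os_name, cuda=False, vulkan=False, opencl=False):
    """Pick the resource tree the way the documented if/elseif chain does."""
    on_windows = os_name == "Windows"
    if cuda:
        return "resources_windows_cuda" if on_windows else "resources_linux_cuda"
    if vulkan:
        # New in this commit: the non-Windows branch routes to the Linux Vulkan tree.
        return "resources_windows_vulkan" if on_windows else "resources_linux_vulkan"
    if opencl:
        return "resources_windows_opencl" if on_windows else "resources_android_opencl"
    # Default CPU build: the canonical tree shared by every OS/arch.
    return "resources"


def output_location(os_name, os_arch, **backend):
    """Return (tree, OS/arch subdirectory) for one native build."""
    return resource_tree(os_name, **backend), f"{os_name}/{os_arch}"


if __name__ == "__main__":
    print(output_location("Linux", "aarch64", vulkan=True))
    print(output_location("Windows", "aarch64"))
```

Checking CUDA first, then Vulkan, then OpenCL mirrors the order of the documented chain, so a build that sets several backend flags resolves the same way in the model.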
