Commit 48f0863

ci(windows): ship Ninja build as ninja-windows classifier alongside permanent MSVC
Builds on the prior Ninja evaluation jobs. Per owner decision, the MSVC / Visual Studio Windows build is the default JAR and is kept permanently (the sccache cache loss on it is accepted); the Ninja Multi-Config build is shipped ALONGSIDE it as a separate classifier JAR, never as a replacement. The two generators produce different jllama.dll files, so they cannot share a resource path in one JAR, hence the classifier (mirrors the cuda / opencl-android pattern). Result: 4 permanent Windows build jobs, both generators distributed and tested end-to-end.

- pom.xml: add `windows-ninja` profile producing a <classifier>ninja-windows</classifier> JAR from ${outputDirectory}_windows_ninja (separate compile pass + resource copy + classified jar; mirrors cuda / opencl-android).
- publish.yml: the package, publish-snapshot, and publish-release jobs download Windows-{x86_64,x86}-ninja into src/main/resources_windows_ninja/ and activate the `windows-ninja` profile (-P ...,windows-ninja). Add a test-java-windows-x86_64-ninja job that loads the Ninja DLL via JNI and runs the full model-backed suite (parity with test-java-windows-x86_64). Wire the Ninja build + Java-test jobs into the package `needs:` graph.
- .gitignore: ignore src/main/resources_windows_ninja/ (CI-staged, never committed).
- README.md: add the `ninja-windows` classifier row + dependency snippet.
- CLAUDE.md: add "Windows Ninja artifact" section; refresh the sccache "Windows" note (no CMakeLists change: routing is CI-download + pom-profile, not a GGML flag).
- TODO.md: rewrite the Windows section to the final dual-build design (MSVC kept forever; remaining work is cache-hit verification, not a redesign).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01SfvSZ76NW4e1qX1PjL4RKq
1 parent e113ed3 commit 48f0863

6 files changed

Lines changed: 329 additions & 78 deletions

.github/workflows/publish.yml

Lines changed: 127 additions & 3 deletions
@@ -1191,6 +1191,99 @@ jobs:
 ${{ github.workspace }}/src/main/resources/net/ladenthin/llama/**/*
 if-no-files-found: warn

+# Java/inference validation of the Ninja-built x86_64 DLL (the analogue of
+# test-java-windows-x86_64 for the MSVC build). Loads the Ninja jllama.dll via
+# JNI and runs the full model-backed suite, so both Windows generators are
+# validated end-to-end before the `ninja-windows` classifier JAR ships.
+test-java-windows-x86_64-ninja:
+name: Java Tests Windows 2025 x86_64 (Ninja, eval)
+needs: build-windows-x86_64-ninja
+runs-on: windows-2025-vs2026
+steps:
+- uses: actions/checkout@v7
+- name: Display CPU Info
+shell: pwsh
+run: |
+Write-Host "=== CPU Information (Get-CimInstance - All Properties) ==="
+Get-CimInstance Win32_Processor | Select-Object * | Format-List
+Write-Host ""
+Write-Host "=== CPU Information (systeminfo) ==="
+systeminfo | Select-String "Processor"
+Write-Host ""
+Write-Host "=== CPU Information (Get-ComputerInfo) ==="
+Get-ComputerInfo -Property "CsProcessors*" 2>$null || Write-Host "Get-ComputerInfo not available"
+- uses: actions/download-artifact@v8
+with:
+name: Windows-x86_64-ninja
+path: ${{ github.workspace }}/src/main/resources/net/ladenthin/llama/
+- name: Download text generation model
+run: curl -L --proto =https --proto-redir =https --fail --retry 5 --retry-all-errors $env:MODEL_URL --create-dirs -o models/$env:MODEL_NAME
+- name: Download reranking model
+run: curl -L --proto =https --proto-redir =https --fail --retry 5 --retry-all-errors $env:RERANKING_MODEL_URL --create-dirs -o models/$env:RERANKING_MODEL_NAME
+- name: Download draft model
+run: curl -L --proto =https --proto-redir =https --fail --retry 5 --retry-all-errors $env:DRAFT_MODEL_URL --create-dirs -o models/$env:DRAFT_MODEL_NAME
+- name: Download reasoning model
+run: curl -L --proto =https --proto-redir =https --fail --retry 5 --retry-all-errors $env:REASONING_MODEL_URL --create-dirs -o models/$env:REASONING_MODEL_NAME
+- name: Download tool-calling model
+run: curl -L --proto =https --proto-redir =https --fail --retry 5 --retry-all-errors $env:TOOL_MODEL_URL --create-dirs -o models/$env:TOOL_MODEL_NAME
+- name: Download vision model (issues #103 / #34)
+run: curl -L --proto =https --proto-redir =https --fail --retry 5 --retry-all-errors $env:VISION_MODEL_URL --create-dirs -o models/$env:VISION_MODEL_NAME
+- name: Download vision mmproj
+run: curl -L --proto =https --proto-redir =https --fail --retry 5 --retry-all-errors $env:VISION_MMPROJ_URL --create-dirs -o models/$env:VISION_MMPROJ_NAME
+- name: List files in models directory
+run: ls -l models/
+- name: Validate model files
+run: .github\validate-models.bat
+- uses: actions/setup-java@v5
+with:
+distribution: 'temurin'
+java-version: ${{ env.JAVA_VERSION }}
+- name: Memory before tests
+run: Get-CimInstance Win32_OperatingSystem | Select-Object FreePhysicalMemory,TotalVisibleMemorySize | Format-List
+shell: pwsh
+- name: Enable WER LocalDumps for java.exe
+# Windows Error Reporting writes minidumps when java.exe (or any other
+# registered process) crashes via __fastfail / abort / unhandled SEH.
+# We use it as the Windows analogue of Linux core dumps so that a JVM
+# crash inside the JNI layer leaves us a real native callstack instead
+# of just surefire's "VM terminated without saying goodbye" line.
+# DumpType=2 == MiniDumpWithFullMemory; the workspace dumps/ folder is
+# globbed by the failure-upload step below.
+shell: pwsh
+run: |
+$key = 'HKLM:\SOFTWARE\Microsoft\Windows\Windows Error Reporting\LocalDumps\java.exe'
+New-Item -Path $key -Force | Out-Null
+New-Item -Path "${{ github.workspace }}\dumps" -ItemType Directory -Force | Out-Null
+New-ItemProperty -Path $key -Name 'DumpFolder' -Value "${{ github.workspace }}\dumps" -PropertyType ExpandString -Force | Out-Null
+New-ItemProperty -Path $key -Name 'DumpType' -Value 2 -PropertyType DWord -Force | Out-Null
+New-ItemProperty -Path $key -Name 'DumpCount' -Value 5 -PropertyType DWord -Force | Out-Null
+Get-ItemProperty -Path $key | Format-List
+- name: Run tests
+run: |
+mvn -e --no-transfer-progress test `
+"-Dnet.ladenthin.llama.tool.model=models/$env:TOOL_MODEL_NAME" `
+"-Dnet.ladenthin.llama.vision.model=models/$env:VISION_MODEL_NAME" `
+"-Dnet.ladenthin.llama.vision.mmproj=models/$env:VISION_MMPROJ_NAME" `
+"-Dnet.ladenthin.llama.vision.image=$env:VISION_IMAGE_PATH"
+- name: Memory after tests
+if: always()
+run: Get-CimInstance Win32_OperatingSystem | Select-Object FreePhysicalMemory,TotalVisibleMemorySize | Format-List
+shell: pwsh
+- if: failure()
+uses: actions/upload-artifact@v7
+with:
+name: windows-output-ninja
+path: |
+${{ github.workspace }}\hs_err_pid*.log
+${{ github.workspace }}\*.hprof
+${{ github.workspace }}\dumps\*.dmp
+${{ github.workspace }}\target\surefire-reports\*.dump
+${{ github.workspace }}\target\surefire-reports\*.dumpstream
+${{ github.workspace }}\target\surefire-reports\*.txt
+${{ github.workspace }}\target\surefire-reports\TEST-*.xml
+${{ github.workspace }}/src/main/resources/net/ladenthin/llama/**/*
+if-no-files-found: warn
+
 # ---------------------------------------------------------------------------
 # Package and publish
 # ---------------------------------------------------------------------------
@@ -1203,13 +1296,16 @@ jobs:
 - crosscompile-android-aarch64
 - crosscompile-android-aarch64-opencl
 - build-windows-x86
+- build-windows-x86_64-ninja
+- build-windows-x86-ninja
 - test-cpp-linux-x86_64
 - build-macos-arm64-metal-15
 - test-java-linux-x86_64
 - test-java-macos-arm64-metal
 - test-java-macos-arm64-no-metal
 - test-java-macos-arm64-metal-15
 - test-java-windows-x86_64
+- test-java-windows-x86_64-ninja
 runs-on: ubuntu-latest
 steps:
 - uses: actions/checkout@v7
@@ -1226,6 +1322,17 @@ jobs:
 with:
 name: android-libraries-opencl
 path: ${{ github.workspace }}/src/main/resources_android_opencl/net/ladenthin/llama/
+# Ninja-built Windows natives -> separate tree consumed by the `windows-ninja`
+# Maven profile (the `ninja-windows` classifier JAR). The default JAR keeps the
+# MSVC `*-libraries` natives downloaded above.
+- uses: actions/download-artifact@v8
+with:
+name: Windows-x86_64-ninja
+path: ${{ github.workspace }}/src/main/resources_windows_ninja/net/ladenthin/llama/
+- uses: actions/download-artifact@v8
+with:
+name: Windows-x86-ninja
+path: ${{ github.workspace }}/src/main/resources_windows_ninja/net/ladenthin/llama/
 - uses: actions/setup-java@v5
 with:
 distribution: 'temurin'
@@ -1236,7 +1343,8 @@ jobs:
 # default-platform native libs in one drop-on-classpath JAR, runnable via its
 # OpenAiCompatServer Main-Class). It lands in target/ and is uploaded in the `llama-jars`
 # artifact below - a CI run artifact only, not a Maven Central / GitHub-Release asset.
-run: mvn --batch-mode --no-transfer-progress -P release,cuda,opencl-android,assembly -Dmaven.test.skip=true -Dgpg.skip=true package
+# `windows-ninja` attaches the `ninja-windows` classifier JAR (Ninja-built Windows natives).
+run: mvn --batch-mode --no-transfer-progress -P release,cuda,opencl-android,windows-ninja,assembly -Dmaven.test.skip=true -Dgpg.skip=true package
 - name: Upload JARs
 uses: actions/upload-artifact@v7
 with:
@@ -1314,6 +1422,14 @@ jobs:
 with:
 name: android-libraries-opencl
 path: ${{ github.workspace }}/src/main/resources_android_opencl/net/ladenthin/llama/
+- uses: actions/download-artifact@v8
+with:
+name: Windows-x86_64-ninja
+path: ${{ github.workspace }}/src/main/resources_windows_ninja/net/ladenthin/llama/
+- uses: actions/download-artifact@v8
+with:
+name: Windows-x86-ninja
+path: ${{ github.workspace }}/src/main/resources_windows_ninja/net/ladenthin/llama/
 - name: Set up Maven Central Repository
 uses: actions/setup-java@v5
 with:
@@ -1334,7 +1450,7 @@ jobs:
 *) echo "::error::Refusing to publish non-SNAPSHOT version '$VERSION' from the snapshot job. Snapshot publishing requires a -SNAPSHOT version; releases go through the v* tag path."; exit 1 ;;
 esac
 - name: Publish snapshot
-run: mvn --batch-mode --no-transfer-progress -P release,cuda,opencl-android -Dmaven.test.skip=true deploy
+run: mvn --batch-mode --no-transfer-progress -P release,cuda,opencl-android,windows-ninja -Dmaven.test.skip=true deploy
 env:
 MAVEN_USERNAME: ${{ secrets.CENTRAL_USERNAME }}
 MAVEN_PASSWORD: ${{ secrets.CENTRAL_TOKEN }}
@@ -1398,6 +1514,14 @@ jobs:
 with:
 name: android-libraries-opencl
 path: ${{ github.workspace }}/src/main/resources_android_opencl/net/ladenthin/llama/
+- uses: actions/download-artifact@v8
+with:
+name: Windows-x86_64-ninja
+path: ${{ github.workspace }}/src/main/resources_windows_ninja/net/ladenthin/llama/
+- uses: actions/download-artifact@v8
+with:
+name: Windows-x86-ninja
+path: ${{ github.workspace }}/src/main/resources_windows_ninja/net/ladenthin/llama/
 - name: Set up Maven Central Repository
 uses: actions/setup-java@v5
 with:
@@ -1409,7 +1533,7 @@ jobs:
 gpg-private-key: ${{ secrets.GPG_PRIVATE_KEY }}
 gpg-passphrase: MAVEN_GPG_PASSPHRASE
 - name: Publish release
-run: mvn --batch-mode --no-transfer-progress -P release,cuda,opencl-android -Dmaven.test.skip=true deploy
+run: mvn --batch-mode --no-transfer-progress -P release,cuda,opencl-android,windows-ninja -Dmaven.test.skip=true deploy
 env:
 MAVEN_USERNAME: ${{ secrets.CENTRAL_USERNAME }}
 MAVEN_PASSWORD: ${{ secrets.CENTRAL_TOKEN }}
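
The `package`, `publish-snapshot`, and `publish-release` jobs above each stage both Ninja artifacts into `src/main/resources_windows_ninja/` before Maven runs. A guard step could confirm that the tree is complete before the `windows-ninja` profile packages it. The following is a minimal sketch under assumptions, not part of this commit: `check_ninja_tree` is a hypothetical helper, and it assumes each artifact unpacks to a `Windows/<arch>/jllama.dll` path somewhere under the staging root.

```shell
# Hypothetical pre-package check; not part of the repository's workflow.
# Succeeds only if <root> contains a Windows/<arch>/jllama.dll file for every
# architecture listed after the root argument.
check_ninja_tree() {
  local root="$1"
  shift
  local arch
  for arch in "$@"; do
    # find prints nothing when the DLL is missing or the root does not exist.
    if [ -z "$(find "$root" -type f -path "*/Windows/$arch/jllama.dll" 2>/dev/null)" ]; then
      echo "missing Ninja native for $arch under $root" >&2
      return 1
    fi
  done
  return 0
}
```

A step placed before the Maven invocation could then run `check_ninja_tree src/main/resources_windows_ninja x86_64 x86 || exit 1`, so that a missing download fails the job instead of producing an incomplete classifier JAR.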

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -40,6 +40,7 @@ replay_pid*
 models/*.gguf
 src/main/cpp/net_ladenthin_llama_*.h
 src/main/resources_cuda_linux/
+src/main/resources_windows_ninja/
 src/main/resources/**/*.so
 src/main/resources/**/*.dylib
 src/main/resources/**/*.dll

CLAUDE.md

Lines changed: 50 additions & 3 deletions
@@ -160,6 +160,51 @@ At runtime the device must provide its own OpenCL ICD (`libOpenCL.so`);
 Qualcomm Adreno drivers do. Devices without an ICD should use the default
 CPU-only Android JAR.

+## Windows Ninja artifact (sccache-cached, parallel to the MSVC build)
+
+The Visual Studio generator ignores `CMAKE_{C,CXX}_COMPILER_LAUNCHER`, so the two MSVC Windows
+jobs (`build-windows-x86_64`, `build-windows-x86`) **cannot** use the sccache/Depot cache. Rather
+than switch the trusted MSVC build, the repo builds the **same CPU natives a second time** with the
+**`Ninja Multi-Config`** generator (which *does* honor the launcher) and ships them as a separate
+**`ninja-windows`** Maven classifier JAR. **The MSVC build is the default JAR and is kept
+permanently**; the Ninja artifact is an additional, cache-accelerated, independently
+end-to-end-tested option, not a replacement. (Upstream llama.cpp ships its `windows-cuda` artifact
+with Ninja Multi-Config + MSVC, proving the combination works on the same tree.)
+
+Unlike the CUDA / OpenCL classifiers, which differ by a **GGML backend flag** and route their
+output in `CMakeLists.txt`, the Ninja Windows build differs only by **generator/toolchain**, so
+there is **no `CMakeLists.txt` change**: both generators emit to the canonical
+`src/main/resources/.../Windows/{x86_64,x86}/`. Routing to the classifier tree happens purely at the
+CI-download + pom-profile level. Four places wire it together:
+
+1. **`.github/build.bat`**: sccache probe guard mirroring `build.sh`'s `sccache_can_wrap_compiler()`:
+when `USE_CACHE=true` and `sccache` is on PATH, it compiles a trivial TU through `sccache cl.exe`;
+only on success does it pass `-DCMAKE_{C,CXX}_COMPILER_LAUNCHER=sccache` and print
+`sccache --show-stats`. A missing/crashing sccache falls back to a green uncached build. The MSVC
+jobs do not set `USE_CACHE`, so the guard is inert for them.
+2. **`.github/workflows/publish.yml`**: build jobs `build-windows-x86_64-ninja` /
+`build-windows-x86-ninja` (`windows-2025-vs2026`, `ilammy/msvc-dev-cmd@v1` for the arch env,
+sccache v0.16.0 from the GitHub release **zip** + Depot WebDAV, `build.bat -G "Ninja Multi-Config"`),
+uploading artifacts `Windows-{x86_64,x86}-ninja` (**not** `*-libraries`, so the `package` job's
+`pattern: "*-libraries"` ignores them). `test-java-windows-x86_64-ninja` loads the Ninja DLL via
+JNI and runs the full model-backed suite. The `package`, `publish-snapshot`, and `publish-release`
+jobs download `Windows-*-ninja` into `src/main/resources_windows_ninja/` and activate the
+`windows-ninja` Maven profile.
+3. **`pom.xml`**: the `windows-ninja` profile produces a second JAR with `<classifier>ninja-windows</classifier>`
+from the `${project.build.outputDirectory}_windows_ninja` tree (separate compile pass + resource
+copy + classified jar; mirrors the `cuda` / `opencl-android` profiles). Activated only in CI.
+4. **`README.md`**: the `ninja-windows` row + dependency snippet in "Choosing the right classifier".
+
+`src/main/resources_windows_ninja/` is git-ignored (staged by CI, never committed; same policy as
+the native libs and the CUDA/OpenCL trees).
+
+**Local sanity build** (needs MSVC + a Ninja on PATH; sccache optional):
+```bat
+mvn -q compile
+.github\build.bat -G "Ninja Multi-Config" -DOS_NAME=Windows -DOS_ARCH=x86_64 -DBUILD_TESTING=ON
+ctest --test-dir build --output-on-failure
+```
+
 ## WebUI (llama.cpp Svelte UI) embedding

 The llama.cpp WebUI is **built once in CI and shared to every native build**, then
@@ -271,9 +316,11 @@ Per-job recipe: add `env:` { `USE_CACHE`, `SCCACHE_WEBDAV_ENDPOINT`, `SCCACHE_WE
 `DOCKCROSS_ARGS: "-e SCCACHE_WEBDAV_ENDPOINT -e SCCACHE_WEBDAV_TOKEN -e USE_CACHE"`: the
 dockcross wrapper only forwards host env it is explicitly told to via `-e`. The fetched sccache
 version is the `SCCACHE_DL_VERSION` knob in `build.sh` (default **0.16.0**; overridable per-job
-to try a different build against a container that crashed another). **Windows** (`build.bat` +
-MSVC) is separate and last: use `mozilla-actions/sccache-action` / sccache's MSVC support, not
-the `build.sh` musl fetch.
+to try a different build against a container that crashed another). **Windows** is handled
+separately (the Visual Studio generator ignores `CMAKE_*_COMPILER_LAUNCHER`): see
+"Windows Ninja artifact" above. The cached path uses the **Ninja Multi-Config** generator with a
+`build.bat` sccache probe and a direct sccache zip download (not `mozilla-actions/sccache-action`),
+shipped as a parallel `ninja-windows` classifier JAR while the MSVC default stays the trusted build.

 **Cross-repo scope.** This Depot/sccache compiler cache makes sense only for java-llama.cpp:
 it is the only sibling repo with a native (C++/JNI) build. It does not apply to the pure-Maven
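
The probe guard described in the "Windows Ninja artifact" section (compile a trivial translation unit through the cache wrapper, and pass the launcher flags only if that succeeds) can be sketched in POSIX-style shell as follows. This is an illustration under assumptions, not the contents of `build.sh` or `build.bat`: the signature given to `sccache_can_wrap_compiler` here, the helper `launcher_args`, and the probe file name are hypothetical; only the function name, `USE_CACHE`, and the two `CMAKE_*_COMPILER_LAUNCHER` variables come from the text above.

```shell
# Hypothetical sketch of an sccache probe guard (not the repository's script).

# Returns 0 only if wrapper "$1" can drive compiler "$2" over a trivial C file.
sccache_can_wrap_compiler() {
  local wrapper="$1"
  local compiler="$2"
  local tmp
  local status
  # Both tools must be resolvable before the probe is attempted.
  command -v "$wrapper" >/dev/null 2>&1 || return 1
  command -v "$compiler" >/dev/null 2>&1 || return 1
  tmp="$(mktemp -d)" || return 1
  printf 'int main(void) { return 0; }\n' > "$tmp/probe.c"
  "$wrapper" "$compiler" -c "$tmp/probe.c" -o "$tmp/probe.o" >/dev/null 2>&1
  status=$?
  rm -rf "$tmp"
  return "$status"
}

# Prints the extra CMake arguments: the launcher flags when caching is enabled
# and the probe passes, nothing otherwise (the build then runs uncached).
launcher_args() {
  local wrapper="$1"
  local compiler="$2"
  if [ "${USE_CACHE:-false}" = "true" ] && sccache_can_wrap_compiler "$wrapper" "$compiler"; then
    printf '%s\n' "-DCMAKE_C_COMPILER_LAUNCHER=$wrapper -DCMAKE_CXX_COMPILER_LAUNCHER=$wrapper"
  fi
}
```

A caller would append the output of `launcher_args sccache cc` to its `cmake` command line. Because a failed probe yields an empty string instead of an error, a missing or crashing wrapper degrades to an uncached build, which matches the fallback behavior the section describes.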

README.md

Lines changed: 13 additions & 3 deletions
@@ -157,14 +157,16 @@ If any of these match your platform, you can include the Maven dependency and ge
 ### Choosing the right classifier

 The Maven coordinate `net.ladenthin:llama` publishes one default JAR (CPU-only)
-plus two optional GPU/accelerator JARs selected via a Maven `<classifier>`.
-Pick at most one - they are mutually exclusive.
+plus optional JARs selected via a Maven `<classifier>`: two GPU/accelerator
+builds and one alternate-toolchain Windows build. Pick at most one GPU/accelerator
+classifier (those are mutually exclusive) and optionally the Windows build.

 | Classifier | Backend | Target platform | Runtime requirement |
 |---|---|---|---|
-| _(none)_ | CPU | Linux x86-64 / aarch64, macOS x86-64 / aarch64, Windows x86-64, Android aarch64 (CPU) | None beyond a JDK 8+ JVM |
+| _(none)_ | CPU | Linux x86-64 / aarch64, macOS x86-64 / aarch64, Windows x86-64 (MSVC / Visual Studio generator), Android aarch64 (CPU) | None beyond a JDK 8+ JVM |
 | `cuda13-linux-x86-64` | CUDA 13 | Linux x86-64 with NVIDIA GPU | NVIDIA driver + CUDA 13 runtime libraries (`libcudart.so.13`, `libcublas.so.13`) installed on the host. The shared library is dynamically linked against them and will fail to `dlopen` if they are absent; there is no automatic fallback to CPU. |
 | `opencl-android-aarch64` | OpenCL (Adreno) | Android aarch64 with Qualcomm Adreno GPU | A device-supplied OpenCL ICD (`libOpenCL.so`). Devices without an ICD (e.g. most non-Snapdragon Android hardware) must use the default CPU JAR. |
+| `ninja-windows` | CPU (Ninja Multi-Config + MSVC) | Windows x86-64 and x86 | None beyond a JDK 8+ JVM. Same CPU backend as the default JAR's Windows natives, but compiled with the `Ninja Multi-Config` generator (sccache-cached in CI) instead of the Visual Studio generator. Provided so both Windows builds are available; functionally equivalent for normal use. |

 ```xml
 <!-- CPU (default) -->
@@ -189,6 +191,14 @@ Pick at most one — they are mutually exclusive.
 <version>5.0.2</version>
 <classifier>opencl-android-aarch64</classifier>
 </dependency>
+
+<!-- Windows natives built with the Ninja Multi-Config generator (CPU) -->
+<dependency>
+<groupId>net.ladenthin</groupId>
+<artifactId>llama</artifactId>
+<version>5.0.2</version>
+<classifier>ninja-windows</classifier>
+</dependency>
 ```

 > [!IMPORTANT]
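
A classified dependency resolves to its own file next to the default JAR. Under Maven's standard repository layout the file name is `<artifactId>-<version>-<classifier>.jar`, so the coordinates in the snippet above can be mapped to a file name when checking which Windows build actually landed on a classpath. The sketch below only computes that name; `jar_file_name` is a hypothetical helper, and it makes no claim about which versions are published with this classifier.

```shell
# Hypothetical helper: derive the JAR file name Maven uses for a coordinate.
# Usage: jar_file_name <artifactId> <version> [classifier]
jar_file_name() {
  local artifact="$1"
  local version="$2"
  local classifier="${3:-}"
  if [ -n "$classifier" ]; then
    # Classified artifacts append "-<classifier>" before the extension.
    printf '%s-%s-%s.jar\n' "$artifact" "$version" "$classifier"
  else
    printf '%s-%s.jar\n' "$artifact" "$version"
  fi
}
```

For example, `jar_file_name llama 5.0.2 ninja-windows` yields the name to look for in a local repository or an assembled `lib/` directory, while omitting the third argument yields the default (MSVC-built) JAR name.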
