Skip to content

Add Windows Ninja Multi-Config build with sccache caching#248

Merged
bernardladenthin merged 15 commits into
mainfrom
claude/exciting-fermi-ltkrku
Jun 20, 2026
Merged

Add Windows Ninja Multi-Config build with sccache caching#248
bernardladenthin merged 15 commits into
mainfrom
claude/exciting-fermi-ltkrku

Conversation

@bernardladenthin

Copy link
Copy Markdown
Owner

Summary

  • Ship Windows natives built with Ninja Multi-Config + sccache/Depot as a parallel ninja-windows classifier JAR, alongside the existing MSVC/Visual Studio build (which remains the default).
  • Add two new CI jobs (build-windows-x86_64-ninja, build-windows-x86-ninja) that build with Ninja, cache via sccache + Depot WebDAV, and run C++ unit tests.
  • Add test-java-windows-x86_64-ninja to validate the Ninja-built DLL end-to-end via JNI.
  • Implement sccache probe guard in .github/build.bat (mirrors build.sh) to safely enable the compiler launcher only when sccache is present and working.
  • Add windows-ninja Maven profile to produce the classifier JAR from a separate resource tree (src/main/resources_windows_ninja/).
  • Update package, publish-snapshot, and publish-release jobs to download and activate the Ninja build.

Rationale

The Visual Studio generator ignores CMAKE_{C,CXX}_COMPILER_LAUNCHER, so the MSVC Windows jobs cannot use sccache caching. Rather than switch the trusted MSVC build, this change builds the same CPU natives a second time with Ninja Multi-Config (which does honor the launcher) and ships them as an optional classifier JAR. The MSVC build is kept as the permanent default; the Ninja artifact is an additional, cache-accelerated, independently validated option for users to compare/adopt. Upstream llama.cpp ships its windows-cuda artifact with Ninja Multi-Config + MSVC, proving the combination works on the same tree.

Test plan

  • CI is green on this branch (all four Windows build jobs pass, both Java test jobs pass)
  • Existing MSVC Windows builds remain unchanged and unaffected
  • New Ninja jobs run C++ unit tests (ctest) and Java model-backed tests
  • Docs updated: README.md classifier table, CLAUDE.md "Windows Ninja artifact" section, TODO.md status

Related issues / PRs

Closes the Windows compiler cache task in TODO.md (deferred → shipped as dual-artifact design).

Checklist

  • I have read CONTRIBUTING.md and CODE_OF_CONDUCT.md
  • My commits follow Conventional Commits
  • No security-sensitive changes

https://claude.ai/code/session_01SfvSZ76NW4e1qX1PjL4RKq

claude added 2 commits June 20, 2026 16:56
…dows

The two Windows native build jobs are the only uncached native builds: the
Visual Studio generator ignores CMAKE_{C,CXX}_COMPILER_LAUNCHER, so sccache
cannot be wired into them. Upstream llama.cpp ships its windows-cuda artifact
with Ninja Multi-Config + MSVC, proving the combination works on the same tree.

- build.bat: add an sccache probe guard mirroring build.sh's
  sccache_can_wrap_compiler() — when USE_CACHE=true and sccache is on PATH,
  compile a trivial TU through `sccache cl.exe`; only on success pass
  -DCMAKE_{C,CXX}_COMPILER_LAUNCHER=sccache and print `sccache --show-stats`.
  A missing/crashing sccache falls back to a green uncached build. Inert for
  the VS generator jobs (they do not set USE_CACHE).
- publish.yml: add two evaluation-only jobs, build-windows-x86_64-ninja and
  build-windows-x86-ninja (windows-2025-vs2026, ilammy/msvc-dev-cmd@v1,
  sccache v0.16.0, Depot WebDAV env, `build.bat -G "Ninja Multi-Config"`).
  Artifacts are named Windows-*-ninja (NOT *-libraries) so the package job's
  `pattern: "*-libraries"` does not consume them; package's `needs:` is
  unchanged. They run in parallel with the trusted VS jobs.
- TODO.md: flip the Windows cache section from "implementation notes" to
  "evaluation in progress"; document what remains (confirm cache hits, then
  rename artifacts to *-libraries, wire into package `needs`, retire VS jobs).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01SfvSZ76NW4e1qX1PjL4RKq
…ermanent MSVC

Builds on the prior Ninja evaluation jobs. Per owner decision, the MSVC /
Visual Studio Windows build is the default JAR and is kept permanently (the
sccache cache loss on it is accepted); the Ninja Multi-Config build is shipped
ALONGSIDE it as a separate classifier JAR, never as a replacement. The two
generators produce different jllama.dll files, so they cannot share a resource
path in one JAR — hence the classifier (mirrors the cuda / opencl-android
pattern). Result: 4 permanent Windows build jobs, both generators distributed
and tested end-to-end.

- pom.xml: add `windows-ninja` profile producing a <classifier>ninja-windows</classifier>
  JAR from ${outputDirectory}_windows_ninja (separate compile pass + resource
  copy + classified jar; mirrors cuda / opencl-android).
- publish.yml: the package, publish-snapshot, and publish-release jobs download
  Windows-{x86_64,x86}-ninja into src/main/resources_windows_ninja/ and activate
  the `windows-ninja` profile (-P ...,windows-ninja). Add a
  test-java-windows-x86_64-ninja job that loads the Ninja DLL via JNI and runs
  the full model-backed suite (parity with test-java-windows-x86_64). Wire the
  Ninja build + Java-test jobs into the package `needs:` graph.
- .gitignore: ignore src/main/resources_windows_ninja/ (CI-staged, never committed).
- README.md: add the `ninja-windows` classifier row + dependency snippet.
- CLAUDE.md: add "Windows Ninja artifact" section; refresh the sccache "Windows"
  note (no CMakeLists change — routing is CI-download + pom-profile, not a GGML flag).
- TODO.md: rewrite the Windows section to the final dual-build design (MSVC kept
  forever; remaining work is cache-hit verification, not a redesign).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01SfvSZ76NW4e1qX1PjL4RKq
…fix)

The b9739 upgrade (#247) added tools/server/server-schema.cpp to the jllama
library target but not to the jllama_test executable target. server-schema.cpp
defines server_schema::eval_llama_cmpl_schema(), which test_server.cpp and the
upstream server-context.cpp reference, so the test link failed with an
undefined/unresolved external symbol on every platform that builds the tests
(Linux x86_64, macOS, Windows MSVC, Windows Ninja). The shared library itself
links fine — only jllama_test was missing the TU. Add server-schema.cpp to the
jllama_test source list, mirroring its presence in the library target.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01SfvSZ76NW4e1qX1PjL4RKq
…behavior

b9739's server-schema validation changed two behaviors that the existing tests
encoded from an earlier llama.cpp:
- negative n_discard is now range-checked (0 <= value <= INT32_MAX) and throws,
  instead of being silently clamped to 0; and
- every field-validation failure is wrapped in std::invalid_argument
  ("Field '<name>': ...", server-schema.cpp) rather than surfacing the inner
  std::runtime_error.

Update the four affected assertions (NDiscard negative now expects a throw;
dry_sequence_breakers / lora / repeat_last_n expect std::invalid_argument) and
the descriptive comment block. With the prior server-schema.cpp test-link fix,
the full C++ suite passes locally (446/446).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01SfvSZ76NW4e1qX1PjL4RKq
…neW) regression

Root-cause write-up only (fix deferred per discussion). On Windows x86_64,
b9739's common_params_parse has a _WIN32 prologue that replaces the
caller-supplied argv with GetCommandLineW() (the host process command line).
Correct for the standalone llama-server.exe, but under JNI the process is
java.exe (no --model on its command line), so the parse fails with
"Failed to parse model parameters" for every model-loading Java test. Affects
both the MSVC and Ninja DLLs; Linux/macOS unaffected. Documents symptom,
exact upstream code (arg.cpp:924-931), why our argv is platform-neutral, and
three fix options to choose from later.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01SfvSZ76NW4e1qX1PjL4RKq
…cross LTS)

The dockcross linux-arm64-lts image ships GCC 8.5 / glibc 2.17, which can no
longer compile llama.cpp b9739 (its C++17 CTAD-in-`new` needs GCC >= 12). Mirror
upstream llama.cpp's own aarch64 release job: build natively on GitHub's free
ubuntu-24.04-arm runner with GCC 14.

- publish.yml: rewrite crosscompile-linux-aarch64 as a native build (setup-java ->
  mvn compile -> build.sh) with -DGGML_NATIVE=OFF for ARMv8 portability, and run
  ctest on real ARM hardware for the first time. Job id kept (downstream needs:);
  display name -> "Build and Test Linux aarch64".
- build.sh: generalize the Linux sccache auto-fetch from x86_64-only to map
  uname -m -> x86_64/aarch64 static-musl release (x86_64 path unchanged).
- CLAUDE.md: document the native build and the glibc floor change (2.17 -> ~2.39,
  matching upstream's ARM binaries).

Trade-off: the aarch64 artifact's glibc floor rises to ~2.39 (same envelope
upstream's own ARM binaries require); x86_64 stays at glibc 2.17.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01SfvSZ76NW4e1qX1PjL4RKq
The default CPU JAR's Linux aarch64 native is now built natively on
ubuntu-24.04-arm (mirroring upstream), so it requires glibc >= 2.39. Document
this in the "Choosing the right classifier" runtime-requirement column so users
on older-glibc ARM hosts know to expect a load failure. Linux x86-64 stays at
glibc 2.17 (manylinux2014).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01SfvSZ76NW4e1qX1PjL4RKq
…#24779)

Introduce a generic source-patch mechanism for the FetchContent'd llama.cpp tree
so fixes apply to every C++ build (all CI jobs + local) from one place:

- patches/            : drop *.patch / *.diff here (applied in filename order)
- cmake/apply-llama-patches.cmake : cross-platform (cmake -P), idempotent
  (git apply --reverse --check skips already-applied), fail-loud (a stale patch
  aborts configure so it can't be silently dropped from a release build)
- CMakeLists.txt      : wired as the llama.cpp FetchContent PATCH_COMMAND

First patch, 0001-win32-arg-parse-embed-guard.patch, fixes the b9739 Windows JNI
regression from upstream #24779: common_params_parse unconditionally replaced the
caller's argv with the process command line (GetCommandLineW), so an embedded
java.exe lost its --model args -> "Failed to parse model parameters". The guard
adopts the process command line only when the re-derived arg count equals argc:
true for the standalone llama-* tools (UTF-8 CLI fix preserved), false for a JVM
host (our already-UTF-8 argv kept). Verified the patch applies cleanly to b9739
and the applier is idempotent.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01SfvSZ76NW4e1qX1PjL4RKq
…make

The REUSE license-compliance check (fsfe/reuse-action) failed on the new files:
- cmake/apply-llama-patches.cmake had the license but no copyright -> add inline
  SPDX-FileCopyrightText (mirrors build.sh).
- patches/*.patch is unified-diff format and cannot carry an inline header without
  corrupting the diff -> annotate via REUSE.toml with a patches/** glob, so every
  current and future patch in the directory is covered automatically.

Verified locally: `reuse lint` -> compliant with REUSE 3.3.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01SfvSZ76NW4e1qX1PjL4RKq
The count-guard variant of 0001-win32-arg-parse-embed-guard.patch fixed 21/25
Windows Java tests but collided on the 4 server-integration setups (Rerank,
ToolCalling, Multimodal, OpenAiCompatServer) whose argv length happened to equal
java.exe's process arg count — the guard then wrongly adopted java.exe's command
line and they kept failing with "Failed to parse model parameters". Those tests
pass on Linux/macOS, so their args are valid; the collision was the only cause.

Switch to the deterministic fix: keep the make_utf8_argv() call referenced (no
-Wunused-function) but never adopt its result, so the caller's already-UTF-8 JNI
argv is always used. A JNI library is never the process, so the GetCommandLineW
override is pure liability for us. Verified the patch applies cleanly to b9739
and the applier stays idempotent. CLAUDE.md + TODO.md updated to record the
count-guard -> removal change; upstream PR can still expose an opt-out that
preserves the standalone tools' UTF-8 fix.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01SfvSZ76NW4e1qX1PjL4RKq
… Rating)

SonarCloud's GitHub Actions analyzer flagged these (they drive the "Security
Rating on New Code = C" gate — the "C" is the rating grade, not the C language;
the scan is the Maven scanner over YAML/Java, not CFamily):

- clang-format.yml: pip install without --only-binary :all: can execute a source
  package's setup.py at install time. Force wheels-only (clang-format ships
  manylinux wheels). This is the only finding in the new-code window, so it is
  what fails the gate. Uses a block scalar so ":all:" doesn't break YAML.
- osv-scanner.yml / scorecard.yml: tighten top-level "permissions: read-all" to
  "contents: read". Safe — every job in both files already declares its own
  exact permissions (security-events: write, id-token: write, etc.), which fully
  override the workflow default.

Not touched: publish.yml's workflow-level "contents: read" (Sonar wants it at job
level) — Low severity, not in the new-code window (doesn't affect the gate),
already owner-assigned, and spreading it across ~25 jobs is invasive on the
release pipeline. Left for a separate decision.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01SfvSZ76NW4e1qX1PjL4RKq
SonarCloud flagged three Critical Reliability bugs in PairTest: assertNotNull on
pair.hashCode(), which returns a primitive int that can never be null, so the
assertions verified nothing. Replace them with a deterministic-hashCode check
(capture the value, assert a repeated call returns the same int) — this exercises
the real "doesn't throw + consistent across calls" intent. A local variable is
used so the two assertEquals arguments are distinct expressions (avoids the
identical-argument rule java:S5863).

These are Reliability bugs, separate from the failing Security-Rating gate; fixing
them improves the Reliability rating.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01SfvSZ76NW4e1qX1PjL4RKq
claude added 3 commits June 20, 2026 22:03
- Add open items: SonarCloud "Security Rating on New Code" gate status (GHA
  findings fixed in 84297e0, publish.yml owner-accepted, PairTest reliability
  bugs fixed in 9f0d377, gate pending the last dashboard finding); the upstream
  llama.cpp PR to drop the local Windows arg-parse patch; the aarch64 branch-
  protection rename.
- Mark resolved: Windows-JNI arg-parse regression (patch option 2 chosen);
  Windows Ninja verification (green + sccache hits).
- Add Done section for b9739 upgrade, Windows Ninja classifier, native
  ubuntu-24.04-arm aarch64 build, the generic patches/ mechanism, and CUDA
  sccache verification.
- Fix stale b9682 -> b9739 references (pin is b9739).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01SfvSZ76NW4e1qX1PjL4RKq
The dd264b2 scaffold (NativeServer.java + NativeServerSmokeTest) landed the
native-HTTP-transport Java entry point, but TODO.md tracked no follow-up for
wiring the upstream server.cpp route table to JNI. Add it under the server
follow-ups so the remaining native-server work is visible.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01SfvSZ76NW4e1qX1PjL4RKq
…248 item

The combined commit status on #248 shows a "License Compliance" check failing
with "17 issues found" — a dependency-license scanner app, separate from REUSE
(green) and SonarCloud. Document it as open, note it is almost certainly
pre-existing (the PR changes no dependencies), and how to triage it.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01SfvSZ76NW4e1qX1PjL4RKq
@bernardladenthin bernardladenthin merged commit 2b0fcf5 into main Jun 20, 2026
11 of 12 checks passed
@bernardladenthin bernardladenthin deleted the claude/exciting-fermi-ltkrku branch June 20, 2026 23:02
@sonarqubecloud

Copy link
Copy Markdown

Quality Gate Failed Quality Gate failed

Failed conditions
C Security Rating on New Code (required ≥ A)

See analysis details on SonarQube Cloud

Catch issues before they fail your Quality Gate with our IDE extension SonarQube for IDE

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants