@@ -30,7 +30,7 @@ NanoHTTPD server; NanoHTTPD + its dependency deleted.)
3030primary goal: agentic tool-calling with Qwen):
3131
3232- Agentic tool-calling verified wire-correct: C++ guard pins ` tool_calls.function.arguments ` as a JSON
33- ** string** (not object) at b9682 (llama.cpp #20198 ), plus the existing ` finish_reason:"tool_calls" `
33+ ** string** (not object) at b9739 (llama.cpp #20198 ), plus the existing ` finish_reason:"tool_calls" `
3434 test.
3535- ` stream_options.include_usage ` forwarded (new ` InferenceParameters.withStreamOptions ` ) so the trailing
3636 usage chunk is emitted, and ` OpenAiSseFormatter.ensureUsageCachedTokens ` guarantees
@@ -82,7 +82,7 @@ primary goal: agentic tool-calling with Qwen):
8282 What remains is manual validation against the actual editor clients — point Copilot's Ollama provider /
8383 a Custom Endpoint, Claude Code, and a Responses client at the running server — since a server-side
8484 round-trip confirms the wire shapes but not each client's own parser.
85- - ** Gemma 4 tool-calling validation.** Confirm the pinned llama.cpp (` b9682 ` ) includes the Gemma 4
85+ - ** Gemma 4 tool-calling validation.** Confirm the pinned llama.cpp (` b9739 ` ) includes the Gemma 4
8686 tool-call parser fixes; if not, bump per the upgrade procedure.
8787
8888### Windows compiler cache (sccache) — dual build shipped (MSVC default + Ninja classifier)
@@ -96,7 +96,7 @@ independently validated second Windows artifact is available for users to compar
9696** Why two builds.** The cache mechanism is the CMake * compiler launcher*
9797(` -DCMAKE_C_COMPILER_LAUNCHER=sccache ` ). ** The Visual Studio generator ignores it entirely**
9898(only Ninja/Makefile generators honor it), so the MSVC jobs can never cache. The Ninja
99- Multi-Config generator * does* honor it (upstream llama.cpp ` b9682 ` ships ` windows-cuda ` this way,
99+ Multi-Config generator * does* honor it (upstream llama.cpp ` b9739 ` ships ` windows-cuda ` this way,
100100proving Ninja Multi-Config + MSVC works on the same tree). The two builds produce ** different
101101` jllama.dll ` s** , so they cannot coexist at the same resource path in one JAR — hence the classifier.
102102
@@ -117,14 +117,14 @@ proving Ninja Multi-Config + MSVC works on the same tree). The two builds produc
117117 ` windows-ninja ` profile; the Ninja build + Java-test jobs are in the ` package ` ` needs: ` graph.
118118- Docs: ` README.md ` classifier table + ` CLAUDE.md ` "Windows Ninja artifact" section.
119119
120- ** What remains (verification, not redesign): **
121- - Confirm the first green CI run shows ** ` sccache --show-stats ` cache hits ** in the Ninja job logs
122- (cold first, warm thereafter). If sccache never warms on the Depot WebDAV backend from Windows
123- runners, the Ninja build still ships correctly — it just falls back to uncached (green) — so this
124- is a perf check, not a correctness gate.
125- - Optionally smoke-test that the published ` ninja-windows ` classifier JAR loads its DLL on a clean
126- Windows host. Publishing is gated behind ` publish_to_central ` , so a broken Windows job blocks the
127- release before any artifact reaches Central/GitHub Releases.
120+ ** Verification — DONE (PR # 248 ). ** The Ninja jobs are green and cache-warm: `Build and Test
121+ Windows … (Ninja … sccache, eval) ` builds + ` ctest ` pass, and ` Java Tests Windows 2025 x86_64
122+ (Ninja, eval)` loads the DLL via JNI and runs the full model-backed suite green (after the b9739
123+ arg-parse patch landed). ` sccache --show-stats ` confirms cache hits on the Ninja jobs.
124+
125+ ** Optional follow-up: ** smoke-test that the * published* ` ninja-windows ` classifier JAR loads its DLL
126+ on a clean Windows host. Publishing is gated behind ` publish_to_central ` , so a broken Windows job
127+ blocks the release before any artifact reaches Central/GitHub Releases.
128128
129129** Reference notes:**
130130- Cache backend is ** sccache + Depot WebDAV** (consistent with the other 8 jobs — one token, shared
@@ -187,16 +187,52 @@ parser. Our JNI already passes correct UTF-8 argv (`GetStringUTFChars`), so the
187187unnecessary for us. **This is an upstream bug affecting every embedded Windows consumer of
188188`common_params_parse`.**
189189
190- **Fix options (decide later):**
191- 1. FetchContent `PATCH_COMMAND` that **guards** the block — only override when `make_utf8_argv()` arg
192- count equals the caller's `argc` (true for the standalone exe, false for JNI). Minimal + semantically
193- correct; also the shape to PR upstream.
194- 2. FetchContent patch that **removes** the `_WIN32` override for our build.
195- 3. File an upstream issue/PR and wait for a fixed llama.cpp build.
196-
197- Any patch re-applies on every llama.cpp bump — add it to the upgrade checklist. Pre-existing on `main`
198- since #247 (b9682→b9739); independent of the Windows-Ninja classifier work. Windows **build + C++ ctest**
199- jobs are green; only the Windows **Java/inference** jobs are affected.
190+ **Fix options (history — option 2 chosen).** (1) guard the block by arg-count — *tried first, it
191+ collided* (see the count-guard note above); (2) **remove the `_WIN32` override for our build — CHOSEN**
192+ (deterministic; our JNI always passes correct UTF-8 argv); (3) file an upstream PR and wait. The patch
193+ re-applies on every llama.cpp bump and the applier fails loud if it stops applying — it is part of the
194+ upgrade checklist. Pre-existing on `main` since #247 (b9682→b9739); independent of the Windows-Ninja
195+ classifier work. **Remaining open item: the upstream PR** (see "Upstream llama.cpp PR" below) so the
196+ local patch can eventually be dropped.
197+
198+ ### SonarCloud "Security Rating on New Code" gate — PR #248 (open)
199+
200+ The PR's **only** red is SonarCloud's "Security Rating on New Code" gate (every build/test job is
201+ green; SonarCloud is **not** a merge-blocking build job). The findings are GitHub-Actions/Java
202+ analyzer issues from the Maven scanner — **"C" is the rating *grade* (A–E), not the C language**;
203+ there is no CFamily/C-C++ scan configured. Addressed:
204+
205+ - **`clang-format.yml`** — `pip install` without `--only-binary :all:` can run a package's `setup.py`;
206+ forced wheels-only (`84297e0`, block scalar so `:all:` doesn't break YAML). *If Sonar still flags it,
207+ try the `--only-binary=:all:` equals form.*
208+ - **`osv-scanner.yml` / `scorecard.yml`** — top-level `permissions: read-all` → `contents: read`
209+ (`84297e0`); safe because every job in both files already declares its own exact permissions.
210+ - **`publish.yml`** — workflow-level `permissions: contents: read` (Sonar wants it per-job); **owner
211+ marked it Accept/"Won't fix" on the dashboard** rather than spreading perms across ~25 release jobs.
212+ Alternative if ever desired: add `permissions: contents: read` to the ~19 read-only jobs (the 5
213+ publish/report jobs already declare `contents: write`) and drop the top-level block.
214+ - **`PairTest.java`** — 3 Critical *Reliability* bugs (`assertNotNull` on the primitive `hashCode()`)
215+ replaced with a determinism check (`9f0d377`). Reliability rating, **not** the Security gate.
216+
217+ **Still open:** the gate was still red as of `9f0d377`. SonarCloud's issues API is auth-gated (403 from
218+ CI), so the exact remaining new-code Vulnerability must be read off the dashboard. Resolve the last
219+ finding, accept it on the dashboard, or merge on the green build/test checks.
220+
221+ ### Upstream llama.cpp PR — drop the local Windows arg-parse patch (open)
222+
223+ `patches/0001-win32-arg-parse-embed-guard.patch` is a **local** fix re-applied on every build. To drop
224+ it, PR upstream (against #24779): add a `common_params_parse_argv` companion (or a
225+ `common_params_parse` opt-out flag) that trusts the caller's argv — preserving the standalone tools'
226+ UTF-8 fix while letting embedders (JNI, and any FFI binding) pass their own argv. Ship with the
227+ standalone-safe repro (a plain exe that passes a synthetic argv and shows it gets discarded on Windows
228+ because `GetCommandLineW()` returns the host process line). Once merged and the pin is bumped past it,
229+ delete the patch.
230+
231+ ### Branch protection — aarch64 job renamed (open, owner action)
232+
233+ The native aarch64 switch renamed the check **`Cross-Compile Linux aarch64 (LTS)` → `Build and Test
234+ Linux aarch64`**. If a required status check pinned the old name, repoint it or it will sit pending
235+ forever.
200236
201237### llama.cpp upstream feature exposure (queued, deferred by policy)
202238
@@ -266,6 +302,28 @@ These are JNI plumbing items for upstream API additions. Policy: add only after
266302
267303## Done (kept for history)
268304
305+ ### b9739 upgrade + PR #248 (Windows Ninja, native aarch64, patches mechanism)
306+
307+ - **llama.cpp b9682 → b9739** (#247, merged) + build fixes: `server-schema.cpp` added to the
308+ `jllama_test` sources (b9739 link fix, `38be6db`); `test_server.cpp` `ParamsFromJsonCmpl`
309+ expectations updated to b9739 schema behavior (`aaba886`).
310+ - **Windows Ninja artifact** — `ninja-windows` classifier JAR built with Ninja Multi-Config + sccache,
311+ shipped alongside the permanent MSVC default; both build + Java-test jobs green (`e113ed3`,
312+ `48f0863`). (See the open section above for the design rationale; verification is done.)
313+ - **Linux aarch64 → native `ubuntu-24.04-arm` build** (`ed9ecbb`). The dockcross `linux-arm64-lts`
314+ image (GCC 8.5 / glibc 2.17) could no longer compile b9739's C++17 CTAD-in-`new`; now builds natively
315+ with GCC 14 (mirroring upstream), runs `ctest` on real ARM (446 tests green), and warms sccache
316+ (99.66% hits). Trade-off: glibc floor 2.17 → ~2.39 (same envelope as upstream's ARM binaries);
317+ documented in the README classifier table. `build.sh` sccache auto-fetch generalized to aarch64.
318+ - **Generic `patches/` mechanism** — drop `*.patch`/`*.diff` in repo-root `patches/`, applied to the
319+ FetchContent'd llama.cpp source by `cmake/apply-llama-patches.cmake` via the llama.cpp
320+ `PATCH_COMMAND` (cross-platform, idempotent, fail-loud). Covers every C++ build from one place.
321+ First patch fixes the Windows JNI arg-parse regression (`1d875b1` → deterministic form `f651b53`).
322+ REUSE annotated via `patches/**` glob (`0cffac1`).
323+ - **CUDA sccache verified** — the `manylinux_2_28 (CUDA)` job caches all gcc C/C++ TUs (247/248 hits,
324+ 99.60%); the nvcc `.cu` kernels remain uncached (sccache limitation), and `CUDA_FAST_BUILD` keeps
325+ PR/validation runs single-arch. (Doc/observation; no code change.)
326+
269327### Layered package restructure (flat root package → layered hierarchy)
270328
271329The flat `net.ladenthin.llama` root package was split (via `git mv`, history
0 commit comments