Skip to content

Commit 4e5b11c

Browse files
committed
docs(TODO): refresh open/done items for the b9739 + PR #248 work
- Add open items: SonarCloud "Security Rating on New Code" gate status (GHA findings fixed in 84297e0, publish.yml owner-accepted, PairTest reliability bugs fixed in 9f0d377, gate pending the last dashboard finding); the upstream llama.cpp PR to drop the local Windows arg-parse patch; the aarch64 branch- protection rename. - Mark resolved: Windows-JNI arg-parse regression (patch option 2 chosen); Windows Ninja verification (green + sccache hits). - Add Done section for b9739 upgrade, Windows Ninja classifier, native ubuntu-24.04-arm aarch64 build, the generic patches/ mechanism, and CUDA sccache verification. - Fix stale b9682 -> b9739 references (pin is b9739). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01SfvSZ76NW4e1qX1PjL4RKq
1 parent 9f0d377 commit 4e5b11c

1 file changed

Lines changed: 79 additions & 21 deletions

File tree

TODO.md

Lines changed: 79 additions & 21 deletions
Original file line numberDiff line numberDiff line change
@@ -30,7 +30,7 @@ NanoHTTPD server; NanoHTTPD + its dependency deleted.)
3030
primary goal: agentic tool-calling with Qwen):
3131

3232
- Agentic tool-calling verified wire-correct: C++ guard pins `tool_calls.function.arguments` as a JSON
33-
**string** (not object) at b9682 (llama.cpp #20198), plus the existing `finish_reason:"tool_calls"`
33+
**string** (not object) at b9739 (llama.cpp #20198), plus the existing `finish_reason:"tool_calls"`
3434
test.
3535
- `stream_options.include_usage` forwarded (new `InferenceParameters.withStreamOptions`) so the trailing
3636
usage chunk is emitted, and `OpenAiSseFormatter.ensureUsageCachedTokens` guarantees
@@ -82,7 +82,7 @@ primary goal: agentic tool-calling with Qwen):
8282
What remains is manual validation against the actual editor clients — point Copilot's Ollama provider /
8383
a Custom Endpoint, Claude Code, and a Responses client at the running server — since a server-side
8484
round-trip confirms the wire shapes but not each client's own parser.
85-
- **Gemma 4 tool-calling validation.** Confirm the pinned llama.cpp (`b9682`) includes the Gemma 4
85+
- **Gemma 4 tool-calling validation.** Confirm the pinned llama.cpp (`b9739`) includes the Gemma 4
8686
tool-call parser fixes; if not, bump per the upgrade procedure.
8787

8888
### Windows compiler cache (sccache) — dual build shipped (MSVC default + Ninja classifier)
@@ -96,7 +96,7 @@ independently validated second Windows artifact is available for users to compar
9696
**Why two builds.** The cache mechanism is the CMake *compiler launcher*
9797
(`-DCMAKE_C_COMPILER_LAUNCHER=sccache`). **The Visual Studio generator ignores it entirely**
9898
(only Ninja/Makefile generators honor it), so the MSVC jobs can never cache. The Ninja
99-
Multi-Config generator *does* honor it (upstream llama.cpp `b9682` ships `windows-cuda` this way,
99+
Multi-Config generator *does* honor it (upstream llama.cpp `b9739` ships `windows-cuda` this way,
100100
proving Ninja Multi-Config + MSVC works on the same tree). The two builds produce **different
101101
`jllama.dll`s**, so they cannot coexist at the same resource path in one JAR — hence the classifier.
102102

@@ -117,14 +117,14 @@ proving Ninja Multi-Config + MSVC works on the same tree). The two builds produc
117117
`windows-ninja` profile; the Ninja build + Java-test jobs are in the `package` `needs:` graph.
118118
- Docs: `README.md` classifier table + `CLAUDE.md` "Windows Ninja artifact" section.
119119

120-
**What remains (verification, not redesign):**
121-
- Confirm the first green CI run shows **`sccache --show-stats` cache hits** in the Ninja job logs
122-
(cold first, warm thereafter). If sccache never warms on the Depot WebDAV backend from Windows
123-
runners, the Ninja build still ships correctly — it just falls back to uncached (green) — so this
124-
is a perf check, not a correctness gate.
125-
- Optionally smoke-test that the published `ninja-windows` classifier JAR loads its DLL on a clean
126-
Windows host. Publishing is gated behind `publish_to_central`, so a broken Windows job blocks the
127-
release before any artifact reaches Central/GitHub Releases.
120+
**Verification — DONE (PR #248).** The Ninja jobs are green and cache-warm: `Build and Test
121+
Windows … (Ninja … sccache, eval)` builds + `ctest` pass, and `Java Tests Windows 2025 x86_64
122+
(Ninja, eval)` loads the DLL via JNI and runs the full model-backed suite green (after the b9739
123+
arg-parse patch landed). `sccache --show-stats` confirms cache hits on the Ninja jobs.
124+
125+
**Optional follow-up:** smoke-test that the *published* `ninja-windows` classifier JAR loads its DLL
126+
on a clean Windows host. Publishing is gated behind `publish_to_central`, so a broken Windows job
127+
blocks the release before any artifact reaches Central/GitHub Releases.
128128

129129
**Reference notes:**
130130
- Cache backend is **sccache + Depot WebDAV** (consistent with the other 8 jobs — one token, shared
@@ -187,16 +187,52 @@ parser. Our JNI already passes correct UTF-8 argv (`GetStringUTFChars`), so the
187187
unnecessary for us. **This is an upstream bug affecting every embedded Windows consumer of
188188
`common_params_parse`.**
189189
190-
**Fix options (decide later):**
191-
1. FetchContent `PATCH_COMMAND` that **guards** the block — only override when `make_utf8_argv()` arg
192-
count equals the caller's `argc` (true for the standalone exe, false for JNI). Minimal + semantically
193-
correct; also the shape to PR upstream.
194-
2. FetchContent patch that **removes** the `_WIN32` override for our build.
195-
3. File an upstream issue/PR and wait for a fixed llama.cpp build.
196-
197-
Any patch re-applies on every llama.cpp bump — add it to the upgrade checklist. Pre-existing on `main`
198-
since #247 (b9682→b9739); independent of the Windows-Ninja classifier work. Windows **build + C++ ctest**
199-
jobs are green; only the Windows **Java/inference** jobs are affected.
190+
**Fix options (history — option 2 chosen).** (1) guard the block by arg-count — *tried first, it
191+
collided* (see the count-guard note above); (2) **remove the `_WIN32` override for our build — CHOSEN**
192+
(deterministic; our JNI always passes correct UTF-8 argv); (3) file an upstream PR and wait. The patch
193+
re-applies on every llama.cpp bump and the applier fails loud if it stops applying — it is part of the
194+
upgrade checklist. Pre-existing on `main` since #247 (b9682→b9739); independent of the Windows-Ninja
195+
classifier work. **Remaining open item: the upstream PR** (see "Upstream llama.cpp PR" below) so the
196+
local patch can eventually be dropped.
197+
198+
### SonarCloud "Security Rating on New Code" gate — PR #248 (open)
199+
200+
The PR's **only** red is SonarCloud's "Security Rating on New Code" gate (every build/test job is
201+
green; SonarCloud is **not** a merge-blocking build job). The findings are GitHub-Actions/Java
202+
analyzer issues from the Maven scanner — **"C" is the rating *grade* (A–E), not the C language**;
203+
there is no CFamily/C-C++ scan configured. Addressed:
204+
205+
- **`clang-format.yml`** — `pip install` without `--only-binary :all:` can run a package's `setup.py`;
206+
forced wheels-only (`84297e0`, block scalar so `:all:` doesn't break YAML). *If Sonar still flags it,
207+
try the `--only-binary=:all:` equals form.*
208+
- **`osv-scanner.yml` / `scorecard.yml`** — top-level `permissions: read-all` → `contents: read`
209+
(`84297e0`); safe because every job in both files already declares its own exact permissions.
210+
- **`publish.yml`** — workflow-level `permissions: contents: read` (Sonar wants it per-job); **owner
211+
marked it Accept/"Won't fix" on the dashboard** rather than spreading perms across ~25 release jobs.
212+
Alternative if ever desired: add `permissions: contents: read` to the ~19 read-only jobs (the 5
213+
publish/report jobs already declare `contents: write`) and drop the top-level block.
214+
- **`PairTest.java`** — 3 Critical *Reliability* bugs (`assertNotNull` on the primitive `hashCode()`)
215+
replaced with a determinism check (`9f0d377`). Reliability rating, **not** the Security gate.
216+
217+
**Still open:** the gate was still red as of `9f0d377`. SonarCloud's issues API is auth-gated (403 from
218+
CI), so the exact remaining new-code Vulnerability must be read off the dashboard. Resolve the last
219+
finding, accept it on the dashboard, or merge on the green build/test checks.
220+
221+
### Upstream llama.cpp PR — drop the local Windows arg-parse patch (open)
222+
223+
`patches/0001-win32-arg-parse-embed-guard.patch` is a **local** fix re-applied on every build. To drop
224+
it, PR upstream (against #24779): add a `common_params_parse_argv` companion (or a
225+
`common_params_parse` opt-out flag) that trusts the caller's argv — preserving the standalone tools'
226+
UTF-8 fix while letting embedders (JNI, and any FFI binding) pass their own argv. Ship with the
227+
standalone-safe repro (a plain exe that passes a synthetic argv and shows it gets discarded on Windows
228+
because `GetCommandLineW()` returns the host process line). Once merged and the pin is bumped past it,
229+
delete the patch.
230+
231+
### Branch protection — aarch64 job renamed (open, owner action)
232+
233+
The native aarch64 switch renamed the check **`Cross-Compile Linux aarch64 (LTS)` → `Build and Test
234+
Linux aarch64`**. If a required status check pinned the old name, repoint it or it will sit pending
235+
forever.
200236
201237
### llama.cpp upstream feature exposure (queued, deferred by policy)
202238
@@ -266,6 +302,28 @@ These are JNI plumbing items for upstream API additions. Policy: add only after
266302
267303
## Done (kept for history)
268304
305+
### b9739 upgrade + PR #248 (Windows Ninja, native aarch64, patches mechanism)
306+
307+
- **llama.cpp b9682 → b9739** (#247, merged) + build fixes: `server-schema.cpp` added to the
308+
`jllama_test` sources (b9739 link fix, `38be6db`); `test_server.cpp` `ParamsFromJsonCmpl`
309+
expectations updated to b9739 schema behavior (`aaba886`).
310+
- **Windows Ninja artifact** — `ninja-windows` classifier JAR built with Ninja Multi-Config + sccache,
311+
shipped alongside the permanent MSVC default; both build + Java-test jobs green (`e113ed3`,
312+
`48f0863`). (See the open section above for the design rationale; verification is done.)
313+
- **Linux aarch64 → native `ubuntu-24.04-arm` build** (`ed9ecbb`). The dockcross `linux-arm64-lts`
314+
image (GCC 8.5 / glibc 2.17) could no longer compile b9739's C++17 CTAD-in-`new`; now builds natively
315+
with GCC 14 (mirroring upstream), runs `ctest` on real ARM (446 tests green), and warms sccache
316+
(99.66% hits). Trade-off: glibc floor 2.17 → ~2.39 (same envelope as upstream's ARM binaries);
317+
documented in the README classifier table. `build.sh` sccache auto-fetch generalized to aarch64.
318+
- **Generic `patches/` mechanism** — drop `*.patch`/`*.diff` in repo-root `patches/`, applied to the
319+
FetchContent'd llama.cpp source by `cmake/apply-llama-patches.cmake` via the llama.cpp
320+
`PATCH_COMMAND` (cross-platform, idempotent, fail-loud). Covers every C++ build from one place.
321+
First patch fixes the Windows JNI arg-parse regression (`1d875b1` → deterministic form `f651b53`).
322+
REUSE annotated via `patches/**` glob (`0cffac1`).
323+
- **CUDA sccache verified** — the `manylinux_2_28 (CUDA)` job caches all gcc C/C++ TUs (247/248 hits,
324+
99.60%); the nvcc `.cu` kernels remain uncached (sccache limitation), and `CUDA_FAST_BUILD` keeps
325+
PR/validation runs single-arch. (Doc/observation; no code change.)
326+
269327
### Layered package restructure (flat root package → layered hierarchy)
270328
271329
The flat `net.ladenthin.llama` root package was split (via `git mv`, history

0 commit comments

Comments
 (0)