@@ -75,9 +75,10 @@ sccache-over-Depot cache, which now wraps nvcc (`-DCMAKE_CUDA_COMPILER_LAUNCHER=
7575` build.sh ` for CUDA builds, gated behind the same probe). The launcher is safe to enable
7676unconditionally: if sccache cannot wrap nvcc it runs it directly (uncached), and ` build.sh ` 's
7777mid-build retry treats an sccache ` Compiler not supported ` failure like any other cache error and
78- rebuilds the job without the launcher rather than redding it. ** Verify it works:** the premise
79- (sccache producing nvcc cache hits inside the manylinux_2_28 container) is proven only by a ** warm**
80- run — check ` sccache --show-stats ` shows CUDA hits on the second build before trusting the speedup.
78+ rebuilds the job without the launcher rather than redding it. ** Verified:** a warm run in the
79+ manylinux_2_28 container hit ** 100%** on CUDA / CUBIN / device-code (139 CUDA hits, 99.86% overall,
80+ 3 misses) and cut the job from ** ~ 51 min cold to ~ 15 min warm** — nvcc caching works here. ` build.sh `
81+ prints ` sccache --show-stats ` at the end of every run so the hit table stays visible.
8182
8283## Android minimum API level
8384
@@ -323,14 +324,15 @@ v0.16.0 + the probe this is no longer a risk.) Job-by-job status:
323324 ** v0.16.0** probe passed in-container (devtoolset-10 gcc), ` sccache ON ` over Depot WebDAV,
324325 warm cache 277/278 hits (99.64%), 1m46s build time.
3253262 . ` crosscompile-linux-x86_64-cuda ` (via ` build_cuda_linux.sh ` , which execs ` build.sh ` ) —
326- 🚧 ** nvcc caching enabled , full-arch always** (diagnostics on). ` build.sh ` now also wraps nvcc
327+ ✅ ** verified green with nvcc caching, full-arch always. ** ` build.sh ` also wraps nvcc
327328 (` -DCMAKE_CUDA_COMPILER_LAUNCHER=sccache ` , scoped to CUDA builds), so both the gcc C/C++ TUs
328329 (134 model files + ggml + httplib) ** and** the per-arch ` .cu ` device passes cache over Depot.
329330 CI dropped the single-arch validation shortcut (` CUDA_FAST_BUILD ` /` CUDA_ARCH ` removed from the
330- job) — every run builds the full arch set and leans on the warm cache for speed. ** Unverified
331- until a warm run:** confirm ` sccache --show-stats ` reports CUDA hits on the second build; if
332- nvcc caching proves weak in this container, the cold-vs-warm delta will be small and the job
333- stays ~ 70 min (the mid-build retry guards against an nvcc-hostile sccache redding the build).
331+ job) — every run builds the full arch set and leans on the warm cache for speed. A warm run hit
332+ ** 100%** on CUDA / CUBIN / device-code (139 CUDA hits, 99.86% overall, 3 misses), cutting the job
333+ from ** ~ 51 min cold to ~ 15 min warm** . The first-run debug diagnostics (` SCCACHE_LOG ` /
334+ ` SCCACHE_ERROR_LOG ` / ` RUST_BACKTRACE ` ) were dropped once confirmed; ` sccache --show-stats ` still
335+ prints the hit table every run.
3343363 . ` crosscompile-linux-aarch64 ` — ✅ ** enabled** , now a ** native ` ubuntu-24.04-arm ` build** (not
335337 dockcross): ` build.sh ` self-fetches the aarch64 static-musl sccache (the fetch block in
336338 ` build.sh ` maps ` uname -m ` → ` x86_64 ` /` aarch64 ` ) and the probe guards it. See "Linux aarch64:
0 commit comments