Windows GPU jobs: build artifact only (drop ctest on GPU-less runners)

claude · claude · commit 788aaef7c225 · 2026-06-28T17:05:30.000Z
Run 28329190065 (d36d026): the CUDA full-toolkit fix worked. CUDA 13.2 now compiles the entire ggml-cuda backend and the build succeeds. It then failed at ctest: gtest_discover_tests cannot enumerate the CUDA-linked jllama_test.exe on a GPU-less GitHub runner (the binary errors probing for a CUDA device at startup), so CMake registers the failing jllama_test_NOT_BUILT sentinel.

Running a GPU-linked unit-test binary on a runner with no GPU is not possible, and the C++ unit suite is CPU-only logic already fully covered by the `C++ Tests` job and the CPU Windows jobs. So the three Windows GPU build jobs now build the artifact only: drop -DBUILD_TESTING and the ctest step from cuda/vulkan/opencl. The jllama.dll artifact (the real deliverable) is unaffected.

Docs (CLAUDE.md, TODO.md) updated.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Deqf9xS6jz9t1idytVTaPV
diff --git a/.github/workflows/publish.yml b/.github/workflows/publish.yml
@@ -723,10 +723,12 @@ jobs:
           Add-Content -Path $env:GITHUB_PATH -Value "$env:RUNNER_TEMP\sccache\$rel"
       - name: Build libraries
         shell: cmd
+        # GPU jobs build the artifact only — no -DBUILD_TESTING / ctest. The C++ unit
+        # suite is CPU-only and fully covered by the `C++ Tests` job + the CPU Windows
+        # jobs; a GPU-linked jllama_test.exe cannot be discovered/run on a GPU-less
+        # GitHub runner (it errors probing for a CUDA device -> ctest *_NOT_BUILT).
         run: |
-          .github\build.bat -G "Ninja Multi-Config" -DGGML_CUDA=ON -DOS_NAME=Windows -DOS_ARCH=x86_64 -DBUILD_TESTING=ON
-      - name: Run C++ unit tests
-        run: ctest --test-dir build --output-on-failure
+          .github\build.bat -G "Ninja Multi-Config" -DGGML_CUDA=ON -DOS_NAME=Windows -DOS_ARCH=x86_64
       - name: Upload artifacts
         uses: actions/upload-artifact@v7
         with:
@@ -771,10 +773,10 @@ jobs:
           Add-Content -Path $env:GITHUB_PATH -Value "$env:RUNNER_TEMP\sccache\$rel"
       - name: Build libraries
         shell: cmd
+        # Build the artifact only (see the CUDA job's note: GPU-less runner can't run a
+        # GPU-linked jllama_test; the C++ unit suite is covered by the CPU jobs).
         run: |
-          .github\build.bat -G "Ninja Multi-Config" -DGGML_VULKAN=ON -DOS_NAME=Windows -DOS_ARCH=x86_64 -DBUILD_TESTING=ON
-      - name: Run C++ unit tests
-        run: ctest --test-dir build --output-on-failure
+          .github\build.bat -G "Ninja Multi-Config" -DGGML_VULKAN=ON -DOS_NAME=Windows -DOS_ARCH=x86_64
       - name: Upload artifacts
         uses: actions/upload-artifact@v7
         with:
@@ -814,10 +816,10 @@ jobs:
           Add-Content -Path $env:GITHUB_PATH -Value "$env:RUNNER_TEMP\sccache\$rel"
       - name: Build libraries
         shell: cmd
+        # Build the artifact only (see the CUDA job's note: GPU-less runner can't run a
+        # GPU-linked jllama_test; the C++ unit suite is covered by the CPU jobs).
         run: |
-          .github\build_opencl_windows.bat -G "Ninja Multi-Config" -DGGML_OPENCL=ON -DGGML_OPENCL_EMBED_KERNELS=ON -DOS_NAME=Windows -DOS_ARCH=x86_64 -DBUILD_TESTING=ON
-      - name: Run C++ unit tests
-        run: ctest --test-dir build --output-on-failure
+          .github\build_opencl_windows.bat -G "Ninja Multi-Config" -DGGML_OPENCL=ON -DGGML_OPENCL_EMBED_KERNELS=ON -DOS_NAME=Windows -DOS_ARCH=x86_64
       - name: Upload artifacts
         uses: actions/upload-artifact@v7
         with:
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -188,8 +188,11 @@ model-backed Java suite (`test-java-windows-x86_64` = default/Ninja, `test-java-
 installed CUDA 13 Toolkit (`cudart64_13.dll`/`cublas64_13.dll`/`cublasLt64_13.dll` on `PATH`); Vulkan
 needs `vulkan-1.dll` (ships with current GPU drivers); OpenCL needs the vendor ICD
 (`System32\OpenCL.dll`). Not bundling = no NVIDIA-EULA redistribution obligation. **GitHub-hosted
-Windows runners have NO GPU**, so the GPU jobs build + run the C++ unit suite (`ctest`, CPU-only) but
-**cannot run model-backed GPU inference** — end-to-end GPU validation is local / self-hosted.
+Windows runners have NO GPU**, so the GPU jobs **build the artifact only** (no `-DBUILD_TESTING`/`ctest`)
+— a GPU-linked `jllama_test.exe` can't even be enumerated on a GPU-less runner (it errors probing for a
+device, so `gtest_discover_tests` registers a failing `*_NOT_BUILT` sentinel). The CPU-only C++ unit
+suite is fully covered by the `C++ Tests` job + the CPU Windows jobs; model-backed GPU inference is
+local / self-hosted.
 
 Wiring (mirrors the CUDA-Linux / OpenCL-Android classifier pattern):
 
diff --git a/TODO.md b/TODO.md
@@ -196,7 +196,8 @@ Multi-Config + MSVC.
   artifacts `Windows-{arch}-libraries`); `build-windows-x86_64-msvc` / `build-windows-x86-msvc` are
   **MSVC** (artifacts `Windows-{arch}-msvc`). `test-java-windows-x86_64` (default/Ninja) and
   `test-java-windows-x86_64-msvc` both load the DLL via JNI and run the full model-backed suite.
-- **GPU build jobs (x86_64, Ninja, build + `ctest` only — runners have no GPU):**
+- **GPU build jobs (x86_64, Ninja, build the artifact only — runners have no GPU, and a
+  GPU-linked jllama_test can't be enumerated there; C++ suite runs on the CPU jobs):**
   `build-windows-x86_64-cuda` (`Jimver/cuda-toolkit@v0.2.35` CUDA `13.2.0` + `-DGGML_CUDA=ON`),
   `build-windows-x86_64-vulkan` (`jakoch/install-vulkan-sdk-action` + `-DGGML_VULKAN=ON`),
   `build-windows-x86_64-opencl` (`build_opencl_windows.bat` stages the ICD loader + `-DGGML_OPENCL=ON`).