Support clang as the CUDA compiler and add clang-CUDA CI coverage #12165
Pull request overview
Adds first-class support for building XGBoost’s CUDA code with clang as the CUDA compiler, and wires up CI to compile the clang-CUDA configuration and run a clang-tidy smoke check from a clang-generated compilation database.
Changes:
- Extend CMake CUDA configuration/flag handling to support both NVCC and clang-CUDA.
- Update a few CUDA/device utilities to be clang-CUDA compatible (device annotations, byteswap intrinsics, bitfield ops).
- Add CI scripts and GitHub Actions jobs for clang-CUDA compile-only coverage and a clang-tidy smoke job based on the clang-CUDA compile DB.
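As a rough illustration of the device-annotation point in the list above, the sketch below shows one common way to keep a functor usable under NVCC, clang-CUDA, and a plain host compiler. `HYPO_HOST_DEV`, `ClampToUnit`, and `ClampAll` are hypothetical names chosen for this example and are not taken from the PR. The macro expands to `__host__ __device__` only when the compiler defines `__CUDACC__`, which both NVCC and clang do during CUDA compilation.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical annotation macro (not from the PR): expands to CUDA
// annotations only when a CUDA compiler is driving the build.
#if defined(__CUDACC__)
#define HYPO_HOST_DEV __host__ __device__
#else
#define HYPO_HOST_DEV
#endif

// A named functor with an explicit annotation. Unlike an annotated lambda,
// this does not depend on compiler-specific extended-lambda support.
struct ClampToUnit {
  HYPO_HOST_DEV float operator()(float x) const {
    return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x);
  }
};

// Host-side use with std::transform; a device build could pass the same
// functor to a parallel transform instead.
inline std::vector<float> ClampAll(std::vector<float> values) {
  std::transform(values.begin(), values.end(), values.begin(), ClampToUnit{});
  return values;
}
```

Whether the PR uses a named functor or adjusts the lambda annotation directly is not stated here; the sketch only shows why a single annotated callable can serve all three compilers.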
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/common/quantile.cu | Adjusts device/host annotation on a Thrust transform lambda for clang-CUDA compatibility. |
| src/common/bitfield.h | Refactors device/host annotations and operators to compile cleanly under clang-CUDA. |
| include/xgboost/byteswap.h | Switches CUDA-device byteswap to clang builtins when compiling with clang-CUDA. |
| cmake/Utils.cmake | Splits CUDA flag logic by CUDA compiler (NVIDIA vs Clang) and adds helper for host flag wrapping. |
| CMakeLists.txt | Uses toolkit version when available for CUDA minimum-version enforcement (clang-CUDA friendly). |
| ops/pipeline/build-cuda-clang.sh | New script to configure/build CUDA with clang as the CUDA compiler (CI-focused). |
| ops/pipeline/run-clang-tidy-clang-cuda.sh | New script to generate a clang-CUDA compile DB and run clang-tidy against it. |
| .github/workflows/main.yml | Adds a compile-only “build-cuda-clang” CI job. |
| .github/workflows/lint.yml | Adds a clang-tidy smoke job that runs from the clang-CUDA compilation database. |
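The byteswap row above can be pictured with a small sketch. This is an assumption-laden illustration, not the contents of `include/xgboost/byteswap.h`: `HypoByteSwap32` is a hypothetical name, and the only facts relied on are that clang defines both `__clang__` and `__CUDA__` when compiling CUDA sources and that it provides `__builtin_bswap32`. Every other configuration falls back to portable shifts here.

```cpp
#include <cstdint>

// Hypothetical helper (not the PR's code): reverse the byte order of a
// 32-bit value, using the clang builtin when clang compiles CUDA.
inline std::uint32_t HypoByteSwap32(std::uint32_t x) {
#if defined(__clang__) && defined(__CUDA__)
  // clang-CUDA path: the builtin is available to the clang front end.
  return __builtin_bswap32(x);
#else
  // Portable fallback used for every other compiler in this sketch.
  return ((x & 0x000000FFu) << 24) | ((x & 0x0000FF00u) << 8) |
         ((x & 0x00FF0000u) >> 8) | ((x & 0xFF000000u) >> 24);
#endif
}
```

Swapping twice returns the original value, which makes a convenient sanity check for either branch.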
@RAMitchell Does this have the potential for speeding up the
I am hoping so :)
Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.
0763e10 to fdea029
Summary
Motivation
The existing CUDA build path is NVCC-specific, which makes native clang-tidy -p difficult to use correctly for CUDA translation units. This change adds an explicit clang-CUDA build path so we can validate compiler compatibility directly and run clang-tidy from an actual clang-generated compile database instead of reconstructing commands.
Testing
bash -n ops/pipeline/build-cuda-clang.sh ops/pipeline/run-clang-tidy-clang-cuda.sh
PATH=/home/nfs/rorym/anaconda3/envs/xgboost/bin:$PATH XGBOOST_SKIP_CLANG_INSTALL=1 XGBOOST_CLANG_PREFIX=/home/nfs/rorym/anaconda3/envs/xgboost XGBOOST_CMAKE_PREFIX_PATH=/home/nfs/rorym/anaconda3/envs/xgboost XGBOOST_NCCL_INCLUDE_DIR=/home/nfs/rorym/anaconda3/envs/xgboost/include XGBOOST_NCCL_ROOT=/home/nfs/rorym/anaconda3/envs/xgboost XGBOOST_GPU_COMPUTE_VER=70 XGBOOST_TIDY_JOBS=2 ops/pipeline/run-clang-tidy-clang-cuda.sh --build-dir build-clang-tidy-ci
Notes
src/common/timer.cc and src/predictor/interpretability/shap.cu