Commit 3d67a60

Wilcoxon refactor (#636)
* first iteration of refactor
* add rmm
* update publish and cmake
* update notebooks
* make dense faster
* update tests and fix issues
* fix tests
* update
* safety commit
* start cleanup
* first draft
* update rmm
* update ci buildwheel
* add csr densification columnwise clean up rmm
* fix docker
* fix issues
* redo memory allocation
* clean up
* more cleanup
* improve memory and dtypes and nnz for large datasets
* update comments
* start dedup
* more dedup
* make even smaller
* update testing
* add more tests
* update kernels and layout
* remove small and tiny and speed up larger paths
* fix logreg order
* add 64 bit
* update streaming
* add memory safety
* update streaming
* update python
* make negative fallback better
* slim down comments

Signed-off-by: Intron7 <severin.dicks@icloud.com>
1 parent 554d891 commit 3d67a60

43 files changed

Lines changed: 10910 additions & 940 deletions

.github/workflows/docker.yml

Lines changed: 4 additions & 5 deletions
@@ -1,6 +1,5 @@
-# This workflow will build two Docker image and push then to GitHub Packages Container registry:
-# - a base image with the dependencies
-# - a main image with the application code
+# Build/push two GHCR images: dependency base and application image.
+# Release events push; PR/comment runs only validate.
 
 name: Docker
 
@@ -73,8 +72,8 @@ jobs:
         RAPIDS_VER:
           - "26.04"
         CUDA_SUFFIX:
-          - { ver: "12.8.0", label: "cuda12", pkg: "cu12" }
-          - { ver: "13.0.2", label: "cuda13", pkg: "cu13" }
+          - { ver: "12.9.1", label: "cuda12", pkg: "cu12" }
+          - { ver: "13.1.0", label: "cuda13", pkg: "cu13" }
     name: Build Docker images (${{ matrix.CUDA_SUFFIX.label }})
     runs-on: ubuntu-latest
     permissions:

.github/workflows/publish.yml

Lines changed: 50 additions & 12 deletions
@@ -69,16 +69,46 @@ jobs:
           path = pathlib.Path("pyproject.toml")
           text = path.read_text()
 
+          def remove_toml_array(text, key):
+              lines = text.splitlines(keepends=True)
+              out = []
+              i = 0
+              while i < len(lines):
+                  if lines[i].startswith(f"{key} = ["):
+                      depth = lines[i].count("[") - lines[i].count("]")
+                      i += 1
+                      while i < len(lines) and depth > 0:
+                          depth += lines[i].count("[") - lines[i].count("]")
+                          i += 1
+                      continue
+                  out.append(lines[i])
+                  i += 1
+              return "".join(out)
+
           # Rename package
           text = text.replace(
               'name = "rapids-singlecell"',
               f'name = "rapids-singlecell-cu{cuda}"',
           )
           # Rename matching extra to "rapids", remove the other
-          text = text.replace(f'rapids-cu{cuda} =', 'rapids =')
-          # Remove the other CUDA extra line entirely
-          lines = text.splitlines(keepends=True)
-          text = "".join(l for l in lines if f'rapids-cu{other}' not in l)
+          text = text.replace(f'rapids-cu{cuda} = [', 'rapids = [')
+          text = remove_toml_array(text, f"rapids-cu{other}")
+
+          # CMake links CUDA extensions against librmm.
+          # Add the matching wheel to isolated build requirements.
+          for dep in (
+              f' "librmm-cu{other}>=25.12",\n',
+              f' "rmm-cu{other}>=25.12",\n',
+          ):
+              text = text.replace(dep, "")
+          rmm_build_req = f' "librmm-cu{cuda}>=25.12",\n'
+          build_system_text = text.split("[project]", 1)[0]
+          if f'"librmm-cu{cuda}>=25.12"' not in build_system_text:
+              text = text.replace(
+                  ']\nbuild-backend = "scikit_build_core.build"',
+                  f'{rmm_build_req}]\nbuild-backend = "scikit_build_core.build"',
+                  1,
+              )
 
           # Set CUDA architectures (replace "native" with CI target archs)
           text = text.replace(
@@ -96,6 +126,7 @@ jobs:
 
       - name: Sanity check pyproject.toml
         run: |
+          python3 -c "import tomllib; tomllib.load(open('pyproject.toml', 'rb'))"
           grep -E "name|rapids|CUDA_ARCH" pyproject.toml
 
       - name: Build CUDA manylinux image
@@ -116,18 +147,25 @@ jobs:
             LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
             PATH=/usr/local/cuda/bin:$PATH
           CIBW_BEFORE_BUILD: >
+            rm -f build/.librmm_dir &&
+            mkdir -p build &&
             python -m pip install -U pip
             scikit-build-core cmake ninja nanobind
+            librmm-cu${{ matrix.cuda_major }} &&
+            RMM_ROOT=$(python -c "import librmm; print(librmm.__path__[0])") &&
+            LOG_ROOT=$(python -c "import rapids_logger; print(rapids_logger.__path__[0])") &&
+            echo "[rsc-build] librmm=$RMM_ROOT" &&
+            echo "[rsc-build] rapids_logger=$LOG_ROOT" &&
+            ln -sf "$RMM_ROOT/lib64/librmm.so" /usr/local/lib/librmm.so &&
+            ln -sf "$LOG_ROOT/lib64/librapids_logger.so" /usr/local/lib/librapids_logger.so &&
+            ldconfig &&
+            python -c "import librmm; print(librmm.__path__[0])" > build/.librmm_dir &&
+            echo "[rsc-build] marker=$(cat build/.librmm_dir)"
           CIBW_TEST_SKIP: "*"
           CIBW_TEST_COMMAND: ""
-          # Exclude CUDA libs by SONAME glob (auditwheel >=6.2): the runtime
-          # stack (CuPy / nvidia-* wheels) provides them. Globs are version
-          # agnostic -- cusolver's SONAME is libcusolver.so.11 on CUDA 12 but
-          # .12 on CUDA 13, and nvJitLink is .12 vs .13, so pinning to the CUDA
-          # major would graft the wrong (or no) lib. cusolver's transitive deps
-          # (cublasLt, cusparse ~186MB, nvJitLink) are reached by auditwheel's
-          # tree walk and must each be excluded or they bloat the wheel.
-          CIBW_REPAIR_WHEEL_COMMAND: "auditwheel repair --exclude 'libcublas.so.*' --exclude 'libcublasLt.so.*' --exclude 'libcudart.so.*' --exclude 'libcusolver.so.*' --exclude 'libcusparse.so.*' --exclude 'libnvJitLink.so.*' -w {dest_dir} {wheel}"
+          # Exclude CUDA/RAPIDS runtime libs provided by dependency wheels.
+          # Use SONAME globs so CUDA 12/13 suffix changes do not bundle them.
+          CIBW_REPAIR_WHEEL_COMMAND: "auditwheel repair --exclude 'libcublas.so.*' --exclude 'libcublasLt.so.*' --exclude 'libcudart.so.*' --exclude 'libcusolver.so.*' --exclude 'libcusparse.so.*' --exclude 'libnvJitLink.so.*' --exclude librmm.so --exclude librapids_logger.so -w {dest_dir} {wheel}"
           CIBW_BUILD_VERBOSITY: "1"
 
       - uses: actions/upload-artifact@v7
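
The effect of the `--exclude` patterns in `CIBW_REPAIR_WHEEL_COMMAND` can be illustrated with Python's `fnmatch`. This is only a sketch: it assumes auditwheel's exclude matching behaves like shell-style globbing. The cusolver and nvJitLink suffixes are taken from the removed comment; `librmm.so.26` and `libstdc++.so.6` are made-up probe inputs.

```python
from fnmatch import fnmatchcase

# Patterns copied from the new CIBW_REPAIR_WHEEL_COMMAND.
EXCLUDES = [
    "libcublas.so.*",
    "libcublasLt.so.*",
    "libcudart.so.*",
    "libcusolver.so.*",
    "libcusparse.so.*",
    "libnvJitLink.so.*",
    "librmm.so",
    "librapids_logger.so",
]


def is_excluded(soname):
    """Return True if any exclude pattern matches the SONAME."""
    return any(fnmatchcase(soname, pattern) for pattern in EXCLUDES)


probes = [
    "libcusolver.so.11",   # CUDA 12 suffix (from the removed comment)
    "libcusolver.so.12",   # CUDA 13 suffix (from the removed comment)
    "libnvJitLink.so.13",
    "librmm.so",
    "librmm.so.26",        # hypothetical versioned name, not matched
    "libstdc++.so.6",      # unrelated library, not matched
]
results = {name: is_excluded(name) for name in probes}
for name, excluded in results.items():
    print(f"{name}: {excluded}")
```

The two RAPIDS entries are plain names rather than globs, so under this assumption they match only an unversioned SONAME such as `librmm.so`.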

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -54,3 +54,4 @@ AGENTS.md
 
 # tmp_scripts
 tmp_scripts/
+/benchmarks/

CMakeLists.txt

Lines changed: 178 additions & 1 deletion
@@ -14,6 +14,130 @@ if (RSC_BUILD_EXTENSIONS)
   find_package(Python REQUIRED COMPONENTS Interpreter Development.Module ${SKBUILD_SABI_COMPONENT})
   find_package(nanobind CONFIG REQUIRED)
   find_package(CUDAToolkit REQUIRED)
+  set(RSC_RMM_HINTS)
+  set(RSC_RAPIDS_CMAKE_PREFIXES)
+  set(RSC_CCCL_HINTS)
+  set(RSC_RAPIDS_LOGGER_HINTS)
+  set(RSC_NVTX3_HINTS)
+  macro(_rsc_collect_rapids_python_prefix _rsc_prefix)
+    if (NOT "${_rsc_prefix}" STREQUAL "")
+      file(GLOB _rsc_rmm_dirs "${_rsc_prefix}/lib/python*/site-packages/librmm/lib64/cmake/rmm")
+      file(GLOB _rsc_rapids_prefixes
+        "${_rsc_prefix}/lib/python*/site-packages/librmm/lib64"
+        "${_rsc_prefix}/lib/python*/site-packages/librmm/lib64/rapids"
+        "${_rsc_prefix}/lib/python*/site-packages/rapids_logger/lib64"
+        "${_rsc_prefix}/lib/python*/site-packages/nvidia/cu*/lib"
+      )
+      file(GLOB _rsc_cccl_dirs
+        "${_rsc_prefix}/lib/python*/site-packages/librmm/lib64/rapids/cmake/cccl"
+        "${_rsc_prefix}/lib/python*/site-packages/nvidia/cu*/lib/cmake/cccl"
+      )
+      file(GLOB _rsc_rapids_logger_dirs "${_rsc_prefix}/lib/python*/site-packages/rapids_logger/lib64/cmake/rapids_logger")
+      file(GLOB _rsc_nvtx3_dirs "${_rsc_prefix}/lib/python*/site-packages/librmm/lib64/cmake/nvtx3")
+      list(APPEND RSC_RMM_HINTS ${_rsc_rmm_dirs})
+      list(APPEND RSC_RAPIDS_CMAKE_PREFIXES ${_rsc_rapids_prefixes})
+      list(APPEND RSC_CCCL_HINTS ${_rsc_cccl_dirs})
+      list(APPEND RSC_RAPIDS_LOGGER_HINTS ${_rsc_rapids_logger_dirs})
+      list(APPEND RSC_NVTX3_HINTS ${_rsc_nvtx3_dirs})
+    endif()
+  endmacro()
+  execute_process(
+    COMMAND "${Python_EXECUTABLE}" -c "import importlib.util, pathlib; spec = importlib.util.find_spec('librmm'); print(pathlib.Path(spec.origin).parent / 'lib64' / 'cmake' / 'rmm' if spec else '')"
+    OUTPUT_VARIABLE RSC_PYTHON_RMM_DIR
+    OUTPUT_STRIP_TRAILING_WHITESPACE
+    ERROR_QUIET
+  )
+  if (RSC_PYTHON_RMM_DIR AND EXISTS "${RSC_PYTHON_RMM_DIR}/rmm-config.cmake")
+    set(_rsc_python_rmm_hint "${RSC_PYTHON_RMM_DIR}")
+  else()
+    set(_rsc_python_rmm_hint "")
+  endif()
+  # Wheel builds write build/.librmm_dir from CIBW_BEFORE_BUILD.
+  # publish.yml symlinks runtime libs so auditwheel excludes them.
+  if(DEFINED ENV{RSC_LIBRMM_DIR} AND EXISTS "$ENV{RSC_LIBRMM_DIR}/lib64/cmake/rmm/rmm-config.cmake")
+    set(_rsc_librmm_marker "$ENV{RSC_LIBRMM_DIR}")
+  elseif(EXISTS "${CMAKE_SOURCE_DIR}/build/.librmm_dir")
+    file(READ "${CMAKE_SOURCE_DIR}/build/.librmm_dir" _rsc_librmm_marker)
+    string(STRIP "${_rsc_librmm_marker}" _rsc_librmm_marker)
+  else()
+    set(_rsc_librmm_marker "")
+  endif()
+  if(NOT "${_rsc_librmm_marker}" STREQUAL "" AND EXISTS "${_rsc_librmm_marker}/lib64/cmake/rmm/rmm-config.cmake")
+    file(GLOB _rsc_marker_rmm_dirs "${_rsc_librmm_marker}/lib64/cmake/rmm")
+    file(GLOB _rsc_marker_rapids_prefixes
+      "${_rsc_librmm_marker}/lib64"
+      "${_rsc_librmm_marker}/lib64/rapids"
+      "${_rsc_librmm_marker}/../rapids_logger/lib64"
+    )
+    file(GLOB _rsc_marker_cccl_dirs
+      "${_rsc_librmm_marker}/lib64/rapids/cmake/cccl"
+    )
+    file(GLOB _rsc_marker_rapids_logger_dirs "${_rsc_librmm_marker}/../rapids_logger/lib64/cmake/rapids_logger")
+    file(GLOB _rsc_marker_nvtx3_dirs "${_rsc_librmm_marker}/lib64/cmake/nvtx3")
+    list(APPEND RSC_RMM_HINTS ${_rsc_marker_rmm_dirs})
+    list(APPEND RSC_RAPIDS_CMAKE_PREFIXES ${_rsc_marker_rapids_prefixes})
+    list(APPEND RSC_CCCL_HINTS ${_rsc_marker_cccl_dirs})
+    list(APPEND RSC_RAPIDS_LOGGER_HINTS ${_rsc_marker_rapids_logger_dirs})
+    list(APPEND RSC_NVTX3_HINTS ${_rsc_marker_nvtx3_dirs})
+  endif()
+  foreach(_rsc_python_prefix IN ITEMS "${Python_ROOT_DIR}" "${Python3_ROOT_DIR}")
+    _rsc_collect_rapids_python_prefix("${_rsc_python_prefix}")
+  endforeach()
+  foreach(_rsc_env_prefix IN ITEMS "$ENV{CONDA_PREFIX}" "$ENV{VIRTUAL_ENV}")
+    _rsc_collect_rapids_python_prefix("${_rsc_env_prefix}")
+  endforeach()
+  string(REPLACE ":" ";" _rsc_path_entries "$ENV{PATH}")
+  foreach(_rsc_path_entry IN LISTS _rsc_path_entries)
+    get_filename_component(_rsc_path_prefix "${_rsc_path_entry}/.." ABSOLUTE)
+    _rsc_collect_rapids_python_prefix("${_rsc_path_prefix}")
+  endforeach()
+  if (NOT RSC_RMM_HINTS
+      AND NOT "${_rsc_python_rmm_hint}" STREQUAL "")
+    list(APPEND RSC_RMM_HINTS "${_rsc_python_rmm_hint}")
+  endif()
+  if (RSC_RAPIDS_CMAKE_PREFIXES)
+    list(APPEND CMAKE_PREFIX_PATH ${RSC_RAPIDS_CMAKE_PREFIXES})
+    if (RSC_CCCL_HINTS)
+      list(GET RSC_CCCL_HINTS 0 _rsc_cccl_dir)
+      set(CCCL_DIR "${_rsc_cccl_dir}" CACHE PATH "Path to CCCL package config" FORCE)
+    endif()
+    if (RSC_RAPIDS_LOGGER_HINTS)
+      list(GET RSC_RAPIDS_LOGGER_HINTS 0 _rsc_rapids_logger_dir)
+      set(rapids_logger_DIR "${_rsc_rapids_logger_dir}" CACHE PATH "Path to rapids_logger package config" FORCE)
+    endif()
+    if (RSC_NVTX3_HINTS)
+      list(GET RSC_NVTX3_HINTS 0 _rsc_nvtx3_dir)
+      set(nvtx3_DIR "${_rsc_nvtx3_dir}" CACHE PATH "Path to nvtx3 package config" FORCE)
+    endif()
+  endif()
+  if (RSC_RMM_HINTS)
+    list(GET RSC_RMM_HINTS 0 _rsc_rmm_dir)
+    set(rmm_DIR "${_rsc_rmm_dir}" CACHE PATH "Path to rmm package config" FORCE)
+    find_package(rmm CONFIG REQUIRED)
+  else()
+    find_package(rmm CONFIG REQUIRED)
+  endif()
+
+  # CCCL 3.3.0 gates cudaDevAttrHostNumaMemoryPoolsSupported too loosely.
+  # Fail fast for CUDA 12.6-12.8 source builds with that buggy CCCL.
+  set(_rsc_cccl_buggy_numa_guard TRUE)
+  if (DEFINED CCCL_VERSION AND CCCL_VERSION VERSION_GREATER 3.3.0)
+    set(_rsc_cccl_buggy_numa_guard FALSE)
+  endif()
+  if (NOT RSC_SKIP_CUDA_VERSION_CHECK
+      AND _rsc_cccl_buggy_numa_guard
+      AND CUDAToolkit_VERSION VERSION_GREATER_EQUAL 12.6
+      AND CUDAToolkit_VERSION VERSION_LESS 12.9)
+    message(FATAL_ERROR
+      "Cannot build rapids_singlecell from source with CUDA ${CUDAToolkit_VERSION} against "
+      "CCCL ${CCCL_VERSION} (RAPIDS 26.04): it references cudaDevAttrHostNumaMemoryPoolsSupported, "
+      "which the CUDA 12.6-12.8 toolkit does not define (NVIDIA added it in 12.9). "
+      "Use CUDA >= 12.9 (or <= 12.5), upgrade to RAPIDS >= 26.06 (CCCL > 3.3.0 fixes the guard), "
+      "or install the prebuilt wheel (pip install rapids-singlecell-cu12). "
+      "If your toolkit does define this enum, override with -DRSC_SKIP_CUDA_VERSION_CHECK=ON.")
+  endif()
+
+  message(STATUS "Using RMM for CUDA extension scratch allocations")
   message(STATUS "Building for CUDA architectures: ${CMAKE_CUDA_ARCHITECTURES}")
 else()
   message(STATUS "RSC_BUILD_EXTENSIONS=OFF -> skipping compiled extensions for docs")
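
The fail-fast condition added in this hunk (the CCCL/CUDA version guard) can be restated as a small truth function. The sketch below is a Python paraphrase of the CMake logic, not code from the repository; `parse_version` and `must_abort` are hypothetical names, and versions are padded to three components to approximate CMake's `VERSION_*` comparisons.

```python
def parse_version(version):
    """Turn '12.8' or '12.8.1' into a tuple padded to three components."""
    parts = [int(p) for p in version.split(".")]
    return tuple(parts + [0] * (3 - len(parts)))


def must_abort(cuda_version, cccl_version=None, skip_check=False):
    """Mirror the FATAL_ERROR condition from the CMake hunk.

    cccl_version=None stands in for CCCL_VERSION being undefined, in which
    case the guard is conservatively treated as buggy.
    """
    buggy_guard = True
    if cccl_version is not None and parse_version(cccl_version) > (3, 3, 0):
        buggy_guard = False
    cuda = parse_version(cuda_version)
    in_affected_range = (12, 6, 0) <= cuda < (12, 9, 0)
    return (not skip_check) and buggy_guard and in_affected_range


print(must_abort("12.8.0", "3.3.0"))                   # True: affected toolkit
print(must_abort("12.9.1", "3.3.0"))                   # False: toolkit defines the enum
print(must_abort("12.8.0", "3.4.0"))                   # False: newer CCCL
print(must_abort("12.8.0", "3.3.0", skip_check=True))  # False: explicit override
```

Under this reading, moving the CI images from 12.8.0 to 12.9.1 takes the cu12 build out of the affected range.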
@@ -62,6 +186,57 @@ function(add_nb_cuda_module target src)
   endif()
 endfunction()
 
+# RMM-backed nanobind CUDA module: normal module plus shared scratch allocator.
+# Wheels use sibling RAPIDS packages; editable imports still preload fallbacks.
+function(add_rmm_cuda_module target src)
+  add_nb_cuda_module(${target} ${src})
+  if (RSC_BUILD_EXTENSIONS)
+    target_sources(${target} PRIVATE
+      src/rapids_singlecell/_cuda/rmm_scratch.cu)
+    target_link_libraries(${target} PRIVATE rmm::rmm)
+    set(_rsc_rmm_build_rpath)
+    set(_rsc_rmm_have_build_librmm FALSE)
+    set(_rsc_rmm_have_build_rapids_logger FALSE)
+    if (DEFINED ENV{CONDA_PREFIX})
+      set(_rsc_rmm_env_site
+        "$ENV{CONDA_PREFIX}/lib/python${Python_VERSION_MAJOR}.${Python_VERSION_MINOR}/site-packages")
+      if (EXISTS "${_rsc_rmm_env_site}/librmm/lib64")
+        list(APPEND _rsc_rmm_build_rpath
+          "${_rsc_rmm_env_site}/librmm/lib64")
+        set(_rsc_rmm_have_build_librmm TRUE)
+      endif()
+      if (EXISTS "${_rsc_rmm_env_site}/rapids_logger/lib64")
+        list(APPEND _rsc_rmm_build_rpath
+          "${_rsc_rmm_env_site}/rapids_logger/lib64")
+        set(_rsc_rmm_have_build_rapids_logger TRUE)
+      endif()
+    endif()
+    if (NOT _rsc_rmm_have_build_librmm AND rmm_DIR)
+      get_filename_component(_rsc_rmm_build_librmm_dir
+        "${rmm_DIR}/../.." REALPATH)
+      list(APPEND _rsc_rmm_build_rpath "${_rsc_rmm_build_librmm_dir}")
+    endif()
+    if (NOT _rsc_rmm_have_build_rapids_logger AND rapids_logger_DIR)
+      get_filename_component(_rsc_rmm_build_rapids_logger_dir
+        "${rapids_logger_DIR}/../.." REALPATH)
+      list(APPEND _rsc_rmm_build_rpath
+        "${_rsc_rmm_build_rapids_logger_dir}")
+    endif()
+    set(_rsc_rmm_install_rpath
+      "\$ORIGIN/../../librmm/lib64"
+      "\$ORIGIN/../../rapids_logger/lib64"
+    )
+    if (CUDAToolkit_LIBRARY_DIR)
+      list(APPEND _rsc_rmm_build_rpath "${CUDAToolkit_LIBRARY_DIR}")
+      list(APPEND _rsc_rmm_install_rpath "${CUDAToolkit_LIBRARY_DIR}")
+    endif()
+    set_target_properties(${target} PROPERTIES
+      BUILD_RPATH "${_rsc_rmm_build_rpath}"
+      INSTALL_RPATH "${_rsc_rmm_install_rpath}"
+    )
+  endif()
+endfunction()
+
 if (RSC_BUILD_EXTENSIONS)
   # CUDA modules
   add_nb_cuda_module(_mean_var_cuda src/rapids_singlecell/_cuda/mean_var/mean_var.cu)
@@ -91,7 +266,9 @@ if (RSC_BUILD_EXTENSIONS)
   add_nb_cuda_module(_pseudobulk_cuda src/rapids_singlecell/_cuda/pseudobulk/pseudobulk.cu)
   add_nb_cuda_module(_hvg_cuda src/rapids_singlecell/_cuda/hvg/hvg.cu)
   add_nb_cuda_module(_kde_cuda src/rapids_singlecell/_cuda/kde/kde.cu)
-  add_nb_cuda_module(_wilcoxon_cuda src/rapids_singlecell/_cuda/wilcoxon/wilcoxon.cu)
+  add_rmm_cuda_module(_wilcoxon_cuda src/rapids_singlecell/_cuda/wilcoxon/wilcoxon.cu)
+  add_rmm_cuda_module(_wilcoxon_sparse_cuda src/rapids_singlecell/_cuda/wilcoxon/wilcoxon_sparse.cu)
+  add_nb_cuda_module(_rank_stats_cuda src/rapids_singlecell/_cuda/rank_genes/rank_stats.cu)
   # Harmony CUDA modules
   add_nb_cuda_module(_harmony_scatter_cuda src/rapids_singlecell/_cuda/harmony/scatter/scatter.cu)
   add_nb_cuda_module(_harmony_outer_cuda src/rapids_singlecell/_cuda/harmony/outer/outer.cu)
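
The `$ORIGIN`-relative `INSTALL_RPATH` entries in `add_rmm_cuda_module` only work if the extension sits two directories below `site-packages`. The sketch below expands them lexically for a hypothetical install path. The `/env/...` prefix and the assumption that the module is installed under `rapids_singlecell/_cuda/` are illustrative; the diff does not show the install destination, and the real dynamic loader resolves paths on the filesystem rather than lexically.

```python
import posixpath

# Entries copied from the INSTALL_RPATH set in add_rmm_cuda_module.
INSTALL_RPATH = [
    "$ORIGIN/../../librmm/lib64",
    "$ORIGIN/../../rapids_logger/lib64",
]


def resolve_rpath(module_path, rpath_entries):
    """Expand $ORIGIN to the module's directory and normalize each entry."""
    origin = posixpath.dirname(module_path)
    return [
        posixpath.normpath(entry.replace("$ORIGIN", origin))
        for entry in rpath_entries
    ]


# Hypothetical location of the installed extension module.
module = "/env/lib/python3.13/site-packages/rapids_singlecell/_cuda/_wilcoxon_cuda.so"
resolved = resolve_rpath(module, INSTALL_RPATH)
for path in resolved:
    print(path)
```

With that layout the entries land on the sibling `librmm/lib64` and `rapids_logger/lib64` directories inside the same `site-packages`, which matches the comment that wheels use sibling RAPIDS packages.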

conda/rsc_rapids_26.04_cuda12.yml

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@ channels:
 dependencies:
   - rapids=26.04
   - python=3.14
-  - cuda-version=12.8
+  - cuda-version=12.9
   - cudnn
   - cutensor
   - cusparselt

docker/Dockerfile

Lines changed: 14 additions & 1 deletion
@@ -5,6 +5,11 @@ ARG GIT_ID=main
 SHELL ["/bin/bash", "-euo", "pipefail", "-c"]
 
 ENV PATH=/opt/conda/bin:$PATH
+# Point CMake's find_package(rmm) at the conda env. The conda RAPIDS env resolved
+# librmm + cuda-version together, so its librmm/rapids_logger headers match the
+# image's CUDA toolkit. This is what lets the --no-build-isolation build below
+# pick up the CUDA-matched librmm instead of a mismatched PyPI wheel.
+ENV CMAKE_PREFIX_PATH=/opt/conda
 ARG CUDA_ARCHS="75-real;80-real;86-real;89-real;90-real;100-real;120"
 
 RUN <<EOF
@@ -18,5 +23,13 @@ git checkout ${GIT_ID}
 # Set CUDA architectures directly in pyproject.toml (avoids SKBUILD_CMAKE_ARGS semicolon splitting)
 sed -i 's/CMAKE_CUDA_ARCHITECTURES = "native"/CMAKE_CUDA_ARCHITECTURES = "'"${CUDA_ARCHS}"'"/' pyproject.toml
 grep CMAKE_CUDA_ARCHITECTURES pyproject.toml
-/opt/conda/bin/python -m pip install --no-cache-dir -e .
+# Build with --no-build-isolation so the compile uses the conda env's
+# CUDA-matched librmm/rapids_logger headers. With isolation, PEP 517 would pull
+# a fresh librmm-cu12 from PyPI (hardcoded in [build-system].requires) that
+# mismatches the image's CUDA toolkit -> "cudaDevAttr* has no global scope"
+# errors on both cu12 (toolkit older than the latest librmm) and cu13 (wrong
+# cu12 variant). Install the PEP 517 backend deps first since isolation is off;
+# the conda env already provides the librmm/rapids_logger headers + cmake config.
+/opt/conda/bin/python -m pip install --no-cache-dir scikit-build-core nanobind setuptools-scm cmake ninja
+/opt/conda/bin/python -m pip install --no-cache-dir --no-build-isolation -e .
 EOF

docker/Dockerfile.deps

Lines changed: 2 additions & 2 deletions
@@ -1,4 +1,4 @@
-ARG CUDA_VER=13.0.2
+ARG CUDA_VER=13.1.0
 ARG LINUX_VER=ubuntu24.04
 
 FROM nvidia/cuda:${CUDA_VER}-devel-${LINUX_VER}
@@ -7,7 +7,7 @@ SHELL ["/bin/bash", "-euo", "pipefail", "-c"]
 
 ARG PYTHON_VER=3.13
 # Re-declare after FROM so it is available to RUN steps (passed by docker.yml build-args)
-ARG CUDA_VER=13.0.2
+ARG CUDA_VER=13.1.0
 
 ENV PATH=/opt/conda/bin:$PATH
 ENV PYTHON_VERSION=${PYTHON_VER}

docker/docker-push.sh

Lines changed: 1 addition & 1 deletion
@@ -6,7 +6,7 @@ rapids_version=26.04
 
 declare -A cuda_versions=(
     [cu12]="12.8.0"
-    [cu13]="13.0.2"
+    [cu13]="13.1.0"
 )
 
 declare -A cuda_archs=(
