Commit ca6f571

Feature/mixscape (#688)
* add mixscape
* update releasenote
* fix issues
* add mixscale and fix container
* batch and slim down wheel
* fix dtype to 64 bit
* update code
* last few fixes
1 parent 2a4b78c commit ca6f571

19 files changed

Lines changed: 3096 additions & 31 deletions

.github/workflows/publish.yml

Lines changed: 8 additions & 1 deletion
@@ -120,7 +120,14 @@ jobs:
 scikit-build-core cmake ninja nanobind
 CIBW_TEST_SKIP: "*"
 CIBW_TEST_COMMAND: ""
-CIBW_REPAIR_WHEEL_COMMAND: "auditwheel repair --exclude libcublas.so.${{ matrix.cuda_major }} --exclude libcublasLt.so.${{ matrix.cuda_major }} --exclude libcudart.so.${{ matrix.cuda_major }} -w {dest_dir} {wheel}"
+# Exclude CUDA libs by SONAME glob (auditwheel >=6.2): the runtime
+# stack (CuPy / nvidia-* wheels) provides them. Globs are version
+# agnostic -- cusolver's SONAME is libcusolver.so.11 on CUDA 12 but
+# .12 on CUDA 13, and nvJitLink is .12 vs .13, so pinning to the CUDA
+# major would graft the wrong (or no) lib. cusolver's transitive deps
+# (cublasLt, cusparse ~186MB, nvJitLink) are reached by auditwheel's
+# tree walk and must each be excluded or they bloat the wheel.
+CIBW_REPAIR_WHEEL_COMMAND: "auditwheel repair --exclude 'libcublas.so.*' --exclude 'libcublasLt.so.*' --exclude 'libcudart.so.*' --exclude 'libcusolver.so.*' --exclude 'libcusparse.so.*' --exclude 'libnvJitLink.so.*' -w {dest_dir} {wheel}"
 CIBW_BUILD_VERBOSITY: "1"

 - uses: actions/upload-artifact@v7
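
The workflow comment argues that glob excludes are version agnostic while excludes pinned to the CUDA major are not. A minimal sketch of that difference follows, assuming the `--exclude` patterns behave like the shell-style globs in Python's `fnmatch`. The helpers `is_excluded` and `pinned_excludes` are hypothetical and not part of the commit; the SONAMEs are the ones named in the comment.

```python
from fnmatch import fnmatchcase

# Glob patterns copied from the new CIBW_REPAIR_WHEEL_COMMAND.
GLOB_EXCLUDES = [
    "libcublas.so.*",
    "libcublasLt.so.*",
    "libcudart.so.*",
    "libcusolver.so.*",
    "libcusparse.so.*",
    "libnvJitLink.so.*",
]


def is_excluded(soname, patterns):
    """Hypothetical helper: True if the SONAME matches any exclude pattern."""
    return any(fnmatchcase(soname, pattern) for pattern in patterns)


def pinned_excludes(cuda_major):
    """Hypothetical excludes pinned to the CUDA major, as in the old command.

    cusolver is added here only to illustrate the mismatch the comment describes.
    """
    return [
        f"libcublas.so.{cuda_major}",
        f"libcusolver.so.{cuda_major}",
    ]


# SONAMEs stated in the workflow comment.
cusolver_on_cuda12 = "libcusolver.so.11"
cusolver_on_cuda13 = "libcusolver.so.12"

# The glob matches both CUDA majors.
print(is_excluded(cusolver_on_cuda12, GLOB_EXCLUDES))
print(is_excluded(cusolver_on_cuda13, GLOB_EXCLUDES))

# Pinning to the CUDA major (12) misses libcusolver.so.11.
print(is_excluded(cusolver_on_cuda12, pinned_excludes(12)))
```

Each pattern covers exactly one library: `libcublas.so.*` does not match `libcublasLt.so.12`, which is why every transitive dependency gets its own `--exclude`.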

CMakeLists.txt

Lines changed: 2 additions & 0 deletions
@@ -73,6 +73,8 @@ if (RSC_BUILD_EXTENSIONS)
 add_nb_cuda_module(_norm_cuda src/rapids_singlecell/_cuda/norm/norm.cu)
 add_nb_cuda_module(_gmm_cuda src/rapids_singlecell/_cuda/gmm/gmm.cu)
 target_link_libraries(_gmm_cuda PRIVATE CUDA::cublas)
+target_link_libraries(_gmm_cuda PRIVATE CUDA::cusolver)
+add_nb_cuda_module(_mixscale_cuda src/rapids_singlecell/_cuda/mixscale/mixscale.cu)
 add_nb_cuda_module(_pr_cuda src/rapids_singlecell/_cuda/pr/pr.cu)
 add_nb_cuda_module(_nn_descent_cuda src/rapids_singlecell/_cuda/nn_descent/nn_descent.cu)
 add_nb_cuda_module(_aucell_cuda src/rapids_singlecell/_cuda/aucell/aucell.cu)

docker/manylinux_2_28_aarch64_cuda12.2.Dockerfile

Lines changed: 4 additions & 1 deletion
@@ -14,7 +14,10 @@ RUN yum -y install dnf-plugins-core && \
 libcublas-12-2 \
 libcublas-devel-12-2 \
 libcusparse-12-2 \
-libcusparse-devel-12-2 && \
+libcusparse-devel-12-2 \
+libcusolver-12-2 \
+libcusolver-devel-12-2 \
+libnvjitlink-12-2 && \
 yum clean all

 ENV CUDA_HOME=/usr/local/cuda

docker/manylinux_2_28_aarch64_cuda13.0.Dockerfile

Lines changed: 4 additions & 1 deletion
@@ -10,7 +10,10 @@ RUN yum -y install dnf-plugins-core && \
 libcublas-13-0 \
 libcublas-devel-13-0 \
 libcusparse-13-0 \
-libcusparse-devel-13-0 && \
+libcusparse-devel-13-0 \
+libcusolver-13-0 \
+libcusolver-devel-13-0 \
+libnvjitlink-13-0 && \
 yum clean all

 ENV CUDA_HOME=/usr/local/cuda

docker/manylinux_2_28_x86_64_cuda12.2.Dockerfile

Lines changed: 7 additions & 2 deletions
@@ -8,15 +8,20 @@ RUN yum -y install gcc-toolset-12-gcc gcc-toolset-12-gcc-c++ && \
 RUN yum -y install dnf-plugins-core && \
 dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel8/x86_64/cuda-rhel8.repo && \
 yum -y clean all && yum -y makecache && \
-# Install only what you actually link against
+# Install what you link against, plus cusolver's runtime nvJitLink dependency
+# (cusolver's ELF NEEDs libnvJitLink but its RPM does not declare it, so it
+# must be installed explicitly or the nanobind stub-generation import fails).
 yum -y install \
 cuda-nvcc-12-2 \
 cuda-cudart-12-2 \
 cuda-cudart-devel-12-2 \
 libcublas-12-2 \
 libcublas-devel-12-2 \
 libcusparse-12-2 \
-libcusparse-devel-12-2 && \
+libcusparse-devel-12-2 \
+libcusolver-12-2 \
+libcusolver-devel-12-2 \
+libnvjitlink-12-2 && \
 yum clean all

 ENV CUDA_HOME=/usr/local/cuda
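
The Dockerfile comment notes that cusolver needs libnvJitLink at load time even though its RPM does not declare the dependency. One way to catch such a gap early is a pre-flight check that asks the dynamic loader to open each library. The sketch below is illustrative only: `missing_libraries` is a hypothetical helper, and the commit does not say that the build runs any such check. The CUDA 12 SONAMEs are the ones given in the workflow comment.

```python
import ctypes


def missing_libraries(sonames):
    """Return the SONAMEs that the dynamic loader cannot resolve.

    Hypothetical helper: ctypes.CDLL raises OSError when a library, or one
    of the libraries it NEEDs, cannot be found.
    """
    missing = []
    for name in sonames:
        try:
            ctypes.CDLL(name)
        except OSError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Assumed SONAMEs for a CUDA 12 image, per the workflow comment.
    required = ["libcusolver.so.11", "libnvJitLink.so.12"]
    print(missing_libraries(required))
```

On a machine without the CUDA libraries installed, the script lists both names as missing. Inside a correctly provisioned image it would print an empty list.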

docker/manylinux_2_28_x86_64_cuda13.0.Dockerfile

Lines changed: 4 additions & 1 deletion
@@ -12,7 +12,10 @@ RUN yum -y install dnf-plugins-core && \
 libcublas-13-0 \
 libcublas-devel-13-0 \
 libcusparse-13-0 \
-libcusparse-devel-13-0 && \
+libcusparse-devel-13-0 \
+libcusolver-13-0 \
+libcusolver-devel-13-0 \
+libnvjitlink-13-0 && \
 yum clean all

 ENV CUDA_HOME=/usr/local/cuda

docs/api/pertpy_gpu.md

Lines changed: 32 additions & 0 deletions
@@ -73,3 +73,35 @@
 .. automethod:: assign_mixture_model
 :no-index:
 ```
+
+## Mixscape
+
+```{eval-rst}
+.. autosummary::
+:toctree: generated
+
+Mixscape
+```
+
+```{eval-rst}
+.. autoclass:: Mixscape
+:no-index:
+
+.. rubric:: Methods
+
+.. autosummary::
+
+~Mixscape.perturbation_signature
+~Mixscape.mixscape
+~Mixscape.mixscale
+~Mixscape.lda
+
+.. automethod:: perturbation_signature
+:no-index:
+.. automethod:: mixscape
+:no-index:
+.. automethod:: mixscale
+:no-index:
+.. automethod:: lda
+:no-index:
+```

docs/release-notes/0.15.3.md

Lines changed: 5 additions & 0 deletions
@@ -2,4 +2,9 @@

 ```{rubric} Features
 ```
+* Add {class}`~rapids_singlecell.ptg.Mixscape` for GPU-accelerated Mixscape (`perturbation_signature`, `mixscape`, `mixscale`, `lda`) {pr}`688` {smaller}`S Dicks`
+
+```{rubric} Performance
+```
+* Batched cuSOLVER precision-Cholesky for the full-covariance GMM (`gmm_fit_predict`), ~2-3x faster {pr}`688` {smaller}`S Dicks`
 * {class}`~rapids_singlecell.ptg.Distance` with ``metric="edistance"`` now accepts sparse CSR input (a sparse layer or ``layer_key="X"``), densified inside the CUDA kernel so the dense matrix is never materialized on the GPU {pr}`689` {smaller}`S Dicks`

src/rapids_singlecell/_cuda/__init__.py

Lines changed: 1 addition & 0 deletions
@@ -35,6 +35,7 @@
 "_kde_cuda",
 "_ligrec_cuda",
 "_mean_var_cuda",
+"_mixscale_cuda",
 "_nanmean_cuda",
 "_nn_descent_cuda",
 "_norm_cuda",
