I need to use Dask arrays and out-of-memory operations for all of my analysis, including pseudobulking.
However, when I pass a Dask array to rsc.get.aggregate, I get the following error:
aggregated = rsc.get.aggregate(adata, by=["lvl_2", 'sample_id'], func=["sum", "count_nonzero"])
Traceback (most recent call last):
  File "", line 1, in
  File "/opt/mamba/envs/newrapids/lib/python3.12/site-packages/rapids_singlecell/get/_aggregated.py", line 419, in aggregate
    _check_gpu_X(data)
  File "/opt/mamba/envs/newrapids/lib/python3.12/site-packages/rapids_singlecell/preprocessing/_utils.py", line 277, in _check_gpu_X
    raise TypeError(
TypeError: The input is a DaskArray. Rapids-singlecell doesn't support DaskArray in this function, so your input must be a CuPy ndarray or a CuPy sparse matrix.
The major benefit of using RAPIDS is that I can run out-of-memory operations on the GPU. Otherwise, I might as well use a cheaper CPU instance and fully parallelize the pseudobulk operations.
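For comparison, the CPU fallback could be sketched roughly as below. This is only a minimal illustration of chunk-wise pseudobulking with NumPy, not rapids-singlecell code: the helper name pseudobulk_chunked, the integer group codes, and the toy matrix are all assumptions made up for the example. The idea is that only one block of cells is in memory at a time, while the per-group "sum" and "count_nonzero" accumulators stay small.

```python
import numpy as np


def pseudobulk_chunked(chunks, labels, n_groups):
    """Accumulate per-group sum and count_nonzero over row chunks.

    Hypothetical helper for illustration only.
    chunks   : iterable of 2D arrays (cells x genes), in row order
    labels   : 1D integer array of group codes, one per cell
    n_groups : number of distinct group codes
    """
    sums = None
    nonzero = None
    start = 0
    for block in chunks:
        block = np.asarray(block)
        codes = labels[start:start + block.shape[0]]
        if sums is None:
            # Allocate the (groups x genes) accumulators on the first chunk.
            sums = np.zeros((n_groups, block.shape[1]), dtype=np.float64)
            nonzero = np.zeros((n_groups, block.shape[1]), dtype=np.int64)
        # Unbuffered scatter-add: rows sharing a group code are all added.
        np.add.at(sums, codes, block)
        np.add.at(nonzero, codes, (block != 0).astype(np.int64))
        start += block.shape[0]
    return sums, nonzero


# Toy data: 4 cells x 2 genes, two groups, processed as two chunks.
X = np.array([[1.0, 0.0], [2.0, 3.0], [0.0, 4.0], [5.0, 0.0]])
labels = np.array([0, 1, 0, 1])
sums, nonzero = pseudobulk_chunked([X[:2], X[2:]], labels, n_groups=2)
```

With a Dask array, the chunks would presumably come from computing one row block at a time, and the combined lvl_2 / sample_id key would first need to be encoded as integer codes. That is exactly the bookkeeping I was hoping get.aggregate could handle on the GPU.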
When will rapids-singlecell fully support Dask arrays?