To extend the above functionality to a new array type, you should use the types and implement the interfaces listed on this page. GPUArrays is designed around having two different array types to represent a GPU array: one that exists only on the host, and one that actually can be instantiated on the device (i.e. in kernels). Device functionality is then handled by KernelAbstractions.jl.
You should provide an array type that builds on the AbstractGPUArray supertype, such as:
```julia
mutable struct CustomArray{T, N} <: AbstractGPUArray{T, N}
    data::DataRef{Vector{UInt8}}
    offset::Int
    dims::Dims{N}
    ...
end
```
This will allow your defined type (in this case `CustomArray`) to use the GPUArrays interface where available.
To be able to actually use the functionality that is defined for AbstractGPUArrays, you need to define the backend, like so:
```julia
import KernelAbstractions

# The backend is a singleton type; it carries no fields.
struct CustomBackend <: KernelAbstractions.GPU end

# Map every CustomArray to its backend.
KernelAbstractions.get_backend(::CustomArray) = CustomBackend()
```

There are numerous examples of this interface in practice, such as JLArrays, CuArrays, and ROCArrays.
A sparse array can't share the AbstractGPUArray supertype — that is a DenseArray,
whereas a sparse array must be an AbstractSparseArray — so GPUArrays keeps a parallel
sparse hierarchy with its own generic functionality. Integrating a back-end has three
parts: the storage types it provides, the methods it implements to plug them in, and the
functionality it then gets for free.
For the storage types, provide one `mutable struct` per supported format, subtyping the matching abstract type and using the conventional field names (generic code reads them directly):
| supertype | fields |
|---|---|
| `AbstractGPUSparseVector{Tv,Ti}` | `iPtr`, `nzVal`, `len`, `nnz` |
| `AbstractGPUSparseMatrixCSC{Tv,Ti}` | `colPtr`, `rowVal`, `nzVal`, `dims`, `nnz` |
| `AbstractGPUSparseMatrixCSR{Tv,Ti}` | `rowPtr`, `colVal`, `nzVal`, `dims`, `nnz` |
| `AbstractGPUSparseMatrixCOO{Tv,Ti}` | `rowInd`, `colInd`, `nzVal`, `dims`, `nnz` |
The pointer/index/value arrays are the back-end's own dense vector type. Provide only the formats you need, but note that several generic operations route through COO.
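As an illustration of what the fields in the table hold, the following host-only sketch extracts the CSC, COO, and CSR component arrays of a small matrix using the `SparseArrays` standard library. It does not use GPUArrays; the local variable names simply mirror the conventional field names, and obtaining CSR as the CSC layout of the transpose is a choice made for this example.

```julia
using SparseArrays

# Example matrix with four stored entries:
#   [10  0  0
#     0 20 30
#     0  0 40]
A = sparse([1, 2, 2, 3], [1, 2, 3, 3], [10.0, 20.0, 30.0, 40.0], 3, 3)

# CSC: one pointer per column (plus one), then row indices and values
# stored column by column.
colPtr = SparseArrays.getcolptr(A)
rowVal = rowvals(A)
nzValCSC = nonzeros(A)

# COO: one (row, column, value) triple per stored entry.
rowInd, colInd, nzValCOO = findnz(A)

# CSR: the CSC layout of the transposed matrix, read as row pointers
# and column indices of the original.
At = copy(transpose(A))
rowPtr = SparseArrays.getcolptr(At)
colVal = rowvals(At)
nzValCSR = nonzeros(At)

println("CSC: ", (colPtr, rowVal, nzValCSC))
println("COO: ", (rowInd, colInd, nzValCOO))
println("CSR: ", (rowPtr, colVal, nzValCSR))
```

In every format the number of stored entries (`nnz`) is the length of the value array, and `dims` is `size(A)`.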
- Constructors: from component arrays (`MyCSR(rowPtr, colVal, nzVal, dims)`), between formats (`MyCSR(::MyCOO)`, …), and to/from host `SparseArrays` (`MyCSC(::SparseMatrixCSC)`, `SparseMatrixCSC(::MyCSC)`).
- `undef` constructors: `MyCSC{Tv,Ti}(undef, dims)` / `MyVec{Tv,Ti}(undef, n)`, building a structurally-empty array (no stored entries), mirroring dense `Array{T}(undef, dims)` and `SparseArrays`' `SparseMatrixCSC{Tv,Ti}(undef, m, n)`. This is the empty-of-a-shape allocation primitive. Note there is no uninitialized-structure analogue: for a sparse array `undef` means empty, exactly as in `SparseArrays`. Implementing these through a `spzeros(Tv, Ti, dims…; fmt=…)` helper (the value-level analogue of `SparseArrays.spzeros`, with a format selector) is recommended, since it also serves as a convenient public, format-polymorphic entry point. `spzeros` itself is not mandated, however, because its signature is back-end-flavored (format symbols, storage modes) whereas the `undef` constructor is uniform.
- `Base.similar`: structure-preserving (`similar(A)`, `similar(A, ::Type)`) and empty-of-a-shape (`similar(A, ::Type, dims)`), as for dense arrays; generic code allocates its outputs through `similar`, never by naming a type. The empty-of-a-shape form just delegates to the `undef` constructor (threading the source's storage mode), so the constructor is the real primitive.
- Format-conversion hooks `GPUArrays.coo_type`/`csr_type`/`csc_type`: map any of your sparse-matrix types to the type of the named sibling format (`coo_type(::Type{<:MyCSC}) = MyCOO`); generic code converts with `coo_type(A)(A)`. These are type-level hooks rather than plain `convert(Dest, A)` because a format is the wrapper's identity (distinct structs), not a type parameter. Unlike an eltype change, there is therefore no generic wrapper→sibling-wrapper operation, and only the back-end knows its sibling types. The cross-format `convert` methods above are the engine the resulting constructors route through; the identity case (`coo_type(coo)(coo)`) is your identity constructor.
- `KernelAbstractions.get_backend` for the sparse types (usually `get_backend(nonzeros(A))`).
- `Adapt.adapt_structure` converting each host struct to its device counterpart (`GPUArrays.GPUSparseDeviceVector`, `GPUSparseDeviceMatrixCSC`/`CSR`/`COO`), so the generic kernels can consume it inside `@kernel`s.
- `GPUArrays._sptranspose`/`_spadjoint`: materialize a (conjugate) transpose; used by `kron`/`triu`/`tril` on lazily wrapped operands.
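The constructor and hook pattern in the list above can be sketched on the host without GPUArrays. Everything named `Toy…` below is hypothetical, and `coo_type` here is a local stand-in for the real `GPUArrays.coo_type` hook; a real back-end would subtype the `AbstractGPUSparse…` types and store its own dense vector type in the fields.

```julia
using SparseArrays

# Hypothetical stand-ins for a back-end's CSC and COO types, using the
# conventional field names.
mutable struct ToyCSC{Tv,Ti}
    colPtr::Vector{Ti}
    rowVal::Vector{Ti}
    nzVal::Vector{Tv}
    dims::NTuple{2,Int}
    nnz::Int
end

mutable struct ToyCOO{Tv,Ti}
    rowInd::Vector{Ti}
    colInd::Vector{Ti}
    nzVal::Vector{Tv}
    dims::NTuple{2,Int}
    nnz::Int
end

# `undef` constructor: structurally empty, no stored entries.
ToyCSC{Tv,Ti}(::UndefInitializer, dims::NTuple{2,Int}) where {Tv,Ti} =
    ToyCSC{Tv,Ti}(ones(Ti, dims[2] + 1), Ti[], Tv[], dims, 0)

# To and from the host SparseArrays type.
ToyCSC(A::SparseMatrixCSC{Tv,Ti}) where {Tv,Ti} =
    ToyCSC{Tv,Ti}(copy(SparseArrays.getcolptr(A)), copy(rowvals(A)),
                  copy(nonzeros(A)), size(A), nnz(A))
SparseArrays.SparseMatrixCSC(A::ToyCSC) =
    SparseMatrixCSC(A.dims..., A.colPtr, A.rowVal, A.nzVal)

# Cross-format constructor: expand column pointers into column indices.
function ToyCOO(A::ToyCSC{Tv,Ti}) where {Tv,Ti}
    colInd = Vector{Ti}(undef, A.nnz)
    for j in 1:A.dims[2], k in A.colPtr[j]:(A.colPtr[j + 1] - 1)
        colInd[k] = j
    end
    return ToyCOO{Tv,Ti}(copy(A.rowVal), colInd, copy(A.nzVal), A.dims, A.nnz)
end
ToyCOO(A::ToyCOO) = A  # identity constructor

# Local stand-in for the type-level hook: map any sibling type to the COO type.
coo_type(::Type{<:ToyCSC}) = ToyCOO
coo_type(::Type{<:ToyCOO}) = ToyCOO
coo_type(A) = coo_type(typeof(A))

host = sparse([1, 2, 2, 3], [1, 2, 3, 3], [10.0, 20.0, 30.0, 40.0], 3, 3)
csc = ToyCSC(host)
coo = coo_type(csc)(csc)                    # generic code never names ToyCOO
empty_csc = ToyCSC{Float64,Int}(undef, (3, 4))
```

Generic code written against `coo_type(A)(A)` works for any back-end that defines the hook and the matching constructors, which is the reason the hook is keyed on the type rather than on a value-level `convert`.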
SparseArrays' accessors (nnz, nonzeros, nonzeroinds, rowvals, getcolptr) come
for free from the field names. Dense↔sparse conversion is generic and on-device:
to_sparse(::Type{ST}, dense) scans into a sparse array (ST a vector or COO type;
CSR/CSC follow via the verbs) and to_dense(A) scatters back to a dense array of the
back-end — so a back-end's MyArray(::MySparse…) and dense→sparse constructors can simply
call them.
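The scan-then-scatter idea behind a dense to sparse conversion can be illustrated on the host. This is only a sketch of the general technique, not GPUArrays' implementation: `dense_to_coo` and `coo_to_dense` are hypothetical names, and a plain loop stands in for what would be one independent thread per element on a device.

```julia
# Dense -> COO: flag stored entries, prefix-sum the flags to find each
# entry's output slot, then write every entry independently.
function dense_to_coo(A::AbstractMatrix{T}) where {T}
    mask = vec(A .!= zero(T))        # true where an entry is stored (column-major)
    pos = cumsum(mask)               # inclusive scan: output slot of each stored entry
    n = isempty(pos) ? 0 : pos[end]  # total number of stored entries
    rowInd = Vector{Int}(undef, n)
    colInd = Vector{Int}(undef, n)
    nzVal = Vector{T}(undef, n)
    m = size(A, 1)
    for lin in eachindex(mask)       # iterations do not depend on each other
        if mask[lin]
            k = pos[lin]
            rowInd[k] = (lin - 1) % m + 1
            colInd[k] = (lin - 1) ÷ m + 1
            nzVal[k] = A[lin]
        end
    end
    return rowInd, colInd, nzVal, size(A)
end

# COO -> dense: allocate zeros and scatter the stored entries back.
function coo_to_dense(rowInd, colInd, nzVal::AbstractVector{T}, dims) where {T}
    A = zeros(T, dims)
    for k in eachindex(nzVal)
        A[rowInd[k], colInd[k]] = nzVal[k]
    end
    return A
end

D = [10.0 0 0; 0 20 30; 0 0 40]
I, J, V, dims = dense_to_coo(D)
R = coo_to_dense(I, J, V, dims)
```

Because the scan fixes every output position up front, no two writes target the same slot, which is what makes the pattern suitable for parallel execution.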
The generic functionality a back-end then gets for free covers broadcasting; `mapreduce` and reductions (`sum`, `norm`, `opnorm`); sparse–dense and
sparse–vector multiplication (`*`, `mul!`, including transposed/adjoint operands);
`findnz`, `triu`/`tril`/`kron`/`reshape`/`droptol!`; `iszero`/`issymmetric`/`ishermitian`;
scalar and slice indexing; `copy`/`copyto!`/`collect`/`Array`; and conversion between
formats and to/from dense.
GPUArrays.@cached
GPUArrays.@uncached