Commit 63908f9

Sébastien Loisel authored and committed

Add SparseMatrixCSR type alias for CSR storage format

- Add SparseMatrixCSR{Tv,Ti} as alias for Transpose{Tv, SparseMatrixCSC{Tv,Ti}}
- Add SparseMatrixCSR(::SparseMatrixCSC) constructor for CSC→CSR conversion
- Add SparseMatrixCSC(::SparseMatrixCSR) constructor for CSR→CSC conversion
- Update SparseMatrixMPI to use SparseMatrixCSR{T,Int} for the A field
- Update SparseMatrixMPI_local signature to use SparseMatrixCSR
- Optimize triplet-to-CSR construction by building M^T directly (swap I↔J)
- Add comprehensive documentation explaining the dual life of Transpose{SparseMatrixCSC}
- Update CLAUDE.md, api.md, and getting-started.md with CSR explanations

1 parent 66ca717

6 files changed: 214 additions & 66 deletions

CLAUDE.md (23 additions, 4 deletions)

````diff
@@ -40,12 +40,23 @@ Many operations in this module are collective and should not be run on a subset
 ### Core Data Structures
 
+**SparseMatrixCSR{T,Ti}** (Type Alias)
+- `SparseMatrixCSR{T,Ti} = Transpose{T, SparseMatrixCSC{T,Ti}}` - type alias for CSR storage
+- In Julia, `Transpose{SparseMatrixCSC}` has a **dual life**:
+  - **Semantic view**: a lazy transpose of a CSC matrix (what `transpose(A)` returns)
+  - **Storage view**: row-major (CSR) access to sparse data
+- Use `SparseMatrixCSR` when the intent is CSR storage; use `transpose(A)` for the mathematical transpose
+- `SparseMatrixCSR(A::SparseMatrixCSC)` converts CSC to CSR representing the **same** matrix
+- `SparseMatrixCSC(A::SparseMatrixCSR)` converts CSR to CSC representing the **same** matrix
+- For `B = SparseMatrixCSR(A)`, `B[i,j] == A[i,j]` (same matrix, different storage)
+
 **SparseMatrixMPI{T}**
 - Rows are partitioned across MPI ranks
-- `A::Transpose{T,SparseMatrixCSC{T,Int}}`: Local rows wrapped in a Transpose for type clarity
+- `A::SparseMatrixCSR{T,Int}`: Local rows in CSR format for efficient row-wise iteration
 - `A.parent` is the underlying CSC storage with shape `(length(col_indices), local_nrows)`
-- Columns in `A.parent` correspond to local rows; this layout enables efficient row-wise iteration
-- Storage is **compressed**: `A.parent.rowval` uses local column indices (1:length(col_indices)), not global
+- `A.parent.colptr` acts as row pointers for the CSR format
+- `A.parent.rowval` contains LOCAL column indices (1:length(col_indices)), not global
+- Storage is **compressed** to avoid hypersparse issues
 - `row_partition`: Array of size `nranks + 1` defining which rows each rank owns (1-indexed boundaries)
 - `col_partition`: Array of size `nranks + 1` defining column partition (used for transpose operations)
 - `col_indices`: Sorted global column indices that appear in the local part (local→global mapping)
@@ -100,11 +111,19 @@ x = F \ b
 For efficient construction when data is already distributed:
 - `VectorMPI_local(v_local)`: Create from local vector portion
-- `SparseMatrixMPI_local(transpose(AT_local))`: Create from local rows
+- `SparseMatrixMPI_local(SparseMatrixCSR(local_csc))`: Create from local rows in CSR format
+- `SparseMatrixMPI_local(transpose(AT_local))`: Alternative using explicit transpose wrapper
 - `MatrixMPI_local(A_local)`: Create from local dense rows
 
 These infer the global partition via MPI.Allgather of local sizes.
 
+When building from triplets (I, J, V), the efficient pattern is to build M^T directly as CSC by swapping indices, then wrap it in a lazy transpose for CSR:
+```julia
+AT_local = sparse(local_J, local_I, local_V, ncols, local_nrows)  # M^T as CSC
+SparseMatrixMPI_local(transpose(AT_local))  # M in CSR format
+```
+This avoids an unnecessary physical transpose operation.
+
 ### Matrix Multiplication Flow
 
 1. **Plan creation** (`MatrixPlan` constructor): Uses `Alltoall` and `Alltoallv` to exchange row requests and sparse structure (colptr, rowval)
````
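The triplet→CSR pattern documented in this hunk can be checked with the stdlib SparseArrays alone. This is a sketch with made-up triplet data (`Is`, `Js`, `Vs`, `nrows`, `ncols` are hypothetical), and it stops short of calling `SparseMatrixMPI_local`, which requires the package and an MPI session:

```julia
using SparseArrays

# Hypothetical triplets for a 3x4 matrix M
Is, Js, Vs = [1, 2, 2, 3], [1, 1, 3, 4], [1.0, 2.0, 3.0, 4.0]
nrows, ncols = 3, 4

# Build M^T directly as CSC by swapping the row/column index vectors
AT_local = sparse(Js, Is, Vs, ncols, nrows)  # stores M^T column-major

# Wrap in a lazy transpose: M itself, now with row-major (CSR-style) access
M = transpose(AT_local)

# Same matrix as the straightforward construction, no physical transpose
@assert M == sparse(Is, Js, Vs, nrows, ncols)
```

The point of the swap is that `sparse` always compresses by column, so building the transpose and wrapping it yields row compression of M for free.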

docs/src/api.md (46 additions, 0 deletions)

````diff
@@ -22,6 +22,52 @@ MatrixMPI
 VectorMPI
 ```
 
+### SparseMatrixCSR
+
+```@docs
+SparseMatrixCSR
+```
+
+## CSR Storage Format
+
+LinearAlgebraMPI uses CSR (Compressed Sparse Row) format internally for `SparseMatrixMPI` because row-partitioned distributed matrices need efficient row-wise access.
+
+### The Dual Life of Transpose{SparseMatrixCSC}
+
+In Julia, the type `Transpose{T, SparseMatrixCSC{T,Int}}` has two interpretations:
+
+1. **Semantic**: a lazy transpose of a CSC matrix (what you get from `transpose(A)`)
+2. **Storage**: row-major (CSR) access to sparse data
+
+This duality can be confusing. When you call `transpose(A)` on a SparseMatrixCSC, you get a wrapper that represents A^T. But the same wrapper type, when used for storage, provides efficient row iteration.
+
+### The SparseMatrixCSR Type Alias
+
+To clarify intent, LinearAlgebraMPI exports:
+
+```julia
+const SparseMatrixCSR{Tv,Ti} = Transpose{Tv, SparseMatrixCSC{Tv,Ti}}
+```
+
+Use `SparseMatrixCSR` when you want row-major storage, and `transpose(A)` when you want the mathematical transpose.
+
+### Converting Between CSC and CSR
+
+```julia
+# CSC to CSR (same matrix, different storage)
+A_csc = sparse([1,2,2], [1,1,2], [1.0, 2.0, 3.0], 2, 2)
+A_csr = SparseMatrixCSR(A_csc)
+A_csr[1,1] == A_csc[1,1]  # true
+
+# CSR to CSC
+A_back = SparseMatrixCSC(A_csr)
+A_back == A_csc  # true
+```
+
+### Why CSR for Distributed Matrices?
+
+`SparseMatrixMPI` partitions matrices by rows across MPI ranks. Each rank needs to iterate efficiently over its local rows for operations like matrix-vector multiplication. CSR format provides O(1) access to each row's nonzeros, while CSC would require scanning the entire column pointer array.
+
 ## Sparse Matrix Operations
 
 ### Arithmetic
````
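The conversions documented in this hunk can be reproduced without the package by restating the alias and writing the conversions as plain helpers. `to_csr` and `to_csc` are hypothetical names used here so the snippet runs standalone; in the package, the conversions are the `SparseMatrixCSR` and `SparseMatrixCSC` constructors themselves:

```julia
using SparseArrays, LinearAlgebra

# Alias as added by this commit, restated so the snippet is self-contained
const SparseMatrixCSR{Tv,Ti} = Transpose{Tv, SparseMatrixCSC{Tv,Ti}}

# Hypothetical stand-ins for the package's constructors
to_csr(A::SparseMatrixCSC) = transpose(sparse(transpose(A)))  # materialize A^T, wrap lazily
to_csc(A::SparseMatrixCSR) = sparse(transpose(A.parent))      # materialize back to CSC

A_csc = sparse([1,2,2], [1,1,2], [1.0, 2.0, 3.0], 2, 2)
A_csr = to_csr(A_csc)

@assert A_csr isa SparseMatrixCSR          # the wrapper matches the alias
@assert A_csr[1,1] == A_csc[1,1]           # same matrix, different storage
@assert to_csc(A_csr) == A_csc             # round trip recovers the CSC matrix
```

Note that indexing the wrapper goes through `Transpose`'s `getindex`, so element access is identical in both representations; only the compression axis differs.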

docs/src/getting-started.md (10 additions, 0 deletions)

````diff
@@ -54,6 +54,16 @@ The matrix is partitioned roughly equally by rows. For example, with 4 ranks and
 - Rank 2: rows 51-75
 - Rank 3: rows 76-100
 
+### Internal Storage: CSR Format
+
+Internally, each rank stores its local rows in CSR (Compressed Sparse Row) format using the `SparseMatrixCSR` type. This enables efficient row-wise iteration, which is essential for a row-partitioned distributed matrix.
+
+In Julia, `SparseMatrixCSR{T,Ti}` is a type alias for `Transpose{T, SparseMatrixCSC{T,Ti}}`. This type has a dual interpretation:
+- **Semantic view**: a lazy transpose of a CSC matrix
+- **Storage view**: row-major (CSR) access to the data
+
+You don't need to worry about this for normal usage; it is handled automatically. But if you access the internal storage (e.g., `A.A.parent`), be aware that it stores the transposed data in CSC format, which provides CSR access through the wrapper.
+
 ### Efficient Local-Only Construction
 
 For large matrices, you can avoid replicating data across all ranks by only populating each rank's local portion:
````
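For readers who do poke at the internals, here is what the wrapper's storage looks like with plain SparseArrays (a sketch with a made-up 2×3 matrix; in the package, the same layout sits behind `A.A.parent`):

```julia
using SparseArrays

# A small matrix and its CSR-style wrapper
M = sparse([1, 1, 2], [1, 3, 2], [1.0, 2.0, 3.0], 2, 3)
B = transpose(sparse(transpose(M)))  # same matrix, row-major storage

# The parent CSC stores M^T, so its column pointers index M's rows
rowptr = B.parent.colptr
@assert rowptr == [1, 3, 4]          # row 1 has 2 nonzeros, row 2 has 1

# Nonzeros of row 1 of M, found directly via the row pointers
vals = B.parent.nzval[rowptr[1]:rowptr[2]-1]
@assert vals == [1.0, 2.0]
```

Each row's nonzeros occupy a contiguous slice of `nzval`, which is exactly what makes row-wise iteration cheap.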

src/LinearAlgebraMPI.jl (83 additions, 0 deletions)

````diff
@@ -9,6 +9,7 @@ import LinearAlgebra
 import LinearAlgebra: tr, diag, triu, tril, Transpose, Adjoint, norm, opnorm, mul!, ldlt, BLAS, issymmetric, UniformScaling, dot
 
 export SparseMatrixMPI, MatrixMPI, VectorMPI, clear_plan_cache!, uniform_partition
+export SparseMatrixCSR  # Type alias for Transpose{SparseMatrixCSC} (CSR storage format)
 export # Multithreaded sparse matrix multiplication
 export VectorMPI_local, MatrixMPI_local, SparseMatrixMPI_local  # Local constructors
 export mean  # Our mean function for SparseMatrixMPI and VectorMPI
@@ -21,6 +22,88 @@ export solve, solve!, finalize!
 const Blake3Hash = NTuple{32,UInt8}
 const OptionalBlake3Hash = Union{Nothing, Blake3Hash}
 
+# ============================================================================
+# SparseMatrixCSR Type Alias and Constructors
+# ============================================================================
+
+"""
+    SparseMatrixCSR{Tv,Ti} = Transpose{Tv, SparseMatrixCSC{Tv,Ti}}
+
+Type alias for CSR (Compressed Sparse Row) storage format.
+
+## The Dual Life of Transpose{SparseMatrixCSC}
+
+In Julia, the type `Transpose{Tv, SparseMatrixCSC{Tv,Ti}}` has two interpretations:
+
+1. **Semantic interpretation**: A lazy transpose wrapper around a CSC matrix.
+   When you call `transpose(A)` on a SparseMatrixCSC, you get this wrapper, which
+   represents A^T without copying data.
+
+2. **Storage interpretation**: CSR (row-major) access to sparse data.
+   The underlying CSC stores columns contiguously, but through the transpose wrapper
+   we can iterate efficiently over rows instead of columns.
+
+This alias clarifies intent: use `SparseMatrixCSR` when you want row-major storage
+semantics, and `transpose(A)` when you want the mathematical transpose.
+
+## CSR vs CSC Storage
+
+- **CSC (Compressed Sparse Column)**: Julia's native sparse format. Efficient for
+  column-wise operations and matrix-vector products with column access.
+- **CSR (Compressed Sparse Row)**: Efficient for row-wise operations, matrix-vector
+  products with row access, and row-partitioned distributed matrices.
+
+For `SparseMatrixCSR`, the underlying `parent::SparseMatrixCSC` stores the *transposed*
+matrix. If `B = SparseMatrixCSR(A)` represents matrix M, then `B.parent` is a CSC
+storing M^T. This means:
+- `B.parent.colptr` acts as row pointers for M
+- `B.parent.rowval` contains column indices for M
+- `B.parent.nzval` contains values in row-major order
+
+## Usage Note
+
+Julia will still display this type as `Transpose{Float64, SparseMatrixCSC{...}}`,
+not as `SparseMatrixCSR`. The alias improves code clarity but doesn't affect
+type printing.
+"""
+const SparseMatrixCSR{Tv,Ti} = Transpose{Tv, SparseMatrixCSC{Tv,Ti}}
+
+"""
+    SparseMatrixCSR(A::SparseMatrixCSC{Tv,Ti}) where {Tv,Ti}
+
+Convert a CSC matrix to CSR format representing the **same** matrix.
+
+If A represents matrix M in CSC format, the result represents M in CSR format.
+Element access is unchanged: `B[i,j] == A[i,j]`.
+
+Internally, this:
+1. Materializes A^T as CSC (physical transpose)
+2. Wraps it in a lazy transpose to get M back, but with row-major storage
+
+# Example
+```julia
+A_csc = sparse([1,2,2], [1,1,2], [1.0, 2.0, 3.0], 2, 2)
+A_csr = SparseMatrixCSR(A_csc)  # Same matrix, CSR storage
+A_csr[1,1] == A_csc[1,1]  # true - same elements
+```
+"""
+function SparseMatrixCSR(A::SparseMatrixCSC{Tv,Ti}) where {Tv,Ti}
+    return transpose(SparseMatrixCSC(transpose(A)))
+end
+
+"""
+    SparseMatrixCSC(A::SparseMatrixCSR{Tv,Ti}) where {Tv,Ti}
+
+Convert a CSR matrix to CSC format representing the **same** matrix.
+
+This physically transposes the underlying storage to produce a CSC matrix;
+the result represents the same matrix as the input.
+"""
+function SparseArrays.SparseMatrixCSC(A::SparseMatrixCSR{Tv,Ti}) where {Tv,Ti}
+    # Use sparse() to avoid dispatching back to our method
+    return sparse(transpose(A.parent))
+end
+
 # Cache for memoized MatrixPlans
 # Key: (A_hash, B_hash, T) - use full 256-bit hashes
 const _plan_cache = Dict{Tuple{Blake3Hash,Blake3Hash,DataType},Any}()
````
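The docstring's claim that `B.parent.nzval` holds values in row-major order can be seen on a tiny example. Stdlib only; `csr` below is a hypothetical stand-in for the `SparseMatrixCSR` constructor, using `sparse` to materialize the transpose:

```julia
using SparseArrays

# Sketch of the conversion the commit adds, with a hypothetical helper name
csr(A::SparseMatrixCSC) = transpose(sparse(transpose(A)))

A = sparse([1, 1, 2], [1, 2, 1], [10.0, 20.0, 30.0], 2, 2)
B = csr(A)

# CSC stores column-by-column; the CSR wrapper's parent stores row-by-row
@assert A.nzval == [10.0, 30.0, 20.0]         # columns: (1,1), (2,1), (1,2)
@assert B.parent.nzval == [10.0, 20.0, 30.0]  # rows:    (1,1), (1,2), (2,1)
@assert B == A                                # same matrix either way
```

The same values appear in both buffers; only the traversal order that makes them contiguous changes.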

src/blocks.jl (10 additions, 14 deletions)

````diff
@@ -136,13 +136,11 @@ function Base.cat(As::SparseMatrixMPI{T}...; dims) where T
         end
     end
 
-    # Step 3: Build local sparse matrix
-    if isempty(local_I)
-        AT_local = SparseMatrixCSC(total_cols, local_nrows, ones(Int, local_nrows + 1), Int[], T[])
-    else
-        local_sparse = sparse(local_I, local_J, local_V, local_nrows, total_cols)
-        AT_local = sparse(transpose(local_sparse))
-    end
+    # Step 3: Build M^T directly as CSC (swap I↔J), then wrap in lazy transpose for CSR
+    # This avoids an unnecessary physical transpose operation
+    AT_local = isempty(local_I) ?
+        SparseMatrixCSC(total_cols, local_nrows, ones(Int, local_nrows + 1), Int[], T[]) :
+        sparse(local_J, local_I, local_V, total_cols, local_nrows)
 
     return SparseMatrixMPI_local(transpose(AT_local); comm=comm)
 end
@@ -509,13 +507,11 @@ function blockdiag(As::SparseMatrixMPI{T}...) where T
         end
     end
 
-    # Step 4: Build local sparse matrix
-    if isempty(local_I)
-        AT_local = SparseMatrixCSC(total_cols, local_nrows, ones(Int, local_nrows + 1), Int[], T[])
-    else
-        local_sparse = sparse(local_I, local_J, local_V, local_nrows, total_cols)
-        AT_local = sparse(transpose(local_sparse))
-    end
+    # Step 4: Build M^T directly as CSC (swap I↔J), then wrap in lazy transpose for CSR
+    # This avoids an unnecessary physical transpose operation
+    AT_local = isempty(local_I) ?
+        SparseMatrixCSC(total_cols, local_nrows, ones(Int, local_nrows + 1), Int[], T[]) :
+        sparse(local_J, local_I, local_V, total_cols, local_nrows)
 
     return SparseMatrixMPI_local(transpose(AT_local); comm=comm)
 end
````
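The rewritten ternary can be exercised standalone. The empty branch builds an all-zero `total_cols × local_nrows` CSC from raw `colptr`/`rowval`/`nzval` buffers (variable names copied from the diff; the sizes and triplet data are hypothetical):

```julia
using SparseArrays

T = Float64
total_cols, local_nrows = 5, 3
local_I, local_J, local_V = Int[], Int[], T[]

# Empty-triplet branch: colptr of all ones means every column is empty
AT_local = isempty(local_I) ?
    SparseMatrixCSC(total_cols, local_nrows, ones(Int, local_nrows + 1), Int[], T[]) :
    sparse(local_J, local_I, local_V, total_cols, local_nrows)

@assert size(AT_local) == (total_cols, local_nrows)
@assert nnz(AT_local) == 0
@assert size(transpose(AT_local)) == (local_nrows, total_cols)  # M is 3x5

# Non-empty branch for comparison: one triplet, indices swapped to build M^T
AT2 = sparse([2], [1], [7.0], total_cols, local_nrows)  # J, I swapped
@assert transpose(AT2)[1, 2] == 7.0                     # M[1,2] recovered
```

The direct `SparseMatrixCSC(m, n, colptr, rowval, nzval)` constructor is used for the empty case because `sparse` with empty triplets would need explicit dimensions anyway, and the all-ones `colptr` encodes "no nonzeros in any column" exactly.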
