Add cs_cholesky wrapper and drop module-level docstring (#1)

ChrisRackauckas-Claude · ChrisRackauckas · web-flow · commit b8f2ee8ec1a1 · 2026-05-27T08:02:44.000Z
Wraps the third CXSparse direct factorization, cs_*_schol + cs_*_chol,
for all four (Tv, Ti) ∈ {Float64, ComplexF64} × {Int32, Int64} combos,
exported as `cs_cholesky` returning a `CSCholesky`. Solve via `\` /
`ldiv!`; finalizer frees the symbolic + numeric C-side structs as with
CSQR / CSLU. The solve path is the textbook
`x = P' * (L' \ (L \ (P * b)))`, using new `_cs_ltsolve!` / `_cs_pvec!`
ccall shims and the existing `_cs_lsolve!` / `_cs_ipvec!` ones.
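
A dense sketch of that solve path, for illustration only. It assumes
`LinearAlgebra.cholesky` as a stand-in for `cs_*_chol` and a hand-picked
permutation vector `p` (hypothetical, not the AMD ordering), and it does
not call any of the `_cs_*` shims:

```julia
using LinearAlgebra

# Small SPD matrix and right-hand side (same values as the README quick
# start, with a larger leading diagonal entry).
A = [4.0 1.0 0.0;
     1.0 3.0 1.0;
     0.0 1.0 2.0]
b = [1.0, 2.0, 3.0]

# Hypothetical fill-reducing permutation: (P * v)[i] == v[p[i]].
p = [3, 1, 2]

# Factor the permuted matrix, so that L * L' == P * A * P'.
L = cholesky(Symmetric(A[p, p])).L

y = L \ b[p]      # forward solve on P * b
z = L' \ y        # back solve with L'
x = similar(b)
x[p] = z          # scatter back: x = P' * z
```

Indexing with `p` plays the role of `P * v`, and the scatter `x[p] = z`
applies `P'`; the residual `A * x - b` should then be at rounding level.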

Cholesky tests cover real SPD and complex Hermitian PD inputs at
n ∈ {5, 25, 100} over both index types, plus rejects-non-square,
rejects-non-PD (cs_chol returns NULL on the first non-positive pivot),
and dimension-mismatch checks. The finalizer GC stress test now
exercises the Cholesky path too.
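
The test matrices are built as `M*M' + n*I`, which is positive definite
because `M*M'` is positive semidefinite and the shift moves every
eigenvalue up by `n`. A minimal sketch of that construction using only
the standard library (the seed and size here are arbitrary choices, not
the ones in the test suite):

```julia
using LinearAlgebra, Random

Random.seed!(1)
n = 5

# Complex case: M * M' is Hermitian positive semidefinite.
M = complex.(randn(n, n), randn(n, n))
A = M * M' + n * I

# The smallest eigenvalue is at least n (up to rounding), so A is PD.
λmin = eigmin(Hermitian(A))
ok = isposdef(Hermitian(A))
```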

Also drops the module-level docstring on `CXSparse`; the README is now
the single source for the overview, and the README has been updated to
reflect the new solver, add a Cholesky quick-start example, and add a
CHOLMOD column to the comparison table.

Co-authored-by: ChrisRackauckas-Claude <accounts@chrisrackauckas.com>
diff --git a/README.md b/README.md
@@ -2,40 +2,48 @@
 
 Julia wrapper around the [CXSparse](https://github.com/DrTimothyAldenDavis/SuiteSparse/tree/dev/CXSparse) library from SuiteSparse — the lightweight, textbook-style sparse direct solvers from Tim Davis's *Direct Methods for Sparse Linear Systems* (CSparse), extended to support `ComplexF64` values and both 32- and 64-bit indices.
 
-CXSparse is the QR / LU counterpart to **KLU's design philosophy**: a small symbolic phase, very little per-call overhead, well-suited to small-to-medium sparse problems (n up to a few thousand). It complements the multifrontal solvers (UMFPACK, SPQR, CHOLMOD) which pay a heavier symbolic tax in exchange for BLAS-3 speedups on larger fronts.
+CXSparse is the QR / LU / Cholesky counterpart to **KLU's design philosophy**: a small symbolic phase, very little per-call overhead, well-suited to small-to-medium sparse problems (n up to a few thousand). It complements the multifrontal solvers (UMFPACK, SPQR, CHOLMOD) which pay a heavier symbolic tax in exchange for BLAS-3 speedups on larger fronts.
 
-This package wraps the `cs_qr` and `cs_lu` factorizations from `CXSparse_jll` and exposes a Julian API.
+This package wraps the three direct factorizations CXSparse provides — `cs_qr`, `cs_lu`, and `cs_chol` — and exposes a Julian `\` / `ldiv!` API on top.
 
 ## Status
 
-- [x] `cs_qr` for `SparseMatrixCSC{Float64, Int32 | Int64}` and `SparseMatrixCSC{ComplexF64, Int32 | Int64}`
-- [x] `cs_lu` for the same four element/index combinations
-- [x] `LinearAlgebra.ldiv!` and `\` for both factorizations
-- [x] Finalizer-managed C-side memory (no manual `cs_*_sfree` calls required)
-- [ ] `cs_cholsol` for symmetric positive definite — not yet wrapped
+- [x] `cs_qr` — QR factorization (square or overdetermined `m ≥ n`)
+- [x] `cs_lu` — LU factorization (square)
+- [x] `cs_cholesky` — Cholesky factorization (square symmetric/Hermitian positive definite)
+- [x] All four CXSparse element/index combinations: `Float64` / `ComplexF64` × `Int32` / `Int64`
+- [x] `LinearAlgebra.ldiv!` and `\` on every factorization type
+- [x] Finalizer-managed C-side memory (no manual `cs_*_sfree` / `cs_*_nfree` calls)
 
 ## Quick start
 
 ```julia
-using CXSparse, SparseArrays
+using CXSparse, SparseArrays, LinearAlgebra
 
 A = sparse([2.0  1.0  0.0;
             1.0  3.0  1.0;
             0.0  1.0  2.0])
 b = [1.0, 2.0, 3.0]
 
-# QR factorization (works for square or overdetermined m ≥ n):
+# QR factorization (square or overdetermined m ≥ n):
 F = cs_qr(A)
 x = F \ b
-# or in-place:
+# or in-place into a preallocated buffer:
 x_pre = zeros(3)
 ldiv!(x_pre, F, b)
 
-# LU factorization (square only):
+# LU factorization (square):
 G = cs_lu(A)
 y = G \ b
+
+# Cholesky factorization (square symmetric/Hermitian positive definite —
+# only the upper triangle of A is read):
+H = cs_cholesky(A)
+z = H \ b
 ```
 
+The factorization objects (`CSQR`, `CSLU`, `CSCholesky`) own all C-side memory and free it via a finalizer; you can also call `finalize(F)` explicitly to release it eagerly.
+
 ## Supported element/index combinations
 
 | element type   | index type | CXSparse name |
@@ -45,22 +53,24 @@ y = G \ b
 | `ComplexF64`   | `Int32`    | `cs_ci_*`     |
 | `ComplexF64`   | `Int64`    | `cs_cl_*`     |
 
-CXSparse does **not** ship `Float32` variants — `cs_si_*` / `cs_sl_*` don't exist upstream.
+CXSparse does **not** ship `Float32` variants: `cs_si_*` / `cs_sl_*` don't exist upstream. For complex inputs, `cs_cholesky` computes the Hermitian factorization (`L * L' = P * A * P'` for `A = A'`, where `P` is the fill-reducing permutation).
 
-## When to use this vs SPQR / UMFPACK
+## When to use this vs SPQR / UMFPACK / CHOLMOD
 
-CXSparse `cs_qr` and `cs_lu` are textbook implementations: simple, low symbolic-phase overhead, no BLAS-3. They sit alongside KLU in the "small/medium fast path" niche:
+CXSparse's `cs_qr`, `cs_lu`, and `cs_chol` are textbook implementations: simple, low symbolic-phase overhead, no BLAS-3. They sit alongside KLU in the "small/medium fast path" niche:
 
-| | KLU | UMFPACK | SPQR | CXSparse `cs_qr` / `cs_lu` |
-|---|---|---|---|---|
-| factorization | LU | LU | QR | LU / QR |
-| algorithm | BTF + Gilbert–Peierls partial-pivot LU | multifrontal BLAS-3 LU | multifrontal BLAS-3 QR | textbook Householder QR / Gilbert–Peierls LU |
-| sweet spot | n ≲ few thousand | n ≳ 1k | n ≳ 1k, least-squares | n ≲ few thousand |
-| pivoting | partial | row + column | sparsity-only (rank-revealing if `tol ≥ 0`) | partial (LU), sparsity-only (QR) |
-| BLAS-3 | no | yes | yes | no |
+| | KLU | UMFPACK | SPQR | CHOLMOD | CXSparse `cs_qr` / `cs_lu` / `cs_cholesky` |
+|---|---|---|---|---|---|
+| factorization | LU | LU | QR | Cholesky | LU / QR / Cholesky |
+| algorithm | BTF + Gilbert–Peierls partial-pivot LU | multifrontal BLAS-3 LU | multifrontal BLAS-3 QR | supernodal BLAS-3 Cholesky | textbook Householder QR / Gilbert–Peierls LU / up-looking Cholesky |
+| sweet spot | n ≲ few thousand | n ≳ 1k | n ≳ 1k, least-squares | n ≳ 1k SPD | n ≲ few thousand |
+| pivoting | partial | row + column | sparsity-only (rank-revealing if `tol ≥ 0`) | symmetric (no numerical pivoting) | partial (LU), sparsity-only (QR), symmetric (Cholesky) |
+| BLAS-3 | no | yes | yes | yes | no |
 
 On 199×199 sparse matrices we observed `cs_qr` running in ~325 µs vs SPQR's ~570 µs (1.7× faster). For rank-deficient inputs, however, `cs_qr` is **not rank-revealing** — `x` may contain non-finite entries from the back-solve dividing through near-zero `R` diagonals. If you need a clean solution on rank-deficient systems, fall back to `qr(A)` (SPQR) or LAPACK's column-pivoted QR on `Matrix(A)`.
 
+`cs_cholesky` throws an error if the matrix is not positive definite (CXSparse aborts the factorization as soon as it hits a non-positive pivot). For semi-definite or indefinite symmetric systems, use `LDLFactorizations.jl` or fall back to `cs_lu`.
+
 ## Internals
 
 CXSparse uses 0-based indexing. The wrapper allocates 0-based `colptr` / `rowval` buffers when constructing the `cs_*_sparse` view; `nzval` is shared with the user's `SparseMatrixCSC` directly (CXSparse doesn't mutate it during factorization of a CSC matrix).
diff --git a/src/CXSparse.jl b/src/CXSparse.jl
@@ -1,49 +1,19 @@
-"""
-    CXSparse
-
-Julia wrapper around the CXSparse library from SuiteSparse — the lightweight,
-textbook-style sparse direct solver from Tim Davis's *Direct Methods for
-Sparse Linear Systems* (CSparse), extended to support `ComplexF64` values and
-both 32- and 64-bit indices.
-
-CXSparse is the QR / LU / Cholesky counterpart to KLU's design philosophy: a
-small symbolic-phase cost and very little per-call overhead, well-suited to
-small-to-medium sparse problems (n up to a few thousand). It complements the
-multifrontal solvers (UMFPACK, SPQR, CHOLMOD) which pay a heavier symbolic
-tax in exchange for BLAS-3 speedups on larger fronts.
-
-Supported element / index combinations:
-
-| element type   | index type | CXSparse name |
-|----------------|------------|---------------|
-| `Float64`      | `Int32`    | `cs_di_*`     |
-| `Float64`      | `Int64`    | `cs_dl_*`     |
-| `ComplexF64`   | `Int32`    | `cs_ci_*`     |
-| `ComplexF64`   | `Int64`    | `cs_cl_*`     |
-
-Public API:
-  - `cs_qr(A)` — symbolic + numeric QR factorization; returns a `CSQR`.
-  - `F \\ b`, `ldiv!(x, F, b)` — least-squares solve using a `CSQR`.
-  - `cs_lu(A)` — symbolic + numeric LU factorization; returns a `CSLU`.
-  - `F \\ b`, `ldiv!(x, F, b)` — solve using a `CSLU`.
-
-C-side memory is freed when the Julia factorization object is garbage
-collected; explicit `finalize(F)` is also supported.
-"""
 module CXSparse
 
 using CXSparse_jll: libcxsparse
 using LinearAlgebra
 using SparseArrays: SparseArrays, SparseMatrixCSC, getcolptr, rowvals, nonzeros
 
-export cs_qr, cs_lu, CSQR, CSLU
+export cs_qr, cs_lu, cs_cholesky, CSQR, CSLU, CSCholesky
 
 # CXSparse uses 0-based column orderings; pick COLAMD-ish ordering for QR.
 # The CSparse `order` argument: 0 = natural, 1 = AMD(A+A'), 2 = AMD(S'S),
 # 3 = AMD(A'A). For QR we pass 3 (AMD on A'A).
 const CS_ORDER_QR = Int32(3)
 # For LU: 2 = AMD(S'S) — Davis's recommended choice for unsymmetric LU.
 const CS_ORDER_LU = Int32(2)
+# For Cholesky: 1 = AMD(A+A') — symmetric ordering.
+const CS_ORDER_CHOL = Int32(1)
 # LU pivoting tolerance: 1.0 = strict partial pivoting; smaller values relax it.
 const CS_LU_TOL = 1.0
 
@@ -519,4 +489,151 @@ function Base.:\(F::CSLU{Tv,Ti,T_sp}, b::AbstractVector{Tv}) where {Tv,Ti,T_sp}
     return ldiv!(x, F, b)
 end
 
+# ---------------------------------------------------------------------------
+# Cholesky (symmetric/Hermitian positive definite)
+# ---------------------------------------------------------------------------
+# CXSparse's `cs_*_chol` produces a lower-triangular L with L*L' = P*A*P',
+# where P is the AMD(A+A') fill-reducing permutation stored in S->pinv.
+# Only the upper triangle of A is read; the matrix is assumed symmetric
+# (for Float64) or Hermitian (for ComplexF64).
+"""
+    CSCholesky{Tv,Ti,T_sp}
+
+Symbolic + numeric CXSparse Cholesky factorization of a square symmetric
+(real) or Hermitian (complex) positive-definite `SparseMatrixCSC{Tv,Ti}`.
+Holds opaque pointers to the C-side `cs_*s` symbolic and `cs_*n` numeric
+structs; freed by a finalizer or explicit `finalize`.
+"""
+mutable struct CSCholesky{Tv,Ti,T_sp}
+    view::_CSView{T_sp,Tv,Ti}
+    S::Ptr{Cvoid}
+    N::Ptr{Cvoid}
+    n::Int
+end
+
+for (Tv, Ti, tag) in (
+        (Float64, Int32, :di),
+        (Float64, Int64, :dl),
+        (ComplexF64, Int32, :ci),
+        (ComplexF64, Int64, :cl),
+    )
+    sparse_ty = Symbol("cs_$(tag)")
+    schol_sym = "cs_$(tag)_schol"
+    chol_sym = "cs_$(tag)_chol"
+    ltsolve_sym = "cs_$(tag)_ltsolve"
+    pvec_sym = "cs_$(tag)_pvec"
+
+    @eval begin
+        function _cs_schol(view::_CSView{$sparse_ty,$Tv,$Ti})
+            return @ccall libcxsparse.$schol_sym(
+                CS_ORDER_CHOL::$Ti,
+                Ref(view.sparse)::Ref{$sparse_ty},
+            )::Ptr{Cvoid}
+        end
+        function _cs_chol(view::_CSView{$sparse_ty,$Tv,$Ti}, S::Ptr{Cvoid})
+            return @ccall libcxsparse.$chol_sym(
+                Ref(view.sparse)::Ref{$sparse_ty},
+                S::Ptr{Cvoid},
+            )::Ptr{Cvoid}
+        end
+        function _cs_ltsolve!(::Type{$Ti}, L::Ptr{Cvoid}, b::AbstractVector{$Tv})
+            @ccall libcxsparse.$ltsolve_sym(
+                L::Ptr{Cvoid},
+                pointer(b)::Ptr{$Tv},
+            )::$Ti
+        end
+        function _cs_pvec!(
+                p::Ptr{$Ti},
+                b::AbstractVector{$Tv},
+                x::AbstractVector{$Tv},
+                n::Integer,
+            )
+            @ccall libcxsparse.$pvec_sym(
+                p::Ptr{$Ti},
+                pointer(b)::Ptr{$Tv},
+                pointer(x)::Ptr{$Tv},
+                $Ti(n)::$Ti,
+            )::$Ti
+        end
+    end
+end
+
+"""
+    cs_cholesky(A::SparseMatrixCSC) -> CSCholesky
+
+Compute the symbolic and numeric CXSparse Cholesky factorization of `A`.
+`A` must be square and symmetric (real) or Hermitian (complex) positive
+definite. Only the upper triangle of `A` is read.
+
+Throws an `ErrorException` if the factorization fails, typically because
+`A` is not positive definite (CXSparse stops at the first non-positive pivot).
+
+Solve with `F \\ b` or `ldiv!(x, F, b)`.
+"""
+function cs_cholesky end
+
+for (Tv, Ti, tag) in (
+        (Float64, Int32, :di),
+        (Float64, Int64, :dl),
+        (ComplexF64, Int32, :ci),
+        (ComplexF64, Int64, :cl),
+    )
+    sparse_ty = Symbol("cs_$(tag)")
+    @eval function cs_cholesky(A::SparseMatrixCSC{$Tv,$Ti})
+        m, n = size(A)
+        m == n || error(
+            "CXSparse cs_cholesky requires a square matrix; got $(size(A))")
+        view = _csview(A)
+        S = _cs_schol(view)
+        S == C_NULL && error("CXSparse cs_*_schol returned NULL")
+        N = _cs_chol(view, S)
+        if N == C_NULL
+            _cs_sfree($Tv, $Ti, S)
+            error("CXSparse cs_*_chol returned NULL — matrix not positive " *
+                  "definite (or not symmetric/Hermitian)?")
+        end
+        F = CSCholesky{$Tv,$Ti,$sparse_ty}(view, S, N, n)
+        finalizer(_finalize_chol!, F)
+        return F
+    end
+end
+
+function _finalize_chol!(F::CSCholesky{Tv,Ti,T_sp}) where {Tv,Ti,T_sp}
+    _cs_sfree(Tv, Ti, F.S); _cs_nfree(Tv, Ti, F.N)
+    F.S = C_NULL; F.N = C_NULL
+    return nothing
+end
+
+Base.size(F::CSCholesky) = (F.n, F.n)
+Base.size(F::CSCholesky, d::Integer) = (d == 1 || d == 2) ? F.n : 1
+
+function LinearAlgebra.ldiv!(
+        x::AbstractVector{Tv},
+        F::CSCholesky{Tv,Ti,T_sp},
+        b::AbstractVector{Tv},
+    ) where {Tv,Ti,T_sp}
+    n = F.n
+    length(b) == n || throw(DimensionMismatch(
+        "rhs has length $(length(b)), expected $n"))
+    length(x) == n || throw(DimensionMismatch(
+        "solution has length $(length(x)), expected $n"))
+    work = Vector{Tv}(undef, n)
+    Spinv = _cs_S_pinv(Tv, Ti, F.S)
+    L = _cs_N_L(Tv, Ti, F.N)
+    # work = P*b
+    _cs_ipvec!(Spinv, b, work, n)
+    # work = L \ work
+    _cs_lsolve!(Ti, L, work)
+    # work = L' \ work
+    _cs_ltsolve!(Ti, L, work)
+    # x = P' * work
+    _cs_pvec!(Spinv, work, x, n)
+    return x
+end
+
+function Base.:\(F::CSCholesky{Tv,Ti,T_sp}, b::AbstractVector{Tv}) where {Tv,Ti,T_sp}
+    x = Vector{Tv}(undef, F.n)
+    return ldiv!(x, F, b)
+end
+
 end # module
diff --git a/test/runtests.jl b/test/runtests.jl
@@ -107,14 +107,66 @@ end
     end
 end
 
+@testset "CXSparse cs_cholesky" begin
+    Random.seed!(0xCC55_F00D)
+
+    @testset "SPD ($Tv, $Ti, n=$n)" for Tv in ELTYPES,
+            Ti in IDXTYPES, n in (5, 25, 100)
+
+        # SPD (real) / Hermitian PD (complex) via M*M' + n*I.
+        Mraw = Tv <: Complex ? complex.(randn(n, n), randn(n, n)) : randn(n, n)
+        Adense = Mraw * Mraw' + Tv(n) * I
+        # Adense is dense Float64 / ComplexF64 either way; sparsify and convert.
+        A = _convert(sparse(Adense), Tv, Ti)
+        b_ = Tv <: Complex ? complex.(randn(n), randn(n)) : randn(n)
+        b = Vector{Tv}(b_)
+
+        F = cs_cholesky(A)
+        @test size(F) == (n, n)
+        @test size(F, 1) == n
+        @test size(F, 2) == n
+
+        x = F \ b
+        @test length(x) == n
+        @test norm(A * x - b) < 1e-8 * norm(b)
+
+        x_pre = zeros(Tv, n)
+        @test ldiv!(x_pre, F, b) === x_pre
+        @test x_pre ≈ x
+    end
+
+    @testset "rejects non-square ($Tv, $Ti)" for Tv in ELTYPES, Ti in IDXTYPES
+        A = _convert(sparse(randn(3, 5)), Tv, Ti)
+        @test_throws ErrorException cs_cholesky(A)
+    end
+
+    @testset "rejects non-positive-definite" begin
+        # Negative-definite — cs_chol returns NULL on the first non-positive
+        # pivot, which we surface as an error.
+        A = sparse(Float64[-1.0 0.0; 0.0 -1.0])
+        @test_throws ErrorException cs_cholesky(A)
+    end
+
+    @testset "dimension checks" begin
+        A = sparse(Float64[2.0 0.0; 0.0 2.0])
+        F = cs_cholesky(A)
+        @test_throws DimensionMismatch (F \ [1.0, 2.0, 3.0])
+        @test_throws DimensionMismatch ldiv!(zeros(3), F, [1.0, 2.0])
+    end
+end
+
 @testset "Finalizer doesn't crash" begin
     # Force GC of factorizations and ensure we don't double-free.
     for _ in 1:50
         A = sparse(randn(10, 10) + 10 * I)
+        Asym = Matrix(A) * Matrix(A)' + 10 * I
+        Asym = sparse(Asym)
         F = cs_qr(A)
         F2 = cs_lu(A)
+        F3 = cs_cholesky(Asym)
         _ = F \ randn(10)
         _ = F2 \ randn(10)
+        _ = F3 \ randn(10)
     end
     GC.gc()
     GC.gc()