
Add LAGraph_Matrix_Sum and LAGraph_Matrix_Binary_Sum utilities#407

Draft: michelp wants to merge 3 commits into stable from matrix-sum


michelp (Member) commented Jun 12, 2026

Summary

Adds two src/utility/ functions that combine an array of GrB_Matrix objects into a single matrix, using a binary operator to resolve duplicate (i,j) entries. They compute the same result via two different techniques, so they can be compared:

  • LAGraph_Matrix_Sum — concatenate-and-build: extract every matrix's tuples into one shared buffer, then a single GrB_Matrix_build.
  • LAGraph_Matrix_Binary_Sum — pairwise binary reduction tree of GrB_eWiseAdd (per the GraphChallenge "Anonymized Network Sensing" paper, Fig. 3 "Binary Summation of Traffic Matrices").

With dup = GrB_PLUS_FP64, for example, either function computes the element-wise sum of all the matrices; other operators (max, times, etc.) generalize the combination. All input matrices must have identical dimensions and the same built-in type; C is created with that type and those dimensions. Inputs with user-defined types return GrB_NOT_IMPLEMENTED.

LAGraph_Matrix_Sum (concatenate + build)

  1. Compute the total nvals across all inputs and a per-matrix prefix-sum offset.
  2. Allocate one tuple buffer (I, J, X) large enough for every entry.
  3. Extract each matrix's tuples into its disjoint buffer region — parallelized across LG_nthreads_outer threads, since the regions never overlap.
  4. GrB_Matrix_build with the provided dup operator to combine duplicate coordinates.
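The four steps above can be sketched in plain C without GraphBLAS. This is only an illustration of the concatenate-and-build idea, not the code in this PR: `tuple_t`, `coo_t`, `dup_fn`, `dup_plus`, and `concat_build` are hypothetical names, a sort plus a linear merge stands in for GrB_Matrix_build, and dup is assumed to be associative and commutative because qsort does not preserve the input order of duplicates.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical stand-in for a sparse matrix: a list of (i, j, x) tuples.
typedef struct { uint64_t i, j; double x; } tuple_t;
typedef struct { const tuple_t *t; size_t nvals; } coo_t;

// Hypothetical stand-in for a GrB_BinaryOp used as dup.
typedef double (*dup_fn)(double, double);
static double dup_plus(double a, double b) { return a + b; }

// Order tuples by row, then by column, so duplicates become adjacent.
static int cmp_tuple(const void *a, const void *b)
{
    const tuple_t *p = a, *q = b;
    if (p->i != q->i) return (p->i < q->i) ? -1 : 1;
    if (p->j != q->j) return (p->j < q->j) ? -1 : 1;
    return 0;
}

// Combine n tuple lists into one list with unique (i, j) coordinates.
// On success returns 0 and hands ownership of *out to the caller.
static int concat_build(const coo_t *mats, size_t n, dup_fn dup,
                        tuple_t **out, size_t *out_nvals)
{
    assert(mats != NULL && dup != NULL && out != NULL && out_nvals != NULL);

    // Step 1: total nvals and per-matrix prefix-sum offsets.
    size_t *offset = malloc((n + 1) * sizeof(size_t));
    if (offset == NULL) return -1;
    offset[0] = 0;
    for (size_t k = 0; k < n; k++) offset[k + 1] = offset[k] + mats[k].nvals;
    size_t total = offset[n];

    // Step 2: one buffer large enough for every entry.
    tuple_t *buf = malloc((total ? total : 1) * sizeof(tuple_t));
    if (buf == NULL) { free(offset); return -1; }

    // Step 3: copy each list into its own region. The regions are disjoint,
    // so these iterations are independent and could run in parallel.
    for (size_t k = 0; k < n; k++)
    {
        if (mats[k].nvals > 0)
            memcpy(buf + offset[k], mats[k].t, mats[k].nvals * sizeof(tuple_t));
    }
    free(offset);

    // Step 4: sort, then fold adjacent duplicates together with dup.
    qsort(buf, total, sizeof(tuple_t), cmp_tuple);
    size_t w = 0;
    for (size_t r = 0; r < total; r++)
    {
        if (w > 0 && buf[w - 1].i == buf[r].i && buf[w - 1].j == buf[r].j)
            buf[w - 1].x = dup(buf[w - 1].x, buf[r].x);
        else
            buf[w++] = buf[r];
    }
    *out = buf;
    *out_nvals = w;
    return 0;
}
```

Calling `concat_build` with `dup_plus` on two lists that share a coordinate yields one entry at that coordinate holding the sum, which mirrors what dup = GrB_PLUS_FP64 does in the build step.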

LAGraph_Matrix_Binary_Sum (pairwise binary reduction)

At each level the matrices are summed in disjoint adjacent pairs with GrB_eWiseAdd using dup; an unpaired (odd) trailing matrix is carried up to the next level unchanged, until a single matrix remains. The independent pair-sums within a level run concurrently across LG_nthreads_outer threads. Working on the smaller intermediate matrices of the tree keeps the active data in faster memory: adding two matrices with N entries each yields a matrix with at most 2N entries (fewer whenever their patterns overlap), so the relative work shrinks as the matrices grow.
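The level-by-level pairing with an odd carry can be illustrated on plain scalars. The sketch below is an assumption-laden stand-in, not the PR's implementation: `binop_fn`, `op_plus`, `op_max`, and `tree_reduce` are hypothetical names, and a `double` takes the place of each matrix so that the operator call stands in for GrB_eWiseAdd.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical stand-in for the dup binary operator.
typedef double (*binop_fn)(double, double);
static double op_plus(double a, double b) { return a + b; }
static double op_max(double a, double b) { return (a > b) ? a : b; }

// Reduce n values with a pairwise tree. Returns 0 on success, -1 on failure.
// If levels is non-NULL it receives the number of tree levels processed.
static int tree_reduce(const double *in, size_t n, binop_fn op,
                       double *result, size_t *levels)
{
    assert(in != NULL && op != NULL && result != NULL);
    if (n == 0) return -1;

    // Two buffers: pair results are written to next while cur is only read,
    // so the pair-sums within a level do not depend on each other.
    double *cur = malloc(n * sizeof(double));
    double *next = malloc(n * sizeof(double));
    if (cur == NULL || next == NULL) { free(cur); free(next); return -1; }
    memcpy(cur, in, n * sizeof(double));

    size_t lv = 0;
    while (n > 1)
    {
        size_t npairs = n / 2;
        for (size_t p = 0; p < npairs; p++)
            next[p] = op(cur[2 * p], cur[2 * p + 1]);

        // An unpaired trailing item is carried up unchanged.
        size_t m = npairs;
        if (n % 2 == 1) next[m++] = cur[n - 1];

        double *tmp = cur; cur = next; next = tmp;
        n = m;
        lv++;
    }

    *result = cur[0];
    if (levels != NULL) *levels = lv;
    free(cur);
    free(next);
    return 0;
}
```

For five inputs the levels shrink as 5, 3, 2, 1, so the carry path is exercised twice; a single input needs no levels at all.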

All intermediate matrices are tracked in a single pre-NULLed pool with deterministic per-pair slots, so cleanup is a simple sweep that is correct on any failure path. Unlike LAGraph_Matrix_Sum, dup must be non-NULL, since GrB_eWiseAdd requires a binary operator (NULL dup returns GrB_NULL_POINTER).
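One way to picture the pre-NULLed pool is the plain-C sketch below. It is a guess at the pattern rather than a copy of the PR's code: `node_t`, `node_new`, `node_free`, `live`, and `pooled_tree_sum` are hypothetical, a heap-allocated `double` stands in for each intermediate matrix, and `fail_at` simulates an allocation failure at a chosen slot. A tree over n inputs performs n - 1 pair-sums in total, so the pool has n - 1 slots, and each (level, pair) writes to a slot known in advance.

```c
#include <assert.h>
#include <stdlib.h>

// Hypothetical stand-in for an intermediate matrix.
typedef struct { double value; } node_t;

static int live = 0;   // outstanding intermediates, used only to check for leaks

static node_t *node_new(double v)
{
    node_t *p = malloc(sizeof(node_t));
    if (p != NULL) { p->value = v; live++; }
    return p;
}

static void node_free(node_t **p)
{
    if (*p != NULL) { free(*p); *p = NULL; live--; }
}

// Sum n leaves with a pairwise tree. fail_at >= 0 simulates a failed
// allocation at that pool slot; pass -1 for no failure.
static int pooled_tree_sum(const double *leaf, size_t n, long fail_at,
                           double *result)
{
    assert(leaf != NULL && result != NULL && n > 0);
    size_t pool_size = n - 1;
    node_t **pool = malloc((pool_size ? pool_size : 1) * sizeof(node_t *));
    const double **cur = malloc(n * sizeof(const double *));
    if (pool == NULL || cur == NULL) { free(pool); free(cur); return -1; }

    // Pre-NULL every slot so the cleanup sweep is valid at any point.
    for (size_t s = 0; s < pool_size; s++) pool[s] = NULL;
    for (size_t k = 0; k < n; k++) cur[k] = &leaf[k];

    int status = 0;
    size_t slot = 0;   // first slot of the current level
    size_t m = n;      // items remaining at the current level
    while (m > 1 && status == 0)
    {
        size_t npairs = m / 2;
        for (size_t p = 0; p < npairs; p++)
        {
            size_t s = slot + p;   // deterministic slot for this pair
            pool[s] = ((long) s == fail_at)
                ? NULL : node_new(*cur[2 * p] + *cur[2 * p + 1]);
            if (pool[s] == NULL) { status = -1; break; }
        }
        if (status != 0) break;

        // Next level: pair results first, then the carried odd item.
        for (size_t p = 0; p < npairs; p++) cur[p] = &pool[slot + p]->value;
        if (m % 2 == 1) cur[npairs] = cur[m - 1];
        slot += npairs;
        m = npairs + (m % 2);
    }
    if (status == 0) *result = *cur[0];

    // One sweep frees everything, on success and on failure alike.
    for (size_t s = 0; s < pool_size; s++) node_free(&pool[s]);
    free(pool);
    free(cur);
    return status;
}
```

Because unused slots stay NULL, the same sweep is correct whether the failure happens at the first pair or the last, which is the property the malloc-failure tests are meant to exercise.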

Changes

  • src/utility/LAGraph_Matrix_Sum.c — concatenate/build implementation.
  • src/utility/LAGraph_Matrix_Binary_Sum.c — binary-reduction implementation.
  • Config/LAGraph.h.in — public declarations (include/LAGraph.h is generated from this template).
  • src/test/test_Matrix_Sum.c, src/test/test_Matrix_Binary_Sum.c — tests.

Testing

ctest -R Matrix_Sum and ctest -R Matrix_Binary_Sum pass; clean builds with OpenMP on and off. The test binaries cover:

  • correctness (vs GrB_eWiseAdd, single-matrix copy, empty-matrix inclusion)
  • every built-in type branch
  • odd counts exercising the binary-reduction carry path
  • parallel reduction/extraction with multiple outer threads
  • error handling (NULL args, nmatrices == 0, NULL array entry, dimension/type mismatch, UDT, and NULL dup for the binary variant)
  • brutal malloc-failure variants exercising the cleanup paths

LAGraph_Matrix_Sum combines an array of GrB_Matrix objects into a single
matrix using a binary operator to resolve duplicate entries.  It computes
the total number of entries across all inputs, allocates one shared tuple
buffer (I, J, X), extracts the tuples of each input matrix into the buffer,
and calls GrB_Matrix_build with the dup operator to combine duplicate (i,j)
coordinates.  With dup = GrB_PLUS_FP64 (for example) this computes the
element-wise sum of all the matrices.

All input matrices must have identical dimensions and the same built-in
type; user-defined types return GrB_NOT_IMPLEMENTED.

Includes a test (src/test/test_Matrix_Sum.c) covering correctness, all
built-in type branches, error handling, and a brutal malloc-failure variant.

michelp marked this pull request as draft June 12, 2026 01:26

The per-matrix extraction loop precomputes an offset (prefix-sum) array so
each matrix's tuples occupy a disjoint region of the shared (I, J, X) buffer.
That removes the loop-carried offset dependency and lets the extraction run
across LG_nthreads_outer threads with OpenMP; each GrB_Matrix_extractTuples
is still parallelized internally by GraphBLAS with LG_nthreads_inner threads
(the two-level nested model).  The public signature is unchanged: the thread
count follows the usual LAGraph convention via LAGraph_SetNumThreads.

Because GRB_TRY cannot return out of an OpenMP region, the first extraction
error is captured under a critical section and checked after the loop.
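The capture-then-check pattern described here can be sketched as follows. This is a generic illustration under assumptions, not the PR's loop: `task_fn`, `run_all_capture_first_error`, and `demo_task` are hypothetical, and the pragmas are guarded so the code also builds with OpenMP off. With several threads, "first" means the first error to reach the critical section, which is not necessarily the one with the lowest index.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical per-item task: returns 0 on success, an error code otherwise.
typedef int (*task_fn)(size_t k, void *ctx);

// Run task(k) for k = 0..n-1 and return the first error recorded, or 0.
// No thread returns from inside the parallel region; errors are only stored.
static int run_all_capture_first_error(size_t n, task_fn task, void *ctx)
{
    assert(task != NULL);
    int first_error = 0;

#if defined(_OPENMP)
    #pragma omp parallel for schedule(dynamic)
#endif
    for (long k = 0; k < (long) n; k++)
    {
        int info = task((size_t) k, ctx);
        if (info != 0)
        {
#if defined(_OPENMP)
            #pragma omp critical
#endif
            {
                // Keep only the first error that arrives here.
                if (first_error == 0) first_error = info;
            }
        }
    }

    // Checked after the loop, where returning to the caller is safe.
    return first_error;
}

// Demo task (hypothetical): fails with code 7 on the index stored in ctx.
static int demo_task(size_t k, void *ctx)
{
    size_t bad = *(const size_t *) ctx;
    return (k == bad) ? 7 : 0;
}
```

In this sketch the remaining iterations still run after an error is recorded; the caller sees the stored code once the loop finishes and can then take its normal cleanup path.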

Adds test_Matrix_Sum_parallel, which sums many overlapping matrices with
multiple outer threads and compares against an independently accumulated
result.

Adds LAGraph_Matrix_Binary_Sum, which combines an array of matrices into
a single matrix with the same interface and result as LAGraph_Matrix_Sum
but a different technique: a pairwise binary reduction tree built from
GrB_eWiseAdd (per the GraphChallenge "Anonymized Network Sensing" paper,
"Binary Summation of Traffic Matrices"), rather than one large
extract/build.

At each level the matrices are summed in disjoint adjacent pairs using
the dup operator; an unpaired (odd) trailing matrix is carried up to the
next level unchanged, until a single matrix remains. The independent
pair-sums within a level are parallelized across LG_nthreads_outer
threads. Working on the smaller intermediate matrices of the tree keeps
the active data in faster memory.

All intermediate matrices are tracked in a single pre-NULLed Pool array
with deterministic per-pair slots, so cleanup is a simple sweep that is
correct on any failure path. Unlike LAGraph_Matrix_Sum, dup must be
non-NULL since GrB_eWiseAdd requires a binary operator.

Includes tests for correctness, odd counts (carry path), all built-in
types, parallel reduction with multiple outer threads, error handling,
and a brutal malloc-failure variant.

michelp changed the title from "Add LAGraph_Matrix_Sum utility" to "Add LAGraph_Matrix_Sum and LAGraph_Matrix_Binary_Sum utilities" Jun 25, 2026

// funding and support from the U.S. Government (see Acknowledgments.txt file).
// DM22-0790

// Contributed by Michel Pelletier.

Review comment from the PR author (Member):

Original idea by Tim Davis.
