feat(community): implement Leiden community detection algorithm #401

Open
gkorland wants to merge 3 commits into GraphBLAS:v1.3.x from gkorland:feat/leiden-algorithm

@gkorland gkorland commented Apr 26, 2026

Summary

Implements the Leiden community detection algorithm (Traag et al., 2019) as an experimental LAGraph algorithm. Leiden is the successor to Louvain and guarantees well-connected communities by adding a Refinement phase between the Local-Move and Aggregation phases.

Changes

  • experimental/algorithm/LAGraph_Leiden.c (new) - full 3-phase multi-level implementation
    • Phase 1 (Local Move): greedy per-node modularity maximisation
    • Phase 2 (Refinement): restricted moves from singletons within each Phase-1 parent community
    • Phase 3 (Aggregation): builds coarsened graph A_agg = S^T * A_cur * S; initial partition at each level induced by the previous Phase-1 partition (Leiden invariant)
    • Output is the LAST level's Phase-1 partition projected back to original nodes (per Traag et al. Algorithm A.2)
  • experimental/test/test_leiden.c (new) - acutest test on karate.mtx; asserts Q > 0.37
  • include/LAGraphX.h - added LAGRAPHX_PUBLIC declaration for LAGraph_Leiden
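The aggregation step A_agg = S^T * A_cur * S can be illustrated with a small dense sketch in plain C. This is not the code from LAGraph_Leiden.c: it assumes S is the 0/1 membership matrix (S(i,c) = 1 exactly when node i belongs to community c), and the function name `aggregate_dense` and the row-major dense layout are hypothetical choices for illustration only.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical dense illustration of A_agg = S^T * A * S, assuming
// S(i,c) = 1 iff comm[i] == c.  With that S, the triple product reduces
// to summing A(i,j) into the (comm[i], comm[j]) entry of A_agg.
//   a     : n-by-n adjacency matrix, row-major
//   comm  : community label of each node, each in 0..k-1
//   a_agg : k-by-k output, row-major
void aggregate_dense (const double *a, size_t n,
                      const size_t *comm, size_t k, double *a_agg)
{
    for (size_t p = 0 ; p < k * k ; p++)
    {
        a_agg [p] = 0.0 ;
    }
    for (size_t i = 0 ; i < n ; i++)
    {
        assert (comm [i] < k) ;
        for (size_t j = 0 ; j < n ; j++)
        {
            // intra-community weight lands on the diagonal of a_agg
            a_agg [comm [i] * k + comm [j]] += a [i * n + j] ;
        }
    }
}
```

For two triangles joined by a single edge and grouped as one community per triangle, this yields a 2-by-2 matrix with 6 on the diagonal (each of the 3 internal edges appears twice in a symmetric A) and 1 off the diagonal. The sum of all entries is unchanged by the aggregation.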

Testing

  • Compiles without warnings under -Wall -Wextra
  • All tests pass: Q = 0.4188 on karate.mtx (within Louvain/Leiden benchmark range 0.37-0.42)
  • Test threshold is set to Q > 0.37, the lower end of the benchmark range, to leave a safety margin below the measured value
  • JIT disabled in test to work around environment limitation
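For readers who want to reproduce a Q value by hand, here is a minimal sketch of a modularity computation. It assumes Q is the standard Newman modularity on an undirected graph with a symmetric adjacency matrix. The name `modularity_dense` and the dense row-major input are hypothetical and are not taken from test_leiden.c.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical dense modularity check, assuming the standard definition
//   Q = sum over communities c of  in_c/(2m) - (tot_c/(2m))^2
// where 2m is the sum of all entries of the symmetric matrix a,
// in_c is the sum of a(i,j) with both i and j in c, and
// tot_c is the sum of the degrees of the nodes in c.
double modularity_dense (const double *a, size_t n,
                         const size_t *comm, size_t k)
{
    double two_m = 0.0 ;
    for (size_t p = 0 ; p < n * n ; p++)
    {
        two_m += a [p] ;
    }
    assert (two_m > 0.0) ;

    double q = 0.0 ;
    for (size_t c = 0 ; c < k ; c++)
    {
        double in_c = 0.0, tot_c = 0.0 ;
        for (size_t i = 0 ; i < n ; i++)
        {
            if (comm [i] != c) continue ;
            for (size_t j = 0 ; j < n ; j++)
            {
                tot_c += a [i * n + j] ;
                if (comm [j] == c) in_c += a [i * n + j] ;
            }
        }
        q += in_c / two_m - (tot_c / two_m) * (tot_c / two_m) ;
    }
    return (q) ;
}
```

On two triangles joined by one edge, with one community per triangle, each community has in_c = 6 and tot_c = 7 with 2m = 14, so Q = 2 * (6/14 - 1/4) = 5/14.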

Memory / Performance Impact

  • All C scratch arrays are allocated once with size n; each level uses only indices 0..n_cur-1, so no reallocation occurs between levels
  • GraphBLAS matrices A_agg, S_mat, A_temp are created/freed per level
  • LG_FREE_WORK / LG_FREE_ALL cover all allocations; no leaks

Related Issues

Traag, V.A., Waltman, L. & van Eck, N.J. (2019). From Louvain to Leiden: guaranteeing well-connected communities. Scientific Reports 9, 5233.

gkorland and others added 3 commits April 25, 2026 23:35
Add experimental implementation of the Leiden algorithm (Traag et al.,
Scientific Reports 9:5233, 2019) — the successor to Louvain that
guarantees well-connected communities.

New files:
- experimental/algorithm/LAGraph_Leiden.c: two-phase implementation
  * Phase 1 (Local Move): greedy node-to-community assignment maximising
    modularity gain score(i→c) = T[c] - k[i]*k_comm[c]/m
  * Phase 2 (Refinement): Leiden's key addition — restart each node in a
    singleton sub-community and restrict moves to within the Phase-1
    parent community, guaranteeing well-connectedness
  * Phase 3 (Aggregation): documented as TODO for multi-level extension
  * Output: GrB_Vector of INT64 community labels 0..K-1
- experimental/test/test_leiden.c: acutest test using karate.mtx;
  verifies all nodes labelled, labels in range, and Q > 0

Updated files:
- include/LAGraphX.h: add LAGRAPHX_PUBLIC declaration for LAGraph_Leiden

Algorithm is auto-discovered by experimental/CMakeLists.txt glob.
Tested on karate.mtx (Q ≈ 0.238, 34 nodes, all checks pass).

Ref: https://doi.org/10.1038/s41598-019-41695-z

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Wrap Phase 1 + Phase 2 in an outer aggregation loop that iterates
  until no further coarsening occurs
- Build aggregate graph A_agg = S^T * A_cur * S after each level using
  GrB_mxm with GrB_PLUS_TIMES_SEMIRING_FP64
- Carry forward Phase-1 partition as initial partition for each
  aggregate level (Leiden invariant: init_comm[r] = Phase-1 parent of
  refined community r) rather than reinitialising from singletons
- Skip self-loops (j == i) in both Phase 1 and Phase 2 move scoring;
  A_agg has diagonal entries representing intra-community weight
- Recreate GrB_Vector k_vec and v per level to match current n_cur
- Track o_comm[i] (original node → final community) across all levels
  via composition o_comm[i] = c_ref[o_comm[i]] after each relabeling
- m (total edge weight / 2) is computed once from G->A and held
  constant; invariant under the S^T*A*S aggregation
- Modularity Q on karate.mtx improves from ~0.24 (single-level) to
  ~0.34 (multi-level); update test threshold to Q > 0.3
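The label composition o_comm[i] = c_ref[o_comm[i]] described above can be sketched as a small standalone helper. The function name `project_labels` and its signature are hypothetical. The sketch assumes that c_ref maps each node of the current level (0..n_cur-1) to its relabeled community, and that o_comm starts as the identity at level 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper: compose the original-node labels with the
// relabeling produced at the current level.
//   o_comm : for each of the n original nodes, its node id in the
//            current-level graph (updated in place)
//   c_ref  : for each of the n_cur current-level nodes, its community
void project_labels (int64_t *o_comm, size_t n,
                     const int64_t *c_ref, size_t n_cur)
{
    for (size_t i = 0 ; i < n ; i++)
    {
        // every original node must point at a valid current-level node
        assert (o_comm [i] >= 0 && (size_t) o_comm [i] < n_cur) ;
        o_comm [i] = c_ref [o_comm [i]] ;
    }
}
```

Calling the helper once per level keeps o_comm pointing from original nodes to the communities of the most recent level, so no per-level mapping needs to be stored.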

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…s review

Code review on PR GraphBLAS#401 identified three issues; this commit addresses all
three.

- fix: change modularity-gain penalty from k_i*k_comm[c]/m to
  k_i*k_comm[c]/(2m) in both Phase 1 and Phase 2.  The standard Blondel
  modularity-gain formula uses 2m as the denominator; using m doubled the
  penalty term and biased the algorithm toward over-fragmenting
  communities.  Q on karate.mtx improves from 0.345 to 0.419, matching
  published Louvain/Leiden benchmarks (0.37-0.42).  Updated test
  threshold to Q > 0.37.
- fix: hoist A_new declaration to function scope and add it to
  LG_FREE_WORK so it is freed if the second mxm call returns an error.
  Previously A_new could leak on the error path.
- docs: clarify in the header comment that the refinement phase is the
  greedy variant (not the randomized procedure from Traag et al. 2019).
  The seed parameter is reserved for a future randomized refinement step;
  added an explicit (void) seed cast to silence the unused-parameter
  warning.
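The corrected score can be written as a one-line helper. This is a sketch, not the code in LAGraph_Leiden.c: the name `move_score` and its parameters are hypothetical, with `t_c` standing for T[c] (weight from node i into community c), `k_i` for k[i], `k_comm_c` for k_comm[c], and `m` for the total edge weight as defined in the earlier commit.

```c
#include <assert.h>

// Hypothetical helper: relative modularity gain for moving node i into
// community c, using 2m as the denominator of the penalty term.
// Dividing by m instead of 2m would double the penalty.
double move_score (double t_c, double k_i, double k_comm_c, double m)
{
    assert (m > 0.0) ;
    return (t_c - (k_i * k_comm_c) / (2.0 * m)) ;
}
```

With t_c = 2, k_i = 3, k_comm_c = 4, and m = 6, the penalty is 12/12 = 1 and the score is 1. The earlier k_i*k_comm[c]/m form would give a penalty of 2 and a score of 0, so the move would no longer look beneficial.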

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@DrTimothyAldenDavis
Member

Thanks! Can you move the PR to the 1.3.x branch? That's the latest development version.

@gkorland gkorland changed the base branch from stable to v1.3.x April 27, 2026 03:50