feat: restore BigClam compatibility with networkx>=3.0 (no karateclub library dependency) #257

NouamaneA wants to merge 1 commit into GiulioRossetti:master from
Conversation
Pull request overview
Restores algorithms.big_clam using CDlib’s internal BigClam implementation compatible with NetworkX ≥ 3.0, removing the prior karateclub dependency and re-enabling API/test coverage for the algorithm.
Changes:
- Re-enable big_clam in overlapping_partition and route it to the internal BIGCLAM implementation.
- Update internal BigClam to use nx.to_numpy_array(), add a "fast" gradient variant, and add a threshold-based affiliation mode for overlapping outputs.
- Add/restore a unit test calling algorithms.big_clam().
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| cdlib/algorithms/overlapping_partition.py | Reintroduces the public big_clam() API and exports it via __all__. |
| cdlib/algorithms/internal/BIGCLAM.py | Updates adjacency conversion for NetworkX 3+, adds a faster gradient and a threshold affiliation option. |
| cdlib/test/test_community_discovery_models.py | Adds a basic runtime/type test for algorithms.big_clam(). |
```python
        "naive": naive,
        "affiliation_method": affiliation_method,
    },
    overlap=True,
```
NodeClustering(overlap=True) is always set, but with affiliation_method="argmax" the output is a disjoint partition. This flag is used by downstream evaluation/serialization, so it should reflect the actual result (e.g., overlap=(affiliation_method == "threshold")).
```diff
- overlap=True,
+ overlap=(affiliation_method == "threshold"),
```
```python
n, m = graph.number_of_nodes(), graph.number_of_edges()
epsilon = 2 * m / (n * (n - 1))
```
In the threshold affiliation path, epsilon = 2*m/(n*(n-1)) will divide by zero when n < 2, and epsilon can hit 1 (complete graph) leading to log(0) and delta=inf. Please guard small graphs and clamp epsilon to a safe range (or use nx.density() and cap to < 1.0) to avoid runtime errors/inf thresholds.
```diff
- n, m = graph.number_of_nodes(), graph.number_of_edges()
- epsilon = 2 * m / (n * (n - 1))
+ n = graph.number_of_nodes()
+ if n < 2:
+     epsilon = 0.0
+ else:
+     epsilon = nx.density(graph)
+     epsilon = min(max(epsilon, 0.0), np.nextafter(1.0, 0.0))
```
```python
list_communities = []
for com in dict_communities:
    list_communities.append(dict_communities[com])
```
get_communities() currently returns a fixed-length list with potentially many empty communities (especially with argmax when some communities get no assignments, or with threshold when delta is high). Consider filtering out empty communities before returning to match other algorithms in this module and avoid downstream metrics having to handle empty clusters.
```diff
- list_communities = []
- for com in dict_communities:
-     list_communities.append(dict_communities[com])
+ list_communities = [members for members in dict_communities.values() if members]
```
```python
self.assertEqual(type(com), list)
if len(com) > 0:
    self.assertEqual(type(com[0]), int)
```
big_clam now has new behavior knobs (naive, affiliation_method incl. the overlapping 'threshold' mode), but the test only exercises the default settings and does not cover the threshold-based overlapping output or invalid affiliation_method values. Adding focused assertions for these branches would help prevent regressions.
```python
threshold_coms = algorithms.big_clam(
    g, naive=True, affiliation_method="threshold"
)
self.assertEqual(type(threshold_coms.communities), list)
if len(threshold_coms.communities) > 0:
    for com in threshold_coms.communities:
        self.assertEqual(type(com), list)
        if len(com) > 0:
            self.assertEqual(type(com[0]), int)
with self.assertRaises(ValueError):
    algorithms.big_clam(g, affiliation_method="unsupported")
```
```python
coms = big_clam_communities(
    g_original,
    number_communities=dimensions,
    iterations=iterations,
    learning_rate=learning_rate,
    naive=naive,
    affiliation_method=affiliation_method,
)
```
big_clam advertises support for networkx/igraph inputs, but it passes g_original directly into big_clam_communities(), which calls nx.to_numpy_array() and will fail for igraph graphs. Convert the input with convert_graph_formats(g_original, nx.Graph) (and use the converted graph for the internal call) to keep behavior consistent with the other algorithms in this module.
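A minimal sketch of the suggested normalization. In CDlib the helper is `convert_graph_formats` from `cdlib.utils`; the stand-in below is illustrative and assumes only that non-networkx inputs (e.g. igraph) expose `get_edgelist()`:

```python
import networkx as nx

def ensure_nx_graph(graph):
    """Illustrative stand-in for cdlib's convert_graph_formats(g, nx.Graph):
    return the graph unchanged if it is already a networkx.Graph, otherwise
    rebuild it from its edge list (igraph graphs expose get_edgelist())."""
    if isinstance(graph, nx.Graph):
        return graph
    if hasattr(graph, "get_edgelist"):
        return nx.Graph(graph.get_edgelist())
    raise TypeError(f"Unsupported graph type: {type(graph)!r}")

# The normalized graph can then be passed safely to nx.to_numpy_array().
```

With this conversion applied before the internal call, igraph inputs behave the same as for the other algorithms in the module.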
```python
grad = np.zeros((C,))
for nb in neighbours:
    dotproduct = F[nb].dot(F[i])
    grad += F[nb] * (1 + sigm(dotproduct))
grad -= np.sum(F, axis=0)
grad += F[i]
```
gradient_fast() recomputes np.sum(F, axis=0) for every node update, which is O(N*C) per node and can erase the intended speedup on sparse graphs. Consider computing the global sum once per outer iteration (or maintaining it incrementally as F updates) and passing it into gradient_fast.
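A hedged sketch of the suggested refactor: compute the column sum of F once per outer iteration and pass it in, rather than recomputing `np.sum(F, axis=0)` inside every node update. The names mirror the PR's `gradient_fast`/`sigm`, but the exact signatures here are assumptions:

```python
import numpy as np

def sigm(x):
    # BigClam gradient helper: exp(-x) / (1 - exp(-x))
    return np.exp(-x) / (1.0 - np.exp(-x))

def gradient_fast(F, i, neighbours, F_sum):
    """Gradient for node i. F_sum is np.sum(F, axis=0) computed once per
    outer iteration (or maintained incrementally), not once per node."""
    grad = np.zeros(F.shape[1])
    for nb in neighbours:
        grad += F[nb] * (1 + sigm(F[nb].dot(F[i])))
    grad -= F_sum   # global affiliation sum, shared by every node update
    grad += F[i]    # add back node i's own row, which F_sum includes
    return grad

# Per outer iteration:
#   F_sum = np.sum(F, axis=0)
# and after updating row i, maintain it incrementally:
#   F_sum += F[i] - old_row_i
```

This keeps the per-node cost proportional to the node's degree instead of O(N*C), which is where the "fast" variant's speedup on sparse graphs comes from.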
```python
    for com in np.nonzero(membership)[0].tolist():
        dict_communities[com].append(node)
else:
    raise ValueError("Method not supported")
```
The raised ValueError("Method not supported") is hard to act on and drops the invalid value. Include the provided method and the allowed values (e.g., 'argmax'/'threshold') so callers can quickly correct their input.
```diff
- raise ValueError("Method not supported")
+ raise ValueError(
+     f"Method '{method}' not supported. Allowed values are: 'argmax', 'threshold'."
+ )
```
```diff
  F[person] += learning_rate * grad

- F[person] = np.maximum(0.001, F[person])  # F should be nonnegative
- log_likelihood(F, A)
+ F[person] = np.maximum(0.00001, F[person])  # F should be nonnegative
+ # log_likelihood(F, A)
```
Lowering the non-negativity floor from 0.001 to 1e-5 increases the chance that F_u^T F_v becomes extremely small, which makes sigm() blow up (division by ~0) and can produce inf/unstable gradients. Consider keeping the previous floor or making sigm() numerically safe (e.g., via expm1 and clipping) to prevent divergence.
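One way to make the helper numerically safe, assuming the standard BigClam form sigm(x) = exp(-x) / (1 - exp(-x)) (the exact definition in the PR is not shown here): compute the denominator with `expm1` and clip the argument away from zero.

```python
import numpy as np

def sigm_safe(x, floor=1e-10):
    """Numerically safe variant of sigm(x) = exp(-x) / (1 - exp(-x)).
    As x -> 0 the denominator vanishes; -np.expm1(-x) evaluates
    1 - exp(-x) accurately, and clipping x bounds the result."""
    x = np.maximum(x, floor)          # keep the denominator away from zero
    return np.exp(-x) / -np.expm1(-x)
```

With this guard in place, the lower non-negativity floor on F no longer risks producing inf gradients when F_u^T F_v is tiny.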
```python
# if len(coms.communities) > 0:
#     self.assertEqual(type(coms.communities[0]), list)
#     self.assertEqual(type(coms.communities[0][0]), int)
def test_bigClam(self):
```
The test name test_bigClam is inconsistent with the rest of this file’s snake_case test naming (e.g., test_markov_clustering, test_multicom). Renaming to test_big_clam improves consistency and discoverability in test reports.
```diff
- def test_bigClam(self):
+ def test_big_clam(self):
```
Problem

BigClam was commented out after support for karateclub ceased. The former internal implementation did not support networkx>=3.0 because it relied on nx.to_numpy_matrix(), which was removed in that release.

Related Issue: #245
Solution

Reusing the karateclub library was not considered here, since it no longer appears to be maintained. Instead:

- nx.to_numpy_matrix() -> nx.to_numpy_array() (returns a plain ndarray instead of np.matrix)

Testing

Tested with networkx 3.6.1 and Python 3.13. The test function is implemented in the test_community_discovery_models.py file.
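The migration described above amounts to a one-line change; a minimal illustration:

```python
import networkx as nx
import numpy as np

g = nx.karate_club_graph()

# Before networkx 3.0 (API removed in 3.0):
#   A = nx.to_numpy_matrix(g)   # returned np.matrix
# Replacement used by this PR:
A = nx.to_numpy_array(g)        # returns a plain ndarray

assert isinstance(A, np.ndarray)
```

Downstream code that relied on np.matrix semantics (e.g. `*` meaning matrix multiplication) must use explicit `@`/`dot` calls with the ndarray result.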