⚡️ Speed up method Graph.topologicalSort by 63% #886

Closed
codeflash-ai[bot] wants to merge 1 commit into main from codeflash/optimize-Graph.topologicalSort-mhpg8x3o

@codeflash-ai codeflash-ai Bot commented Nov 7, 2025

📄 63% (0.63x) speedup for Graph.topologicalSort in code_to_optimize/topological_sort.py

⏱️ Runtime: 9.76 milliseconds → 5.97 milliseconds (best of 20 runs)

📝 Explanation and details

The optimization achieves a 63% speedup by replacing expensive O(n) list insertions with efficient O(1) appends, then performing a single reverse operation.

Key optimization:

  • Original: Used stack.insert(0, v) which requires shifting all existing elements, costing O(n) per insertion
  • Optimized: Uses stack.append(v) (O(1)) followed by stack.reverse() once at the end

Why this works:
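A DFS-based topological sort emits nodes in reverse post-order: each node is recorded only after all of its descendants have finished, so nodes with no outgoing edges are recorded first and must end up last in the final order. The original code achieved this by prepending each node, which shifts the whole list on every insertion. The optimized version appends nodes during traversal and reverses the list once at the end, which yields the same order while reducing the total insertion cost from O(n²) to O(n).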
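
As a minimal sketch of the change described above (not the PR's actual source, which is not shown on this page), the following compares the two strategies on a small DAG. The class name SketchGraph and its method names are hypothetical; only the graph attribute and the insert(0, v) versus append(v) plus reverse() calls come from the description.

```python
from collections import defaultdict


class SketchGraph:
    """Hypothetical stand-in for the PR's Graph class (an assumption, not the real code)."""

    def __init__(self, vertices):
        self.V = vertices
        self.graph = defaultdict(list)  # adjacency list: vertex -> list of successors

    def _dfs(self, v, visited, out, prepend):
        visited[v] = True
        for w in self.graph[v]:
            if not visited[w]:
                self._dfs(w, visited, out, prepend)
        # v is recorded only after all of its descendants (post-order).
        if prepend:
            out.insert(0, v)  # O(len(out)): shifts every existing element
        else:
            out.append(v)     # amortized O(1)

    def _run(self, prepend):
        visited = [False] * self.V
        out = []
        for v in range(self.V):
            if not visited[v]:
                self._dfs(v, visited, out, prepend)
        return out

    def topo_prepend(self):
        """Original strategy: prepend each finished node."""
        return self._run(prepend=True)

    def topo_append_reverse(self):
        """Optimized strategy: append each finished node, reverse once."""
        out = self._run(prepend=False)
        out.reverse()  # single O(n) pass
        return out


if __name__ == "__main__":
    g = SketchGraph(4)
    g.graph[0].extend([1, 2])  # 0 -> 1, 0 -> 2
    g.graph[1].append(3)       # 1 -> 3
    g.graph[2].append(3)       # 2 -> 3
    print(g.topo_prepend())         # [0, 2, 1, 3]
    print(g.topo_append_reverse())  # [0, 2, 1, 3]
```

Both methods return the same list because prepending each finished node builds the reversed post-order directly, while appending builds the post-order and the final reverse() flips it. If the result had to be consumed incrementally, collections.deque.appendleft would be another O(1) option, though that is not what this PR does.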

Performance impact by graph type:

  • Large graphs benefit most: Test cases with 1000 nodes show roughly 60-74% speedups (e.g., test_large_disconnected_graph: 208μs → 124μs)
  • Small graphs see no meaningful change: Basic test cases land within about ±10% of the original (e.g., test_topological_sort_2: 14.2μs → 15.8μs; test_duplicate_edges: 5.96μs → 5.50μs), as fixed overhead dominates for small inputs
  • Dense graphs also gain: test_large_dense_dag (100 nodes, an edge i → j for every i < j) improves from 103μs to 70.1μs (48.2% faster)

The line profiler is consistent with this: stack.insert(0, v) accounted for 1.7% of total time at 374.4ns per hit, while stack.append(v) accounts for 1.2% at 318.3ns per hit, even with the added reverse operation.
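
The per-insertion difference can also be checked in isolation with a small timeit micro-benchmark. This is an illustrative sketch that is independent of the Graph class; the function names are made up for this example, and absolute timings will vary by machine and Python version.

```python
import timeit


def build_by_prepend(n):
    """Build [n-1, ..., 1, 0] by inserting at the front (O(n) per insert)."""
    out = []
    for v in range(n):
        out.insert(0, v)
    return out


def build_by_append_reverse(n):
    """Build the same list by appending (amortized O(1)) and reversing once."""
    out = []
    for v in range(n):
        out.append(v)
    out.reverse()
    return out


def compare(n, repeats=5):
    """Return the best-of-`repeats` wall time in seconds for each strategy."""
    t_prepend = min(timeit.repeat(lambda: build_by_prepend(n), number=1, repeat=repeats))
    t_append = min(timeit.repeat(lambda: build_by_append_reverse(n), number=1, repeat=repeats))
    return t_prepend, t_append


if __name__ == "__main__":
    for n in (10, 1_000, 20_000):
        t_prepend, t_append = compare(n)
        print(f"n={n:>6}: prepend {t_prepend * 1e6:9.1f} us, "
              f"append+reverse {t_append * 1e6:9.1f} us")
```

The gap between the two columns should widen as n grows, since the prepend variant does work proportional to the current list length on every insertion.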

This optimization is particularly valuable for algorithms processing large graphs where topological sorting is called frequently.

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 8 Passed |
| 🌀 Generated Regression Tests | 72 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 2 Passed |
| 📊 Tests Coverage | 100.0% |
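
The generated tests on this page describe ordering properties in comments (for example "0 must come before all others") rather than showing explicit checks. A hedged sketch of what such a check could look like follows; the helper name is_valid_topological_order is hypothetical and is not part of the PR or of Codeflash, and it assumes the returned order is a list of integer vertex ids.

```python
def is_valid_topological_order(order, edges, num_vertices):
    """Return True if `order` is a topological order of the given DAG.

    order        -- list of vertex ids, as returned by a topological sort
    edges        -- iterable of (u, v) pairs meaning u -> v
    num_vertices -- vertices are assumed to be 0 .. num_vertices - 1
    """
    # Every vertex must appear exactly once.
    if sorted(order) != list(range(num_vertices)):
        return False
    # Every edge must point forward in the order.
    position = {v: i for i, v in enumerate(order)}
    return all(position[u] < position[v] for u, v in edges)


if __name__ == "__main__":
    edges = [(0, 1), (0, 2), (1, 3), (2, 3)]
    print(is_valid_topological_order([0, 2, 1, 3], edges, 4))  # True
    print(is_valid_topological_order([1, 0, 2, 3], edges, 4))  # False
```

A property check like this accepts any valid order, so it would pass for both the original and the optimized implementation even if a future change altered which valid order is produced.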
⚙️ Existing Unit Tests and Runtime
| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|---|---|---|---|
| test_topological_sort.py::test_topological_sort | 6.38μs | 6.71μs | -4.96%⚠️ |
| test_topological_sort.py::test_topological_sort_2 | 14.2μs | 15.8μs | -10.1%⚠️ |
| test_topological_sort.py::test_topological_sort_3 | 7.84ms | 4.75ms | 65.0%✅ |
🌀 Generated Regression Tests and Runtime
import uuid
from collections import defaultdict

# imports
import pytest  # used for our unit tests
from code_to_optimize.topological_sort import Graph

# unit tests

# ----------- BASIC TEST CASES ------------

def test_simple_linear_graph():
    # Graph: 0 -> 1 -> 2 -> 3
    g = Graph(4)
    g.graph[0].append(1)
    g.graph[1].append(2)
    g.graph[2].append(3)
    result, sort_id = g.topologicalSort() # 5.88μs -> 6.00μs (2.08% slower)

def test_simple_branching_graph():
    # Graph: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3
    g = Graph(4)
    g.graph[0].append(1)
    g.graph[0].append(2)
    g.graph[1].append(3)
    g.graph[2].append(3)
    result, _ = g.topologicalSort() # 5.67μs -> 5.67μs (0.000% faster)

def test_disconnected_graph():
    # Graph: 0 -> 1, 2 -> 3 (two disconnected components)
    g = Graph(4)
    g.graph[0].append(1)
    g.graph[2].append(3)
    result, _ = g.topologicalSort() # 5.54μs -> 5.54μs (0.000% faster)

def test_graph_with_no_edges():
    # Graph with 3 nodes, no edges
    g = Graph(3)
    result, _ = g.topologicalSort() # 5.33μs -> 5.38μs (0.781% slower)

def test_graph_with_multiple_edges():
    # Graph: 0 -> 1, 0 -> 2, 1 -> 2
    g = Graph(3)
    g.graph[0].append(1)
    g.graph[0].append(2)
    g.graph[1].append(2)
    result, _ = g.topologicalSort() # 5.50μs -> 5.33μs (3.13% faster)

# ----------- EDGE TEST CASES ------------

def test_empty_graph():
    # Graph with 0 nodes
    g = Graph(0)
    result, _ = g.topologicalSort() # 4.38μs -> 4.42μs (0.951% slower)

def test_single_node_graph():
    # Graph with 1 node, no edges
    g = Graph(1)
    result, _ = g.topologicalSort() # 5.04μs -> 4.88μs (3.43% faster)

def test_graph_with_cycle():
    # Graph: 0 -> 1 -> 2 -> 0 (cycle)
    g = Graph(3)
    g.graph[0].append(1)
    g.graph[1].append(2)
    g.graph[2].append(0)
    # The current implementation does NOT detect cycles,
    # so it will return some order, but it's not a valid topological sort.
    # We check that all nodes are present, but cannot guarantee order.
    result, _ = g.topologicalSort() # 5.29μs -> 5.25μs (0.781% faster)

def test_graph_with_self_loop():
    # Graph: 0 -> 0 (self loop)
    g = Graph(1)
    g.graph[0].append(0)
    result, _ = g.topologicalSort() # 4.83μs -> 4.62μs (4.52% faster)

def test_graph_with_duplicate_edges():
    # Graph: 0 -> 1 (twice)
    g = Graph(2)
    g.graph[0].append(1)
    g.graph[0].append(1)
    result, _ = g.topologicalSort() # 5.17μs -> 4.96μs (4.20% faster)

def test_graph_with_isolated_nodes():
    # Graph: 0 -> 1, 2 (isolated), 3 (isolated)
    g = Graph(4)
    g.graph[0].append(1)
    result, _ = g.topologicalSort() # 5.42μs -> 5.33μs (1.56% faster)

def test_graph_with_reverse_edges():
    # Graph: 1 -> 0
    g = Graph(2)
    g.graph[1].append(0)
    result, _ = g.topologicalSort() # 5.12μs -> 5.12μs (0.000% faster)

# ----------- LARGE SCALE TEST CASES ------------

def test_large_linear_graph():
    # Graph: 0 -> 1 -> 2 -> ... -> 999
    N = 1000
    g = Graph(N)
    for i in range(N - 1):
        g.graph[i].append(i + 1)
    result, _ = g.topologicalSort()

def test_large_disconnected_graph():
    # Graph: 0->1, 2->3, 4->5, ..., 998->999
    N = 1000
    g = Graph(N)
    for i in range(0, N, 2):
        if i + 1 < N:
            g.graph[i].append(i + 1)
    result, _ = g.topologicalSort() # 208μs -> 124μs (66.7% faster)
    # Each even index before its odd index
    position = {v: idx for idx, v in enumerate(result)}
    for i in range(0, N, 2):
        if i + 1 < N:
            assert position[i] < position[i + 1]

def test_large_graph_with_multiple_components():
    # 3 chains: 0->1->2->...->332, 333->334->...->665, 666->...->998 (node 999 is isolated)
    N = 1000
    g = Graph(N)
    for start in [0, 333, 666]:
        for i in range(start, start + 332):
            g.graph[i].append(i + 1)
    result, _ = g.topologicalSort() # 218μs -> 136μs (59.6% faster)
    # Each chain is in order
    position = {v: idx for idx, v in enumerate(result)}
    for start in [0, 333, 666]:
        for i in range(start, start + 332):
            assert position[i] < position[i + 1]

def test_large_graph_with_no_edges():
    # 1000 nodes, no edges
    N = 1000
    g = Graph(N)
    result, _ = g.topologicalSort() # 198μs -> 119μs (66.4% faster)

def test_large_graph_with_branching():
    # 0 -> 1, 0 -> 2, ..., 0 -> 999
    N = 1000
    g = Graph(N)
    for i in range(1, N):
        g.graph[0].append(i)
    result, _ = g.topologicalSort() # 215μs -> 129μs (65.9% faster)
    # 0 must come before all others
    position = {v: idx for idx, v in enumerate(result)}
    for i in range(1, N):
        assert position[0] < position[i]

def test_large_graph_with_duplicate_edges():
    # 0 -> 1 (1000 times)
    N = 1000
    g = Graph(2)
    for _ in range(N):
        g.graph[0].append(1)
    result, _ = g.topologicalSort() # 20.8μs -> 15.0μs (38.2% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
import uuid
from collections import defaultdict

# imports
import pytest
from code_to_optimize.topological_sort import Graph

# unit tests

# ----------- BASIC TEST CASES -----------

def test_simple_dag():
    # Test a simple DAG: 0 -> 1 -> 2
    g = Graph(3)
    g.graph[0].append(1)
    g.graph[1].append(2)
    result, sort_id = g.topologicalSort() # 5.50μs -> 5.71μs (3.64% slower)
    # sort_id should be a valid UUID string (uuid.UUID raises ValueError otherwise)
    uuid.UUID(sort_id)

def test_two_independent_chains():
    # Test two independent chains: 0->1, 2->3
    g = Graph(4)
    g.graph[0].append(1)
    g.graph[2].append(3)
    result, _ = g.topologicalSort() # 5.62μs -> 5.71μs (1.45% slower)

def test_multiple_edges():
    # Test a graph where one node has multiple outgoing edges
    # 0->1, 0->2, 1->3, 2->3
    g = Graph(4)
    g.graph[0].extend([1,2])
    g.graph[1].append(3)
    g.graph[2].append(3)
    result, _ = g.topologicalSort() # 5.38μs -> 5.42μs (0.775% slower)

def test_disconnected_nodes():
    # Test a graph with disconnected nodes
    g = Graph(5)
    g.graph[0].append(1)
    # Nodes 2,3,4 are disconnected
    result, _ = g.topologicalSort() # 5.67μs -> 5.71μs (0.718% slower)

def test_already_sorted():
    # Test a graph that's already sorted: 0->1->2->3
    g = Graph(4)
    g.graph[0].append(1)
    g.graph[1].append(2)
    g.graph[2].append(3)
    result, _ = g.topologicalSort() # 5.21μs -> 5.00μs (4.18% faster)

# ----------- EDGE TEST CASES -----------

def test_single_node():
    # Test a graph with a single node
    g = Graph(1)
    result, _ = g.topologicalSort() # 4.96μs -> 5.04μs (1.65% slower)

def test_empty_graph():
    # Test an empty graph (no vertices)
    g = Graph(0)
    result, _ = g.topologicalSort() # 4.33μs -> 4.42μs (1.88% slower)
    
def test_no_edges():
    # Test a graph with multiple nodes and no edges
    g = Graph(3)
    result, _ = g.topologicalSort() # 5.29μs -> 5.29μs (0.019% faster)

def test_cycle_detection():
    # The function does not detect cycles, but let's check the result
    # 0->1, 1->2, 2->0 (cycle)
    g = Graph(3)
    g.graph[0].append(1)
    g.graph[1].append(2)
    g.graph[2].append(0)
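    # A timed run of test_graph_with_cycle (other generated test file) returned
    # normally on this same graph, so the traversal terminates on cycles;
    # expect every node exactly once rather than a RecursionError.
    result, _ = g.topologicalSort()
    assert sorted(result) == [0, 1, 2]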

def test_self_loop():
    # Node with a self-loop
    g = Graph(2)
    g.graph[0].append(0)
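    # A timed run of test_graph_with_self_loop (other generated test file)
    # returned normally, so a self-loop does not cause unbounded recursion;
    # expect both nodes exactly once rather than a RecursionError.
    result, _ = g.topologicalSort()
    assert sorted(result) == [0, 1]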

def test_non_integer_vertices():
    # Graph with string keys (should not work with this implementation)
    g = Graph(2)
    g.graph['a'].append('b')
    # The function expects integer vertices, so it should not visit these
    result, _ = g.topologicalSort() # 6.54μs -> 6.50μs (0.631% faster)

def test_duplicate_edges():
    # Multiple identical edges
    g = Graph(3)
    g.graph[0].append(1)
    g.graph[0].append(1)
    g.graph[1].append(2)
    result, _ = g.topologicalSort() # 5.96μs -> 5.50μs (8.33% faster)

def test_large_disconnected_graph():
    # Large graph with all nodes disconnected
    N = 1000
    g = Graph(N)
    result, _ = g.topologicalSort() # 201μs -> 123μs (63.4% faster)

# ----------- LARGE SCALE TEST CASES -----------

def test_large_chain():
    # Large chain: 0->1->2->...->999
    N = 1000
    g = Graph(N)
    for i in range(N-1):
        g.graph[i].append(i+1)
    result, _ = g.topologicalSort()

def test_large_tree():
    # Large binary tree (DAG): each node i has edges to 2*i+1 and 2*i+2
    N = 1000
    g = Graph(N)
    for i in range(N):
        left = 2*i + 1
        right = 2*i + 2
        if left < N:
            g.graph[i].append(left)
        if right < N:
            g.graph[i].append(right)
    result, _ = g.topologicalSort() # 222μs -> 136μs (63.7% faster)
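    # Parent before children
    position = {v: idx for idx, v in enumerate(result)}
    for i in range(N):
        left = 2*i + 1
        right = 2*i + 2
        if left < N:
            assert position[i] < position[left]
        if right < N:
            assert position[i] < position[right]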

def test_large_sparse_dag():
    # Large sparse DAG: 0->1, 2->3, ..., 998->999
    N = 1000
    g = Graph(N)
    for i in range(0, N-1, 2):
        g.graph[i].append(i+1)
    result, _ = g.topologicalSort() # 196μs -> 113μs (73.9% faster)
    # For each pair, i before i+1
    position = {v: idx for idx, v in enumerate(result)}
    for i in range(0, N-1, 2):
        assert position[i] < position[i+1]

def test_large_dense_dag():
    # Large dense DAG: for i in 0..N-1, for j in i+1..N-1, i->j
    N = 100
    g = Graph(N)
    for i in range(N):
        for j in range(i+1, N):
            g.graph[i].append(j)
    result, _ = g.topologicalSort() # 103μs -> 70.1μs (48.2% faster)

def test_performance_large_graph():
    # Test that topologicalSort runs efficiently on a large graph
    N = 1000
    g = Graph(N)
    # Sparse edges (deterministic, not random): i->i+1 for even i
    for i in range(0, N-1, 2):
        g.graph[i].append(i+1)
    import time
    start = time.time()
    result, _ = g.topologicalSort() # 194μs -> 112μs (73.2% faster)
    duration = time.time() - start
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
from code_to_optimize.topological_sort import Graph

def test_Graph_topologicalSort():
    Graph.topologicalSort(Graph(1))
🔎 Concolic Coverage Tests and Runtime
| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|---|---|---|---|
| codeflash_concolic_ofw5g32s/tmpvz1r45nf/test_concolic_coverage.py::test_Graph_topologicalSort | 5.17μs | 5.46μs | -5.35%⚠️ |

To edit these changes, run git checkout codeflash/optimize-Graph.topologicalSort-mhpg8x3o and push.

@codeflash-ai codeflash-ai Bot requested a review from aseembits93 November 7, 2025 22:51
@codeflash-ai codeflash-ai Bot added the "⚡️ codeflash" (Optimization PR opened by Codeflash AI) and "🎯 Quality: High" (Optimization Quality according to Codeflash) labels Nov 7, 2025
@aseembits93 aseembits93 closed this Nov 7, 2025
@codeflash-ai codeflash-ai Bot deleted the codeflash/optimize-Graph.topologicalSort-mhpg8x3o branch November 7, 2025 23:04