Skip to content
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
11 changes: 8 additions & 3 deletions src/carpanzano_tearing.jl
Original file line number Diff line number Diff line change
Expand Up @@ -92,7 +92,8 @@ function find_single_solvable_eq!(
) where {F}
(; graph, solvable_graph) = structure
nbors = nbors_buffer
for ieq in active_eqs
# Sort active_eqs iteration for deterministic equation selection regardless of hash order
for ieq in sort(collect(active_eqs))
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would also be resolved if active_eqs is made an OrderedSet{Int} instead of Set{Int}, since then the equations are always iterated over in insertion order. This approach avoids two allocations from collect and sort.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for reviewing, Aayush!
I considered that direction as well, but in this code path active_eqs / active_vars are populated from SCC data that itself comes from Set/Dict-backed structures, so their insertion order is already hash-dependent. If we simply switch to OrderedSet{Int}, we'd just be preserving that hash-dependent insertion order and would still have the same non-determinism problem.

To make OrderedSet a drop-in replacement here, we'd also need to ensure those sets are constructed in a deterministic way (e.g. inserting elements in sorted order during SCC decomposition), which is a larger refactor touching the callers and the SCC building logic.

Given that, I went with the explicit sort(collect(...)) as the minimal, local fix that:

  • removes the hash-order dependence unconditionally, and
  • makes the tie-breaking behavior (lowest index first) explicit and obvious at the call site.

An OrderedSet-based approach would be a worthwhile follow-up optimization, but only after auditing and sorting all insertion sites in the SCC decomposition.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went back and traced the SCC construction more carefully, and I think your suggestion is actually safe to apply directly. I commited the suggested changes.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, we sort SCCs

empty!(nbors)
append!(nbors, Iterators.filter(in(active_vars), 𝑠neighbors(graph, ieq)))
length(nbors) == 1 || continue
Expand Down Expand Up @@ -161,7 +162,8 @@ function carpanzano_tear_scc!(
# is the corresponding number of incident edges.
empty!(enodes_with_min_incidence)
min_incidence_cnt = typemax(Int)
for ieq in active_eqs
# Sort active_eqs for deterministic tie-breaking regardless of hash order
for ieq in sort(collect(active_eqs))
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can also be removed if active_eqs = OrderedSet{Int}()

cnt = count(in(active_vars), 𝑠neighbors(graph, ieq))
cnt > min_incidence_cnt && continue
if cnt == min_incidence_cnt
Expand Down Expand Up @@ -206,11 +208,14 @@ function carpanzano_tear_scc!(
alg_var = 0
max_incidence_cnt = typemin(Int)
min_solvable_cnt = typemax(Int)
for ivar in active_vars
# Sort active_vars for deterministic selection; also fix missing max/min updates
for ivar in sort(collect(active_vars))
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same applies for active_vars

cnt = count(in(active_eqs), 𝑑neighbors(graph, ivar))
solvable_cnt = count(in(active_eqs), 𝑑neighbors(solvable_graph, ivar))
if iszero(alg_var) || cnt > max_incidence_cnt || cnt == max_incidence_cnt && solvable_cnt < min_solvable_cnt
alg_var = ivar
max_incidence_cnt = cnt
min_solvable_cnt = solvable_cnt
end
end

Expand Down
61 changes: 61 additions & 0 deletions test/carpanzano_tearing.jl
Original file line number Diff line number Diff line change
@@ -0,0 +1,61 @@
# Tests for carpanzano_tearing.jl - determinism and Heuristic 2 correctness

using BipartiteGraphs
import Graphs: add_edge!

struct TestSystemStructure <: StateSelection.SystemStructure
graph::BipartiteGraph{Int,Nothing}
solvable_graph::BipartiteGraph{Int,Nothing}
var_to_diff::DiffGraph
eq_to_diff::DiffGraph
end

@testset "carpanzano_tearing" begin
@testset "Heuristic 2 picks variable with maximum incidence" begin
# Graph: 3 equations (source vertices), 3 variables (destination vertices).
# Every edge is also present in the solvable_graph (all variables solvable in
# all equations they appear in), so Heuristic 1 never finds a non-solvable
# variable and falls through to Heuristic 2.
#
# Incidence of each variable (number of equations it appears in):
# v1 (idx 1): eq1, eq2 => 2
# v2 (idx 2): eq1, eq2, eq3 => 3 # maximum -> must be torn (algebraic)
# v3 (idx 3): eq1, eq3 => 2
#
# Before the fix, max_incidence_cnt was never updated inside the Heuristic 2
# loop, so every variable satisfied cnt > typemin(Int) and the last variable
# in hash-iteration order was chosen -- non-deterministic and wrong.
graph = BipartiteGraph(3, 3)
for (eq, var) in [(1,1),(1,2),(1,3),(2,1),(2,2),(3,2),(3,3)]
add_edge!(graph, eq, var)
end
solvable_graph = BipartiteGraph(3, 3)
for (eq, var) in [(1,1),(1,2),(1,3),(2,1),(2,2),(3,2),(3,3)]
add_edge!(solvable_graph, eq, var)
end

structure = TestSystemStructure(
graph, solvable_graph,
DiffGraph(3), DiffGraph(3)
)
alg = StateSelection.CarpanzanoTearing()
# complete() makes invview(var_eq_matching) available, which the algorithm uses
# internally to find the variable matched to a solved equation.
var_eq_matching = complete(
Matching{Union{Unassigned, SelectedState}}(3), 3
)
active_vars = Set{Int}([1, 2, 3])
active_eqs = Set{Int}([1, 2, 3])

StateSelection.carpanzano_tear_scc!(
alg, structure, var_eq_matching, active_vars, active_eqs
)

# v2 has the highest incidence (3 equations) -- Heuristic 2 must choose it as
# the algebraic (torn) variable, leaving it unmatched.
@test var_eq_matching[2] === unassigned
# v1 and v3 must each be matched to some equation.
@test var_eq_matching[1] isa Int
@test var_eq_matching[3] isa Int
end
end
1 change: 1 addition & 0 deletions test/runtests.jl
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@ using SparseArrays
using Test

include("bareiss.jl")
include("carpanzano_tearing.jl")

@testset "`get_new_mm`" begin
mm = SSel.CLIL.SparseMatrixCLIL(
Expand Down
Loading