-
Notifications
You must be signed in to change notification settings - Fork 87
perf: pre-allocate hash maps #732
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -5,7 +5,7 @@ use std::collections::{HashMap, HashSet}; | |||||
| use std::hash::Hash; | ||||||
| use std::marker::PhantomData; | ||||||
|
|
||||||
| use rustc_hash::{FxHashMap, FxHashSet}; | ||||||
| use rustc_hash::{FxBuildHasher, FxHashMap, FxHashSet}; | ||||||
|
|
||||||
| /// A connected component implementation for various generic types. | ||||||
| /// | ||||||
|
|
@@ -60,7 +60,9 @@ where | |||||
| #[must_use] | ||||||
| pub fn separate_components(groups: &[It]) -> (HashMap<&N, usize>, Vec<usize>) { | ||||||
| let mut table = (0..groups.len()).collect::<Vec<_>>(); | ||||||
| let mut indices = HashMap::new(); | ||||||
| // Pre-size the hash map to reduce reallocations | ||||||
| let estimated_capacity = groups.iter().map(|g| g.into_iter().count()).sum(); | ||||||
| let mut indices = HashMap::with_capacity(estimated_capacity); | ||||||
| for (mut group_index, group) in groups.iter().enumerate() { | ||||||
| let mut is_empty = true; | ||||||
| for element in group { | ||||||
|
|
@@ -103,7 +105,10 @@ where | |||||
| #[must_use] | ||||||
| pub fn components(groups: &[It]) -> C2 { | ||||||
| let (_, gindices) = Self::separate_components(groups); | ||||||
| let mut gb: FxHashMap<usize, FxHashSet<N>> = FxHashMap::default(); | ||||||
| // Pre-size the hash map to reduce reallocations | ||||||
| let estimated_capacity = gindices.iter().filter(|&&n| n != usize::MAX).count(); | ||||||
| let mut gb: FxHashMap<usize, FxHashSet<N>> = | ||||||
| FxHashMap::with_capacity_and_hasher(estimated_capacity, FxBuildHasher); | ||||||
|
||||||
| FxHashMap::with_capacity_and_hasher(estimated_capacity, FxBuildHasher); | |
| FxHashMap::with_capacity(estimated_capacity); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The capacity estimation here iterates through all groups twice - once to count the elements for pre-allocation (line 64) and again in the main loop (line 66). The
into_iter().count()call on line 64 will consume or clone each iterator, potentially causing performance overhead that negates the benefit of pre-allocation.Consider storing the group sizes during the first pass or accepting that the pre-allocation might not be worth the extra iteration cost in all cases. Alternatively, if groups are cheap to iterate, document that this optimization assumes iteration is inexpensive.