Implement sorting by hash #112
@@ -42,45 +42,24 @@ keys, which means:

## Benchmark results

All benchmarks insert 1000 random trigram hashes (scrambled with
`folded_multiply`) into maps with various configurations. Measured on Apple
M-series (aarch64).

### Insert 1000 trigrams — pre-sized, no growth

| Rank | Map | Time (µs) | vs best |
|------|-----|-----------|---------|
| 🥇 | FoldHashMap | 2.44 | — |
| 🥈 | FxHashMap | 2.61 | +7% |
| 🥉 | hashbrown::HashMap | 2.67 | +9% |
| 4 | **HashSortedMap** | **2.71** | +11% |
| 5 | hashbrown+Identity | 2.74 | +12% |
| 6 | std::HashMap+FNV | 3.27 | +34% |
| 7 | AHashMap | 3.22 | +32% |
| 8 | std::HashMap | 8.49 | +248% |

### Re-insert same keys (all overwrites)

| Map | Time (µs) |
|-----|-----------|
| **HashSortedMap** | **2.36** ✅ |
| hashbrown+Identity | 2.58 |

### Growth from small (`with_capacity(128)`, 3 resize rounds)

| Map | Time (µs) | Growth penalty |
|-----|-----------|----------------|
| **HashSortedMap** | **4.85** | +2.14 |
| hashbrown+Identity | 9.77 | +7.03 |

### Key takeaways

- **HashSortedMap matches the fastest hashbrown configurations** on pre-sized
  first-time inserts and is **the fastest for overwrites**.
- **Growth is ~2× faster** than hashbrown thanks to the optimized
  `insert_for_grow` path that skips duplicate checking and uses raw copies.
- The remaining gap to FoldHashMap (~11%) comes from foldhash's extremely
  efficient hash function that pipelines well with hashbrown's SIMD scan.
Latest local Criterion snapshot from this repository's
`target/criterion` outputs (lower is better):

| Scenario                                     | HashSortedMap | Comparison                             | Result      |
| :------------------------------------------- | ------------: | :------------------------------------- | :---------- |
| Insert 1000 trigrams (pre-sized)             |       7.34 µs | hashbrown::HashMap: 12.88 µs           | ~43% faster |

Contributor
This doesn't agree with the other file, which puts us 32% slower than hashbrown on the same microbenchmark. Different architecture?

Contributor
It would be best to have all the benchmarks in one place and up to date, preferably all on Intel if that's what we think most cloud servers have.

Collaborator, Author
Reran on a Codespaces Intel CPU. We might want to retest on a dedicated production machine at some point.

| Grow from capacity 128                       |      20.54 µs | hashbrown+Identity: 23.17 µs           | ~11% faster |
| Count 4000 trigrams (`entry().or_default()`) |      12.70 µs | hashbrown+Identity `entry()`: 13.53 µs | ~6% faster  |
| Iterate 1000 trigrams (`iter()`)             |       3.93 µs | hashbrown+Identity `iter()`: 2.87 µs   | ~37% slower |
| Sort 100000 trigrams by hash                 |       1.83 ms | `Vec::sort_unstable`: 2.09 ms          | ~12% faster |
| Merge 100 sorted maps + final sort           |     161.93 ms | hashbrown merge + vec sort: 234.70 ms  | ~31% faster |

Key takeaways:

- `HashSortedMap` is strongest on insert-heavy and merge/sort-heavy paths.
- Iteration throughput is currently behind `hashbrown+Identity`.
- In workloads that need deterministic hash-order serialization, the merge and
  sort advantages can outweigh the iteration gap.
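
The Result column in the new table appears to express each `HashSortedMap` time relative to the comparison time, so "faster" and "slower" are both percentages of the baseline. A minimal sketch of that arithmetic, using four rows copied from the table (the function name `relative_diff_percent` is made up for illustration and is not part of the repository):

```rust
/// Difference of `ours` against `baseline`, as a percentage of the baseline.
/// Negative means `ours` took less time (faster); positive means slower.
fn relative_diff_percent(ours: f64, baseline: f64) -> f64 {
    (ours - baseline) / baseline * 100.0
}

fn main() {
    // (scenario, HashSortedMap time, comparison time); units match within a row.
    let rows = [
        ("Insert 1000 trigrams (pre-sized)", 7.34, 12.88),
        ("Grow from capacity 128", 20.54, 23.17),
        ("Iterate 1000 trigrams", 3.93, 2.87),
        ("Merge 100 sorted maps + final sort", 161.93, 234.70),
    ];
    for (name, ours, baseline) in rows {
        let diff = relative_diff_percent(ours, baseline);
        let label = if diff < 0.0 { "faster" } else { "slower" };
        println!("{}: ~{:.0}% {}", name, diff.abs(), label);
    }
}
```

Under this reading, "~37% slower" for iteration means 3.93 µs is about 37% more than 2.87 µs, while "~43% faster" for inserts means 7.34 µs is about 43% less than 12.88 µs.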

## Running

@@ -21,3 +21,4 @@ ahash = "0.8"
hashbrown = "0.15"
foldhash = "0.1"
fnv = "1"
itertools = "0.14"
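
Several benchmark rows compare against `hashbrown+Identity`. The diff does not show how that configuration is defined, so the following is only a guess at the idea: when keys are already well-mixed hashes, the map's hasher can pass them through unchanged. The sketch uses `std::collections::HashMap` so that it needs no external crates; `IdentityHasher`, `IdentityMap`, `count_hashes`, and the `u32` key type are all assumptions made for illustration.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

/// Hypothetical pass-through hasher: the key is already a hash, so reuse it.
#[derive(Default)]
struct IdentityHasher(u64);

impl Hasher for IdentityHasher {
    fn finish(&self) -> u64 {
        self.0
    }

    // Generic fallback for keys that are not plain integers.
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.0 = (self.0 << 8) | u64::from(b);
        }
    }

    // `u32` keys hash themselves through `write_u32`, so this is the hot path.
    fn write_u32(&mut self, n: u32) {
        self.0 = u64::from(n);
    }
}

type IdentityMap<V> = HashMap<u32, V, BuildHasherDefault<IdentityHasher>>;

/// Counts occurrences of each pre-hashed key, in the style of the
/// `entry().or_default()` benchmark row.
fn count_hashes(hashes: &[u32]) -> IdentityMap<u32> {
    let mut counts: IdentityMap<u32> = HashMap::default();
    for &h in hashes {
        *counts.entry(h).or_default() += 1;
    }
    counts
}

fn main() {
    let counts = count_hashes(&[7, 42, 7, 7]);
    println!("7 seen {} times, 42 seen {} times", counts[&7], counts[&42]);
}
```

A pass-through hasher is only reasonable when the keys are already scrambled; poorly distributed keys would cluster in the table.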
This makes so much sense. And removing it from lookup paths explains why we can sort the map and it's still a map. Pretty great outcome!
How valuable is it for resizing? Even if it usually hits, surely it's equally fast to find the first empty slot in the group with SIMD, and that will hit even more often.
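
To make the "find the first empty slot in the group" idea concrete, here is a portable sketch that scans eight control bytes at once with plain integer operations (SWAR) instead of real SIMD intrinsics. It assumes a control-byte layout in which empty slots have the high bit set and occupied slots store a 7-bit tag; the PR's actual layout is not shown here, and `EMPTY`, `GROUP_SIZE`, and `first_empty` are hypothetical names.

```rust
/// Assumed marker for an empty slot. Occupied slots keep the high bit clear.
const EMPTY: u8 = 0x80;
const GROUP_SIZE: usize = 8;

/// Returns the index of the first empty slot in the group, if any.
fn first_empty(group: &[u8; GROUP_SIZE]) -> Option<usize> {
    // Load all eight control bytes as one little-endian word, so byte 0
    // occupies the lowest eight bits.
    let word = u64::from_le_bytes(*group);
    // Keep only the high bit of each byte; it is set exactly for empty slots.
    let empties = word & 0x8080_8080_8080_8080;
    if empties == 0 {
        None
    } else {
        // The lowest set bit is at position 8 * i + 7 for the first empty byte i.
        Some(empties.trailing_zeros() as usize / 8)
    }
}

fn main() {
    let group = [0x01, 0x02, EMPTY, 0x03, EMPTY, 0x04, 0x05, 0x06];
    println!("first empty slot: {:?}", first_empty(&group));
}
```

A SIMD version would follow the same shape, with a vector compare plus a move-mask in place of the `u64` mask.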

Changed the grow code as well. As a result, all occupied slots are now at the beginning, so sorting no longer needs special treatment.
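
One way to picture why a packed prefix simplifies sorting: if the first `len` slots of a group are always the occupied ones, sorting by hash reduces to sorting a contiguous slice. The sketch below illustrates that invariant only; the `Slot` and `Group` types, their fields, and the group size are hypothetical and do not come from the PR.

```rust
const GROUP_SIZE: usize = 8;

#[derive(Clone, Copy, Debug, PartialEq)]
struct Slot {
    hash: u32,
    value: u32,
}

/// Hypothetical group: `slots[..len]` are occupied, `slots[len..]` are unused.
struct Group {
    slots: [Slot; GROUP_SIZE],
    len: usize,
}

impl Group {
    fn new() -> Self {
        Group { slots: [Slot { hash: 0, value: 0 }; GROUP_SIZE], len: 0 }
    }

    /// Appends after the occupied prefix; returns false if the group is full.
    fn push(&mut self, hash: u32, value: u32) -> bool {
        if self.len == GROUP_SIZE {
            return false;
        }
        self.slots[self.len] = Slot { hash, value };
        self.len += 1;
        true
    }

    /// Sorts only the occupied prefix, so no empty-slot checks are needed.
    fn sort_by_hash(&mut self) {
        self.slots[..self.len].sort_unstable_by_key(|slot| slot.hash);
    }

    fn occupied(&self) -> &[Slot] {
        &self.slots[..self.len]
    }
}

fn main() {
    let mut group = Group::new();
    for (hash, value) in [(30, 3), (10, 1), (20, 2)] {
        group.push(hash, value);
    }
    group.sort_by_hash();
    for slot in group.occupied() {
        println!("hash {} -> value {}", slot.hash, slot.value);
    }
}
```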