Commit 9171444

some more docu + better choose_k implementation
1 parent 1dde97c commit 9171444

5 files changed

Lines changed: 555 additions & 111 deletions

File tree

crates/consistent-hashing/README.md

Lines changed: 23 additions & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -17,7 +17,7 @@ where `N` is the number of nodes and `R` is the number of replicas.
1717

1818
| Algorithm | Lookup per key<br>(no replication) | Node add/remove | Memory | Lookup with replication |
1919
|-------------------------|--------------------------------------------------------------------------|-----------------|----------------|-----------------------------------------------|
20-
| Hash ring (with vnodes) | O(log N): binary search over N points; O(1): with specialized structures | O(log N) | O(N) | O(log N + R): Take next R distinct successors |
20+
| Hash ring (with vnodes) | O(log(V·N)): binary search; V = 100–200 virtual nodes per physical node | O(log(V·N)) | O(N) | O(log(V·N) + R): walk to next R distinct nodes |
2121
| Rendezvous | O(N): max score | O(1) | O(N) node list | O(N log R): pick top R scores |
2222
| Jump consistent hash | O(log(N)) expected | 0 | O(1) | O(R log N) |
2323
| AnchorHash | O(1) expected | O(1) | O(N) | Not native |
@@ -37,6 +37,28 @@ Why replication matters
3737
- Distributes read/write load across multiple owners, reducing hotspots.
3838
- Enables fast recovery and higher tail-latency resilience.
3939

40+
## Applications beyond replication
41+
42+
The `ConsistentChooseK` iterator produces a per-key ranking of all `n` nodes in priority order — consistently and with zero memory overhead. This ranking is a strict superset of simple replication and enables drop-in replacements for several well-known algorithms that traditionally require maintaining expensive data structures such as hash rings.
43+
44+
### Bounded-load consistent hashing
45+
46+
[Consistent Hashing with Bounded Loads](https://research.google/pubs/pub46580/) (Mirrokni et al., 2018) caps the maximum load any single node may receive. When a key's preferred node is full, the key overflows to the next candidate. Classic implementations walk a hash ring to find successors, which requires O(V·N) memory for the ring, where V is the number of virtual nodes per physical node (typically 100–200 for acceptable load variance). Lookups cost O(log(V·N)) via binary search.
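
For comparison, the classic ring-based approach described above might look roughly like the following. This is a minimal sketch under stated assumptions, not the paper's reference implementation: `hash64`, `build_ring`, and `assign` are hypothetical helper names, and the ring uses the standard library's `DefaultHasher`.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical helper: hash any value to a 64-bit ring position.
fn hash64<T: Hash>(value: &T) -> u64 {
    let mut h = DefaultHasher::new();
    value.hash(&mut h);
    h.finish()
}

/// Sorted ring of (point, node) pairs with `v` virtual nodes per physical
/// node. This is the O(V·N) memory cost mentioned in the text.
fn build_ring(n: usize, v: usize) -> Vec<(u64, usize)> {
    let mut ring: Vec<(u64, usize)> = (0..n)
        .flat_map(|node| (0..v).map(move |replica| (hash64(&(node, replica)), node)))
        .collect();
    ring.sort_unstable();
    ring
}

/// Binary-search the key's position, then walk clockwise until a node with
/// spare capacity is found. Returns `None` if every node is full.
fn assign(ring: &[(u64, usize)], key: u64, load: &mut [usize], cap: usize) -> Option<usize> {
    let target = hash64(&key);
    let start = ring.partition_point(|&(point, _)| point < target);
    for i in 0..ring.len() {
        let (_, node) = ring[(start + i) % ring.len()];
        if load[node] < cap {
            load[node] += 1;
            return Some(node);
        }
    }
    None
}

fn main() {
    let (n, v) = (8usize, 100usize);
    let num_keys = 64usize;
    let cap = (num_keys + n - 1) / n; // ceiling division
    let ring = build_ring(n, v);
    let mut load = vec![0usize; n];
    for key in 0..num_keys as u64 {
        let _ = assign(&ring, key, &mut load, cap);
    }
    println!("loads with cap {cap}: {load:?}");
}
```

The ring has to be rebuilt or patched whenever a node joins or leaves, which is the bookkeeping a ring-free ranking avoids.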
47+
48+
With `ConsistentChooseK`, the ranking iterator directly yields each key's preference list on the fly — no ring required. Assignment becomes: iterate tokens round by round, and for each token advance its ranking iterator until a node with remaining capacity is found. This achieves the same bounded-load guarantees with O(k) for k keys and O(k) time to extract the k-th key.
49+
50+
See [`examples/bounded_load.rs`](examples/bounded_load.rs) for a working implementation.
51+
52+
### Power of two choices
53+
54+
The [power of two choices](https://www.eecs.harvard.edu/~michaelm/postscripts/mythesis.pdf) paradigm (Mitzenmacher, 2001; Azar et al., 1999) assigns each key to the least-loaded of two (or, more generally, d ≥ 2) randomly chosen nodes. When n keys are placed on n nodes, this reduces the maximum load from O(log n / log log n) to O(log log n / log d) with high probability.
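
The effect is easy to observe empirically. The sketch below simulates the textbook scheme with independent random draws; it does not use `ConsistentChooseK`. The `SplitMix64` generator and the `max_load` function are illustrative helpers introduced here only to keep the example dependency-free.

```rust
/// SplitMix64: a tiny deterministic PRNG, used only so the sketch needs no
/// external crates.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Roughly uniform value in `0..n` (modulo bias is negligible here).
    fn below(&mut self, n: usize) -> usize {
        (self.next() % n as u64) as usize
    }
}

/// Place `n` keys on `n` nodes, each on the least loaded of `d` independently
/// drawn nodes (draws may repeat), and return the maximum load.
fn max_load(n: usize, d: usize, seed: u64) -> usize {
    let mut rng = SplitMix64(seed);
    let mut load = vec![0usize; n];
    for _ in 0..n {
        let mut best = rng.below(n);
        for _ in 1..d {
            let other = rng.below(n);
            if load[other] < load[best] {
                best = other;
            }
        }
        load[best] += 1;
    }
    load.into_iter().max().unwrap_or(0)
}

fn main() {
    let n = 10_000;
    println!("d = 1: max load {}", max_load(n, 1, 1));
    println!("d = 2: max load {}", max_load(n, 2, 1));
}
```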
55+
56+
Traditionally this requires drawing d independent random nodes per key. However, the original algorithm ignores the corner case where multiple independent hash functions collide on the same node, effectively reducing the number of distinct choices below d. With `ConsistentChooseK`, the first d elements from the ranking iterator are guaranteed to be distinct nodes. The choices are also consistent across time — the same key always considers the same d candidates — so reassignment only happens when a node actually joins or leaves.
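
A minimal sketch of this idea follows. It assumes only that some per-key ranking of distinct nodes is available; the `ranking` function here is a hypothetical rendezvous-style stand-in, not the crate's `ConsistentChooseK` iterator, and `assign_d_choices` is likewise an illustrative name.

```rust
use std::cmp::Reverse;
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in for a per-key ranking of all `n` nodes (NOT the crate's
/// algorithm): score every node with a hash and sort by descending score.
fn ranking(key: u64, n: usize) -> Vec<usize> {
    let mut nodes: Vec<usize> = (0..n).collect();
    nodes.sort_by_key(|&node| {
        let mut h = DefaultHasher::new();
        (key, node).hash(&mut h);
        Reverse(h.finish())
    });
    nodes
}

/// Assign `key` to the least-loaded of its first `d` ranked nodes. The
/// candidates are distinct by construction and depend only on the key.
/// Ties go to the higher-ranked node.
fn assign_d_choices(key: u64, d: usize, load: &mut [usize]) -> usize {
    let chosen = ranking(key, load.len())
        .into_iter()
        .take(d)
        .min_by_key(|&node| load[node])
        .expect("d and the node count must both be at least 1");
    load[chosen] += 1;
    chosen
}

fn main() {
    let n = 16;
    let mut load = vec![0usize; n];
    for key in 0..160u64 {
        assign_d_choices(key, 2, &mut load);
    }
    println!("loads: {load:?}");
}
```

Because the candidate set is a prefix of the ranking, raising d only appends candidates; it never replaces the ones a key already had.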
57+
58+
### Priority-based failover
59+
60+
In active-passive or tiered architectures, each key needs a deterministic failover order. The ranking iterator provides exactly this: the first node is the primary, the second is the hot standby, and so on. When a node fails, the next node in the ranking takes over — consistently for all keys that had the failed node at the same rank position, and without any coordination or ring rebalancing.
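
The failover rule can be sketched as "first live node in the ranking". As before, `ranking` is a hypothetical rendezvous-style stand-in for the crate's iterator, and `owner` is an illustrative helper, so treat this as a sketch rather than the crate's API.

```rust
use std::cmp::Reverse;
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Stand-in for a per-key ranking of all `n` nodes (NOT the crate's
/// algorithm): score every node with a hash and sort by descending score.
fn ranking(key: u64, n: usize) -> Vec<usize> {
    let mut nodes: Vec<usize> = (0..n).collect();
    nodes.sort_by_key(|&node| {
        let mut h = DefaultHasher::new();
        (key, node).hash(&mut h);
        Reverse(h.finish())
    });
    nodes
}

/// Current owner of `key`: the highest-ranked node that is not marked down.
/// Returns `None` only if every node is down.
fn owner(key: u64, n: usize, down: &HashSet<usize>) -> Option<usize> {
    ranking(key, n).into_iter().find(|node| !down.contains(node))
}

fn main() {
    let n = 5;
    let mut down = HashSet::new();
    let primary = owner(7, n, &down).expect("at least one node is up");
    down.insert(primary); // simulate a failure of the primary
    let standby = owner(7, n, &down).expect("at least one node is up");
    println!("key 7: primary {primary}, standby after failure {standby}");
}
```

Keys whose primary is still up resolve to the same node as before, so a failure only moves the keys that actually depended on the failed node.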
61+
4062
## ConsistentChooseK algorithm
4163

4264
The following functions summarize the core algorithmic innovation as a minimal Rust excerpt.

crates/consistent-hashing/benchmarks/performance.rs

Lines changed: 31 additions & 4 deletions
@@ -1,5 +1,5 @@
11
use std::{
2-
hash::{DefaultHasher, Hash, Hasher},
2+
hash::{DefaultHasher, Hash},
33
hint::black_box,
44
time::Duration,
55
};
@@ -36,12 +36,11 @@ fn throughput_benchmark(c: &mut Criterion) {
3636
b.iter_batched(
3737
|| &keys,
3838
|keys| {
39-
let mut res = Vec::with_capacity(k);
4039
for key in keys {
4140
let mut h = DefaultHasher::default();
4241
key.hash(&mut h);
4342
black_box(
44-
ConsistentChooseKHasher::new(h, k).prev_with_vec(*n + k, &mut res),
43+
ConsistentChooseKHasher::new_with_k(h, *n + k, k),
4544
);
4645
}
4746
},
@@ -53,13 +52,41 @@ fn throughput_benchmark(c: &mut Criterion) {
5352
group.finish();
5453
}
5554

55+
fn append_vs_new_with_k(c: &mut Criterion) {
56+
let mut group = c.benchmark_group("append_vs_new_with_k");
57+
group.plot_config(PlotConfiguration::default().summary_scale(AxisScale::Logarithmic));
58+
for n in [10usize, 100, 1000, 10000] {
59+
for k in [2, 3, 10, 100] {
60+
group.bench_function(
61+
BenchmarkId::new(format!("new_with_k/k_{k}"), n),
62+
|b| {
63+
b.iter(|| {
64+
let h = DefaultHasher::default();
65+
black_box(ConsistentChooseKHasher::new_with_k(h, n + k, k));
66+
})
67+
},
68+
);
69+
group.bench_function(BenchmarkId::new(format!("append/k_{k}"), n), |b| {
70+
b.iter(|| {
71+
let h = DefaultHasher::default();
72+
let mut iter = ConsistentChooseKHasher::new(h, n + k);
73+
black_box(for _ in 0..k {
74+
iter.grow_k();
75+
})
76+
})
77+
});
78+
}
79+
}
80+
group.finish();
81+
}
82+
5683
criterion_group!(
5784
name = benches;
5885
config = Criterion::default()
5986
.warm_up_time(Duration::from_millis(500))
6087
.measurement_time(Duration::from_millis(4000))
6188
.nresamples(1000);
6289

63-
targets = throughput_benchmark,
90+
targets = throughput_benchmark, append_vs_new_with_k,
6491
);
6592
criterion_main!(benches);

crates/consistent-hashing/examples/bounded_load.rs

Lines changed: 140 additions & 0 deletions
@@ -0,0 +1,140 @@
1+
//! Bounded-load consistent hashing example.
2+
//!
3+
//! Pure consistent hashing selects each node with equal probability, but for
4+
//! small workloads (e.g. 64 tokens across 24 machines) random variance causes
5+
//! highly skewed assignments. This example layers a capacity cap on top of
6+
//! ConsistentChooseK to enforce near-perfect balance.
7+
//!
8+
//! Assignment uses round-robin over replicas: first assign every token's
9+
//! most-preferred machine, then every token's second-preferred, etc. This
10+
//! ensures all tokens compete fairly for each replica round.
11+
//!
12+
//! Run with: cargo run --example bounded_load
13+
14+
use std::hash::{DefaultHasher, Hash};
15+
16+
use consistent_hashing::ConsistentChooseKHasher;
17+
18+
/// Round-robin bounded-load assignment.
19+
///
20+
/// For each replica round r = 0..k, iterate over all tokens and assign each
21+
/// to its next most-preferred node that still has capacity. This gives every
22+
/// token equal priority within each round.
23+
fn bounded_load_assign(
24+
rankings: &[Vec<usize>],
25+
k: usize,
26+
n: usize,
27+
max_load: usize,
28+
) -> (Vec<Vec<usize>>, Vec<usize>) {
29+
let mut load = vec![0usize; n];
30+
let num_tokens = rankings.len();
31+
let mut assignments = vec![Vec::with_capacity(k); num_tokens];
32+
let mut cursors = vec![0usize; num_tokens];
33+
34+
for _round in 0..k {
35+
for (token, ranking) in rankings.iter().enumerate() {
36+
while cursors[token] < ranking.len() {
37+
let node = ranking[cursors[token]];
38+
cursors[token] += 1;
39+
if load[node] < max_load {
40+
load[node] += 1;
41+
assignments[token].push(node);
42+
break;
43+
}
44+
}
45+
}
46+
}
47+
(assignments, load)
48+
}
49+
50+
fn main() {
51+
let num_tokens: usize = 64;
52+
let k: usize = 2; // replicas per token
53+
let n: usize = 24; // machines
54+
let total = num_tokens * k;
55+
let cap = total.div_ceil(n); // ceil(128/24) = 6
56+
57+
println!("Parameters: {num_tokens} tokens, k={k} replicas, {n} machines");
58+
println!("Total assignments: {total}, capacity cap per machine: {cap}");
59+
println!(
60+
"Perfect balance: {}×{} + {}×{}\n",
61+
n - total % n,
62+
total / n,
63+
total % n,
64+
total / n + 1
65+
);
66+
67+
// ── Unbounded ────────────────────────────────────────────────────────
68+
let unbounded: Vec<Vec<usize>> = (0..num_tokens as u64)
69+
.map(|key| {
70+
let mut h = DefaultHasher::default();
71+
key.hash(&mut h);
72+
ConsistentChooseKHasher::new(h, n).take(k).collect()
73+
})
74+
.collect();
75+
let mut unbounded_load = vec![0usize; n];
76+
for a in &unbounded {
77+
for &node in a {
78+
unbounded_load[node] += 1;
79+
}
80+
}
81+
82+
// ── Bounded (round-robin) ────────────────────────────────────────────
83+
let rankings: Vec<Vec<usize>> = (0..num_tokens as u64)
84+
.map(|key| {
85+
let mut h = DefaultHasher::default();
86+
key.hash(&mut h);
87+
ConsistentChooseKHasher::new(h, n).collect()
88+
})
89+
.collect();
90+
let (bounded, bounded_load) = bounded_load_assign(&rankings, k, n, cap);
91+
92+
// ── Display ──────────────────────────────────────────────────────────
93+
println!("{:<12} {:>10} {:>10}", "Machine", "Unbounded", "Bounded");
94+
println!("{:-<12} {:->10} {:->10}", "", "", "");
95+
for i in 0..n {
96+
println!(
97+
"{:<12} {:>10} {:>10}",
98+
i, unbounded_load[i], bounded_load[i]
99+
);
100+
}
101+
102+
let ub_min = *unbounded_load.iter().min().unwrap();
103+
let ub_max = *unbounded_load.iter().max().unwrap();
104+
let b_min = *bounded_load.iter().min().unwrap();
105+
let b_max = *bounded_load.iter().max().unwrap();
106+
println!("{:-<12} {:->10} {:->10}", "", "", "");
107+
println!(
108+
"{:<12} {:>10} {:>10}",
109+
"spread",
110+
ub_max - ub_min,
111+
b_max - b_min
112+
);
113+
114+
// ── Consistency check: what happens when we add one machine? ─────────
115+
let n2 = n + 1;
116+
let cap2 = (num_tokens * k).div_ceil(n2);
117+
let rankings2: Vec<Vec<usize>> = (0..num_tokens as u64)
118+
.map(|key| {
119+
let mut h = DefaultHasher::default();
120+
key.hash(&mut h);
121+
ConsistentChooseKHasher::new(h, n2).collect()
122+
})
123+
.collect();
124+
let (bounded2, _) = bounded_load_assign(&rankings2, k, n2, cap2);
125+
126+
let mut changes = 0;
127+
for (before, after) in bounded.iter().zip(bounded2.iter()) {
128+
for node in before {
129+
if !after.contains(node) {
130+
changes += 1;
131+
}
132+
}
133+
}
134+
println!("\nConsistency: adding machine {n} → {n2}");
135+
println!(
136+
" {changes}/{total} assignments changed ({:.1}%), ideal ≈ {:.1}%",
137+
changes as f64 / total as f64 * 100.0,
138+
1.0 / n2 as f64 * 100.0 // per assignment; k / n2 would be the share of tokens touched
139+
);
140+
}
