title = "Multiplicative Hashing Functions -- Notes on Primes, Golden Ratio, and Evil"
draft = false

# Optional featured image (relative to `static/img/` folder).
[header]
caption = ""

## Introduction
Mapping partitions (or keys) to slices (or buckets) in a distributed or sharded system has a large impact on performance.
Different hash-based solutions for this problem exist; each has drawbacks.
The goal is to choose a hash function that is simple to implement and gives acceptable performance with as few collisions as possible.

The problem is defined as follows. Given a logical partition number $P$, compute the corresponding slice number $S = s(P)$.
Here $0 ≤ P < 2^{32}$, $0 ≤ S < M$, and $M$ denotes the table size (e.g. $M = 2^{14}$).
In our "binary" world, assumptions that input data (partition numbers) have uniform distribution are not always correct.
Therefore, the hash function $s: P \to S$ must be designed very carefully.
In addition to providing uniform distribution of hash values (slice numbers), it has to add some randomness.
Luckily, this field is well studied: the well-known textbooks of Cormen and Knuth give a good introduction to it.
The latter has a more detailed explanation; therefore, without hesitation, we make use of Knuth's book (Section 6.4),
usually explaining in two paragraphs what the book says in just one sentence.

## Basic Ideas

A first approach is the division-remainder method with $M$ a power of 2:

$s(P) = P \bmod M$,

where $M = 2^{14}$ is the table size (number of slices). A variant is to take $M$ prime instead:

$s(P) = P \bmod M$,

where $M = 16411$ is a prime $> 2^{14}$. Both forms are called the "Division Remainder Method". These notes focus on another method, the "Multiplicative Method":

$$f(P) = A \cdot P \bmod W$$

```c
uint32_t slice(uint32_t P) {
    /* A = 2654435769 = floor(2^32 / phi), the golden-ratio multiplier;
       the shift keeps the top 14 bits of A * P mod 2^32. */
    return (2654435769u * P) >> 18;
}
```

It works well for arbitrary input data and allows using the same number of slices $M = 2^{14}$. As the reader can see, the multiplicative version uses only a multiplication and a logical shift; on some architectures it can be faster than computing a modulo.

## Details for Math Fans

Actually, this condition on $M$ is too strong.
To satisfy this property, it is sufficient that $M ≠ 2^i$ holds.
For example, $M = 15 · 12 · 97 = 17460$ (15 modules, 12 disks, 97 is a prime) is also "good".

Since $M$ is prime, it seems that the following pattern is not common: $P = S + M · i$. But actually, primes that are close to a power of 2 are also not good. Knuth recommends choosing $M$ such that the condition $r^j ≡ ± a \pmod M$ does not hold for any small integers $a$ and $j$, where $r$ denotes the radix of the computation. From Knuth's explanation, the right value of $r$ for our case is not entirely clear: is it $r=2$, $r=16$, or $r=256$? It seems the answer depends heavily on the type of input data.

By Knuth, if $r=2$, the chosen $M$ is not so good, since $M = 16411 = 2^{14} + 27$, and hence $2^{14} ≡ -27 \pmod M$. For $r=16$, we get that $16^4 ≡ -108 \pmod M$; for $r=256$, $256^2 ≡ -108 \pmod M$.

Knuth explains that such $M$ may produce a hash code that is a simple composition of the key digits (in the base-$r$ system). Instead of trying to unpack this explanation, we give some intuition. Working with numbers, a programmer usually chooses powers of 2 for the sizes of structures and buffers (e.g., $2^{10}$ bytes). He then defines a format for such data and introduces headers (e.g., a header of 20 bytes). Hence, the size of the data without the header becomes very close to a power of 2 (in our example, $2^{10} - 20 = 1004$). On the other hand, embedding this structure into an outer packet (assume the outer header is 30 bytes) makes the total size also close to a power of 2 ($2^{10} + 30 = 1054$). As a result, most numbers in our "binary" world are either powers of 2 or close to them. Therefore, such a choice of $M$ increases collisions. In other words, not only powers of 2 are *evil*, but primes close to them are *evil* too.

As an example of a "good" prime, let's consider $M = 24571$. It is a bit smaller than the midpoint of $2^{14}$ and $2^{15}$.

We show the implementation of $p()$ in C code for the multiplicative hashing only.