Commit 9b9e561

committed
new port
1 parent 33a26e3 commit 9b9e561

41 files changed

Lines changed: 7344 additions & 250 deletions

404.html

Lines changed: 2 additions & 0 deletions
@@ -570,6 +570,8 @@ <h2>Latest</h2>

 <li><a href="/post/">Posts</a></li>

+<li><a href="/post/perfect-distribution/">Perfect Distribution Based on GCD</a></li>
+
 <li><a href="/post/binomial-modulo-prime/">Binomial Coefficients Modulo a Prime: Fermat&#39;s Theorem and the Non-Adjacent Selection Problem</a></li>

 <li><a href="/post/efficient-implementation-non-adjacent-selection/">Efficient Implementation of the Non-Adjacent Selection Formula</a></li>

content/post/perfect-distribution.md

Lines changed: 60 additions & 54 deletions
@@ -4,7 +4,7 @@ math = true
 highlight = true
 tags = ["math","python", "recursion", "recursive functions", "gcd", "uniform distribution"]
 title = "Perfect Distribution Based on GCD"
-draft = true
+draft = false

 # Optional featured image (relative to `static/img/` folder).
 [header]
@@ -13,47 +13,51 @@ caption = ""

 +++

-## Perfect Distribution Based On GCD
+## Perfect Distribution Based on GCD

-We discuss an algorithm that has property of uniform distribution.
-The examples below are Software Engineering used cases of this algorithm.
+We discuss an algorithm that distributes $a$ ones among $n$ positions so that the gaps between consecutive ones differ by at most one, i.e. a **perfect distribution**. The examples below are software engineering use cases for this algorithm.

-### Software Engineering Used Case 0
+### Use Case 0: Text spacing

-Given a string of characters $s$, and the given number $n$ that is greater then the length of $s$.
-Extend the string $s$ to the length $n$ by inserting spaces between the words of $s$.
-They key requirement is that the distances between two consecutive words will be uniform.
+Given a string of characters $s$ and a number $n$ greater than the length of $s$,
+extend the string $s$ to length $n$ by inserting spaces between the words.
+The key requirement is that the distances between two consecutive words be uniform.

-### Software Engineering Used Case 1
+### Use Case 1: Stress testing

-Assume we want to test a server for stress.
+Assume we want to stress-test a server.
 Our test contains positive and negative requests.
-Assume we want to sends $1000$ request that $300$ of them are negative.
+Suppose we want to send $1000$ requests, $300$ of which are negative.
 Our test tool might send requests simultaneously or sequentially,
-but what we required is that the negative requests will arrive the server with uniform distribution.
-I.e., we want to prevent the following: sending 300 negative following by $700$ positive requests.
-Using the algorithm, we simple distribute $300$ request by calling ${\cal PD}(300, 1000)$,
-containing $1$ in cell $i$ corresponds to the request $i$ to be negative.
-Then the test tool will use the array to sends request that now might be sent even sequentially.
-
-### Software Engineering Used Case 2
-
-Assume the test tool can also run performance tests, that is, creates stress on the server and measure performance -- the number of requests per second processed by the server.
-We want to extend the test tool by profiling feature: to specify the number of requests being sent per second.
-E.g. a performance test shows that the server performance is $1000$ requests per second.
-We want test tool will send requests according to specified speed, say $500$, or $200$, or $20$, or even $0.2$ requests per second (i.e., one request per $5$ second).
-Such profiling tool allows to explore the state of the server under various stress conditions: utility of CPU, Memory, Threads in depending on various number of requests received.
-The test tool may divide $1$ seconds, say, on $100$ parts, i.e. every part is $20$ milliseconds.
-Assume $230$ is specified, then during every part, $2.3$ request must be send.
-In other words, the test tool must send 2 requests in most of parts and $3$ request in some parts.
-Calling ${\cal PD}(30, 100)$ yields an array defining the parts in which 3 requests will be sent.
-
-We present the general solution ${\cal PD}$ to the problem -- for any specified input in above examples, ${\cal PD}$ produces the exact solution.
-Any probabilistic or approximation solutions do not achieve the required exactness and usually are much more complicated.
-The algorithm ${\cal PD}$ is very simple, clear to prove correctness and to find time complexity.
-Moreover, it has a nice relation to well-known Euclid's algorithm of computing Greatest Common Divisor, which is denoted here by ${\cal GCD}$.
-
-Euqlid’s Algorithm $\cal GCD$
+but we require that the negative requests arrive at the server with uniform distribution.
+I.e., we want to avoid sending 300 negative followed by 700 positive requests.
+Using the algorithm, we simply distribute the 300 negative requests by calling ${\cal PD}(300, 1000)$:
+cell $i$ is $1$ if and only if request $i$ is negative.
+The test tool then uses the array to send requests (possibly sequentially).
+
+### Use Case 2: Rate limiting / profiling
+
+Assume the test tool can run performance tests, i.e. create stress and measure throughput (requests per second).
+We want to extend the test tool with a profiling feature: specify the number of requests sent per second.
+E.g., a performance test shows the server handles $1000$ requests per second.
+We want the test tool to send requests at a specified rate, say $500$, $200$, $20$, or even $0.2$ requests per second (one request every 5 seconds).
+Such a profiling tool allows exploring the server state under various load conditions: CPU, memory, and thread usage as a function of the number of requests received.
+
+The test tool may divide 1 second into $100$ parts, each $10$ milliseconds.
+Suppose we specify $230$ requests per second: then during every part, $2.3$ requests must be sent on average.
+In other words, the tool must send 2 requests in most parts and 3 in some parts.
+We need to choose which $30$ parts get the extra request; ${\cal PD}(30, 100)$ yields that array (a $1$ in cell $i$ means part $i$ gets 3 requests).
+
+We present the general solution ${\cal PD}$ to the problem: for any specified input in the above examples, ${\cal PD}$ produces the exact solution.
+Probabilistic or approximate solutions do not achieve the required exactness and are usually more complicated.
+The algorithm ${\cal PD}$ is simple, has a clear correctness proof, and admits $O(n)$ time and space complexity.
+Moreover, it has a direct relation to Euclid's algorithm for computing the Greatest Common Divisor, denoted here by ${\cal GCD}$.
+
+### Notation
+
+*POST* denotes a condition that the procedure guarantees will hold after the call.
+
+### Euclid's Algorithm ${\cal GCD}$

 * POST: $res = {\cal GCD}(a, n)$.

@@ -68,12 +72,14 @@ def gcd(a, n):
     return res
 ```

-Deterministic algorithm of perfect distribution $\cal PD$
+### Deterministic algorithm of perfect distribution ${\cal PD}$
+
+The recurrence mirrors Euclid's algorithm: ${\cal PD}(a, n)$ calls ${\cal PD}(n \bmod a, a)$, so the recursion depth is $O(\log \min(a, n))$ and total time is $O(n)$.

 * POST: size of $res$ is $n$.
-* POST: $\forall i, res[i] \{0, 1\}$.
-* POST: $\sum_{i=1}^n res[i] = a$
-* POST: $res$ is perfectly distributed.
+* POST: $\forall i,\, res[i] \in \{0, 1\}$.
+* POST: $\sum_{i=0}^{n-1} res[i] = a$ (exactly $a$ ones).
+* POST: $res$ is perfectly distributed (gaps between consecutive ones differ by at most 1).

 ```python
 def pd(a, n):
@@ -91,17 +97,13 @@ def pd(a, n):
     return res
 ```

-## Notation.
-
-* POST means a condition that the procedure guarantees will hold after the call.
-
-### Example. ${\cal PD}(2, 10): a = 2, b = 8$
-We want to mix $2$ ones and $8$ zeros up.
+### Example. ${\cal PD}(2, 10)$: $a = 2$, $n = 10$, so $n - a = 8$ zeros
+We want to mix $2$ ones and $8$ zeros.
 The recursive call ${\cal PD}(0, 2)$ returns $[0, 0] = 0^2$, which is assigned to $arr$.
 Then the array $res$ is filled as follows: $[(1, 0, 0, 0, 0), (1, 0, 0, 0, 0)] = (1, 0^4)^2$, which is returned by ${\cal PD}(2, 10)$.

-### Example. ${\cal PD}(8, 10): a = 8, b = 2$
-We want to mix $8$ ones and $2$ zeros up.
+### Example. ${\cal PD}(8, 10)$: $a = 8$, $n = 10$, so $n - a = 2$ zeros
+We want to mix $8$ ones and $2$ zeros.
 The recursive call ${\cal PD}(2, 8)$ returns $[1, 0, 0, 0, 1, 0, 0, 0] = (1, 0^3)^2$, which is assigned to $arr$.
 Then the array $res$ is filled as follows: $[(1, 0, 1, 1, 1), (1, 0, 1, 1, 1)] = (1, 0, 1^3)^2$, which is returned by ${\cal PD}(8, 10)$.

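The two worked examples above can be reproduced with a short modulo-based recursion. What follows is a minimal sketch, not the post's own `pd` (whose body is hidden in this diff): the name `pd_sketch`, the helper `circular_gaps`, and the block-length rule `n // a + arr[i]` are assumptions inferred from the examples.

```python
def pd_sketch(a, n):
    """Return n cells holding exactly a ones, spread as evenly as possible.

    Assumed rule (inferred from the worked examples): block i starts with
    a 1 and has length n // a, plus one extra cell where the recursive
    result pd_sketch(n % a, a) holds a 1.
    """
    assert 0 <= a <= n
    res = [0] * n
    if a > 0:
        arr = pd_sketch(n % a, a)  # same argument pattern as Euclid's algorithm
        ofs = 0
        for i in range(a):
            res[ofs] = 1
            ofs += n // a + arr[i]
    return res


def circular_gaps(bits):
    """Distances between consecutive ones, wrapping around the end."""
    ones = [i for i, bit in enumerate(bits) if bit]
    n = len(bits)
    return [(ones[(k + 1) % len(ones)] - ones[k]) % n or n
            for k in range(len(ones))]


print(pd_sketch(2, 10))  # [1, 0, 0, 0, 0, 1, 0, 0, 0, 0]
print(pd_sketch(8, 10))  # [1, 0, 1, 1, 1, 1, 0, 1, 1, 1]
```

In this sketch every gap equals `n // a` or `n // a + 1`, which is one way to read the "differ by at most 1" postcondition.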
@@ -116,18 +118,22 @@ Then the array $res$ is filled as follows: $[(1, 0, 1, 1, 1), (1, 0, 1, 1, 1)] =

 * ${\cal PD}(0, 2) : [0, 0] = 0^2$
 * ${\cal PD}(2, 6) : [1, 0, 0, 1, 0, 0] = 1, 0^2 , 1, 0^2 = (1, 0^2)^2$
-* ${\cal PD}(6, 8) : [1, 0, 1, 1, 1, 0, 1, 1] = 1, 0, 13 , 0, 12 = (1, 0, 12)^2$
+* ${\cal PD}(6, 8) : [1, 0, 1, 1, 1, 0, 1, 1] = 1, 0, 1^3, 0, 1^2 = (1, 0, 1^2)^2$
 * ${\cal PD}(8, 14) : [1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0] = (1, 0, 1, 1, 0, 1, 0)^2$ $ = ((1, 0, 1)^2 , 0)^2$

-### Slow version of $\cal GCD$.
+### Alternative formulation: subtraction-based (slow) versions
+
+The same recurrence can be expressed using subtraction instead of modulo, which makes the symmetry with Euclid's algorithm more explicit. These versions follow the same idea but take more recursive steps, since each call subtracts $a$ only once instead of reducing $n$ modulo $a$ in one go.
+
+**Slow GCD (subtraction-based):**

 ```python
 def gcd(a, n):
     assert 0 <= a <= n

     res = n
     if a > 0:
-        b = n a
+        b = n - a
         if b > a:
             res = gcd(a, b)
         else:
@@ -136,7 +142,7 @@ def gcd(a, n):
     return res
 ```

-### Slow versions of $\cal PD$
+**Slow PD (subtraction-based):**

 ```python
 def pd(a, n):
@@ -145,7 +151,7 @@ def pd(a, n):
     res = [0] * n
     if a > 0:
         ofs = 0
-        b = n a
+        b = n - a
         if b > a:
             arr = pd(a, b)
             for i in range(b):
@@ -155,8 +161,8 @@ def pd(a, n):
         else:
             arr = pd(b, a)
             for i in range(a):
-                res[ofs] = 1 arr[i]
-                ofs += 2 arr[i]
+                res[ofs] = 1
+                ofs += 1 + arr[i]

     return res
 ```
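
For comparison, here is a self-contained sketch of a subtraction-based variant that meets the POST conditions. It is an illustration rather than the code from the post: the name `pd_subtract` is hypothetical, the body of the `b > a` loop is not visible in the diff and is assumed here, and the second branch (write a one, then skip one extra cell when `arr[i]` is 1) is likewise an assumption, chosen so that the result has length `n` and exactly `a` ones.

```python
def pd_subtract(a, n):
    """Subtraction-based perfect distribution: a ones among n cells."""
    assert 0 <= a <= n
    res = [0] * n
    if a > 0:
        ofs = 0
        b = n - a  # number of zeros
        if b > a:
            # More zeros than ones: mark a of the b zero slots.
            arr = pd_subtract(a, b)
            for i in range(b):
                res[ofs] = arr[i]   # a marked slot expands to "1, 0"
                ofs += 1 + arr[i]   # an unmarked slot stays a single "0"
        else:
            # At least as many ones as zeros: mark b of the a one slots.
            arr = pd_subtract(b, a)
            for i in range(a):
                res[ofs] = 1        # every slot starts with a one
                ofs += 1 + arr[i]   # a marked slot is followed by a zero
    return res


print(pd_subtract(8, 14))  # [1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0]
```

Because each call removes only one multiple of `a`, this variant recurses roughly `n / a` times, so for small `a` and large `n` it can run into Python's default recursion limit.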

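Returning to Use Case 2, the per-part request counts can be derived from the distribution array. The sketch below is an assumption-laden illustration: `spread` is a compact stand-in for the modulo-based algorithm (using the block-length rule `n // a + arr[i]` inferred from the worked examples), `requests_per_part` is a hypothetical helper name, and the rate is assumed to be a whole number of requests per second.

```python
def spread(a, n):
    """Compact modulo-based helper: a ones among n cells (assumed rule)."""
    res = [0] * n
    if a > 0:
        arr = spread(n % a, a)
        ofs = 0
        for i in range(a):
            res[ofs] = 1
            ofs += n // a + arr[i]
    return res


def requests_per_part(rate, parts=100):
    """Number of requests to send in each part of one second."""
    base, extra = divmod(rate, parts)  # rate 230 -> base 2, extra 30
    bonus = spread(extra, parts)       # parts that get one more request
    return [base + b for b in bonus]


plan = requests_per_part(230)
print(sum(plan), min(plan), max(plan))  # 230 2 3
```

Splitting the rate with `divmod` keeps the total exact: every part sends `base` requests and exactly `extra` parts send one more.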