Commit 8ace46c

Optimised best_ratio() using one of the laws of logarithms!
Also improved the comments for this bit of code a great deal by explaining more clearly the maths that is going on.
1 parent c149894 commit 8ace46c

1 file changed: +14 -7 lines changed

basest/core/best_ratio.py

Lines changed: 14 additions & 7 deletions
@@ -23,15 +23,22 @@ def _encoding_ratio(base_from, base_to, chunk_sizes):
 if not isinstance(s, int):
     raise TypeError('chunk sizes must be list of ints')
 '''
-base_from ** s is the total number of values represented by the input
-base and chunk size
+We need to work out how many digits in the output base are needed to
+represent a number s digits long in the input base.
 
-base_to logarithm of this number, rounded to ceiling is the minimum
-number of symbols required in the output ratio to store this number of
-values (it might be able to store more than needed, but that doesn't
-matter)
+The number of values represented by an s digit long number in the input
+base is `base_from ** s`
+
+The number of digits in base x needed to represent n values is
+`ceil(logx(n))`
+
+Altogether this is `ceil(logx(base_from ** s))`
+
+This can be simplified using the law `n log(x) = log(x ** n)`
+
+To become the following:
 '''
-match = ceil(log(base_from ** s, base_to))
+match = ceil(s * log(base_from, base_to))
 # the efficiency ratio is input:output
 ratio = (float(s), match)
 # ratio efficiences can be compared by dividing them like fractions
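
The change above can be checked with a small standalone sketch. The function names `digits_via_power`, `digits_via_product` and `digits_exact` are hypothetical and do not appear in basest. The first two mirror the removed and added `match = ...` lines, and the third is an integer-only reference added here purely for comparison. Floating-point rounding may make the two logarithm forms disagree when the true result is very close to an integer, so this is an illustration rather than a proof of equivalence.

```python
from math import ceil, log


def digits_via_power(base_from, base_to, s):
    """Removed form: build base_from ** s first, then take its logarithm."""
    return ceil(log(base_from ** s, base_to))


def digits_via_product(base_from, base_to, s):
    """Added form: uses n log(x) = log(x ** n), so no large power is built."""
    return ceil(s * log(base_from, base_to))


def digits_exact(base_from, base_to, s):
    """Integer-only reference: smallest d with base_to ** d >= base_from ** s."""
    target = base_from ** s
    digits = 0
    capacity = 1
    while capacity < target:
        capacity *= base_to
        digits += 1
    return digits


if __name__ == "__main__":
    # Example values chosen for illustration: base 256 input, base 94 output.
    for s in range(1, 11):
        print(
            s,
            digits_via_power(256, 94, s),
            digits_via_product(256, 94, s),
            digits_exact(256, 94, s),
        )
```

For the illustrative bases 256 and 94 the three functions agree for chunk sizes 1 to 10. Other base pairs, especially where `base_from` is an exact power of `base_to`, are worth checking against the integer-only version.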
