Commit 8238b47

Merge pull request #21 from saxbophone/develop
v0.6.2
2 parents 0c20dc7 + 4e04665 commit 8238b47

4 files changed: 40 additions and 195 deletions

README.md

Lines changed: 0 additions & 162 deletions
@@ -189,165 +189,3 @@ Returns tuples containing an integer as the first item (representing the output
 >>> basest.core.best_ratio(input_base=256, output_bases=range(2, 334), chunk_sizes=range(1, 256))
 (333, (243, 232))
 ```
-
-## Further Examples
-
-#### Base-78, using emoji as output (just for fun)
-> **Note:** This example is aimed at Python 3 and may not work on Python 2 without some modification (or at all).
-
-Unicode character ranges `0x1F601` through to `0x1F64F` are allocated for *emoticon emoji*. This range provides us with 78 characters to play with.
-
-First of all, let's find us some appropriate encoding ratios within given ranges:
-
-```py
->>> from basest.core import best_ratio
->>> best_ratio(256, [78], range(2, 1024))
-(78, (1019, 1297)) # hmm, maybe a bit too big
->>> best_ratio(256, [78], range(2, 16))
-(78, (7, 9)) # we could probably go a bit larger but this will do
-```
-
-Now, let's choose a padding character from one of the other Unicode emoji codepages. I decided to choose the `bear face` emoji (:bear: / 🐻), codepoint `0x1F43B`.
-
-With these chosen parameters and a body of input data (will use text for this example), we can put it all together:
-
-```py
->>> from basest.core import encode
->>> # input data variable
->>> message = ...
->>> output = encode(
-... 256, [chr(i) for i in range(256)], # input base and symbol table
-... 78, [chr(0x1F601 + o) for o in range(78)], # output base and symbol table
-... chr(0x1F43B), # padding character
-... 7, 9, # encoding ratio
-... message
-... )
-```
-
-Given this input message (in ASCII):
-
-```
-Fourscore and seven years ago our fathers brought forth on this
-continent a new nation, conceived in liberty and dedicated to the
-proposition that all men are created equal.
-Now we are engaged in a great civil war, testing whether that nation
-or any nation so conceived and so dedicated can long endure. We are
-met on a great battle field of that war. We have come to dedicate a
-portion of that field, as a final resting place for those who here
-gave their lives that that nation might live. It is altogether
-fitting and proper that we should do this.
-But, in a larger sense, we can not dedicate - we can not consecrate
-- we can not hallow - this ground. The brave men, living and dead,
-who struggled here, have consecrated it, far above our poor power to
-add or detract. The world will little note, nor long remember, what
-we say here, but it can never forget what they did here. It is for
-us the living, rather, to be dedicated here to the unfinished work
-which they who fought here have thus far so nobly advanced. It is
-rather for us to be here dedicated to the great task remaining
-before us - that from these honored dead we take increased devotion
-to that cause for which they gave the last full measure of devotion
-- that we here highly resolve that these dead shall not have died in
-vain - that this nation, under God, shall have a new birth of
-freedom - and that government of the people, by the people, for the
-people, shall not perish from the earth.
-```
-
-We get this output:
-
257-
😃😉😳😿😷😤😿😺🙆😗🙆🙃😢😼🙊🙋😧😡😇😴🙎🙉😋😧😲😑😍😙🙊😖😿😰😿😂😼😤😖😔😤
258-
259-
😜😅😬😃😝😉😖😃😭😷🙇😥😅😗😰😇😳🙊😎😟😔😌😝🙍😘🙃🙂😬😧😻😟😠😏😇😴😻😬😹😆
260-
261-
😌🙊😈😘😲😣😐😜😣😐😆😨😗😶😂😔😟🙎😃😗🙍😗😶😂😡😘😜😓😛😩😖😴😩😰😸😩😈😜😊
262-
263-
😗😵🙆😙😗🙌😹😼😃😇😴😞😕😼😟😲🙊😡😕🙂😰🙀😫😊😼😗😗😕😭😤😕😝🙃🙇😽😔😕😂😲
264-
265-
😹😺😏😎😬😂😇😵😅🙄😚😎😛😑😣😗🙆😹🙄😦😃😂😝😾😗🙆😮😯😘🙍🙃🙂😏😇😳🙅😎😱😈
266-
267-
😛🙌😼😗😱😷😊😄😐😎😵🙀😘😨😉😭😇😧🙁🙇😝😕🙂😫🙋😅🙌😺🙀🙍😑😉🙄😹😞😕😣😟😅
268-
269-
😕😂😨🙋😶😯😨😟😈😕😁🙁😔🙎😡😾😅😓😇😳🙃😹😪😥😞🙍😖😘🙃🙂😧🙎🙊😈😦😃😇😵😔
270-
271-
😞😥😒😗😎😏😕🙂😵😷😖😤😣😧🙍😙😪😡😻😄😓😟😄😱😇😵😅🙄😘😫😩😖😩😕😂😲😿😻😣
272-
273-
😆😦🙆😘😣😾😶😄😓😓😨😅😕😂😲😿😻😣😇😡😌😗🙁😹😄😨😝🙌😡😊😖😴🙋😱😢😷😹🙈😍
274-
275-
😕😭😤😫😺😼😲🙇😖😕😲😂😴😎😟😯😭😅😇😳🙎😹🙈😩🙆😨🙊😗😶😊😩🙉😲🙍😃😲😘😨😈
276-
277-
😮😹😸🙍😩😙😕😂😨🙋😬😲🙁😆😇😇😴😻😬😹😃😧😓🙄😘😨😉😭😇😞😘😭😦😘🙉😈😾😼😧
278-
279-
😵😭😒😕🙂😓😑😕😺😱😁😩😘🙈😛🙎😭😰🙅😖😄😘😤😳😾😽🙂😆🙋😱😕😂😼😦🙎😝😨😊🙄
280-
281-
😕😽😦😤😺😔😳😁😆😕😲😂😴😎😟😯😬😏😔🙊😁😠😮😷😥😺😼😗🙆😮😯😖😸😽😭🙅😖😣😱
282-
283-
😦😧😺😭😲😐😗😕🙆😙🙆😨😇😫😥😔🙋😞😱🙃😟😬😪😓😇😴🙊😄😗😹😆😧😇😖😏😪😍😽😾
284-
285-
😊😯😦😇😴😏😳😔😋😃😝😹😗🙆🙈😙😯😢😬😪😮😇😴😙😒😂😿🙈😑😬😕😂😼😦🙎😟🙂🙋😖
286-
287-
😖😴😶😽😪😠🙅😘😼😘😳🙀🙉😽😬😌🙃😱😘🙈😛🙎😭😰🙄🙆😨😘🙈😡😌😏😿😚😾😠😖😔😃
288-
289-
😼😽😟🙊🙎😭😕😾😛😑😊😰😱😢🙄😘😳🙀😭😪😽😙😑😸😕🙂😺😜😒😷😒😹😓😖😴🙂😌😊😸
290-
291-
😹😮😓😕😂😕😠🙆🙀😏😌😦😘😈😆😈🙅😉😅🙄😈😘🙃🙂🙅😹😲😕😖😻😗🙇😄😒😅😑😺🙍🙄
292-
293-
😇😵😅🙄😜😎😑🙃😇😎😳🙋😕🙄😴😒😌😹😇😳🙃😹😬😷🙀😟😈😕🙂😯😑😫😼😇😗😠😕😾😑
294-
295-
😣😺😫😶😴🙆😕😂😔😊😏🙈😸😔😳😕😱😽😌😠😔😞😥😑😕😽😥😈😴😕😅😟😽😕😡😧🙈😄😗
296-
297-
😈🙇😥😇😳🙎🙎😻😋😝😦😾😘😧🙄😟🙋😟😪😴😃😙😪😑😽🙀🙈😐😭🙅😗😶😳😏😈😵😝🙂😟
298-
299-
😗😖😯😤😫😼😄😳🙅😖😤😊😩😱😔😶😰😢😙😊😻😓😈😈😳😚🙋😕😽😦😉🙍😋😞🙅🙊😇😴😱
300-
301-
😲😋😉😦😙😉😖😴🙋😷😣😧😚😭😦😗😵🙉🙅😣🙂😵😢😯😊😄😹😙😌😵🙀🙄😾😘🙈🙍😐😨🙊
302-
303-
😑😉😮😕😭😤😛😚😂😋😃😡😇😴😙😌😈😧😄🙆😲😗🙆😰😎😶🙈😟😙😢😘🙈😍😠😆😣😔😢😳
304-
305-
😇😴😏😞😡😵😁😝😛😗🙇😈🙌😆😻😍🙂😕😇😴🙀😥😧😛🙉😣😥😗🙇😍🙃😞😑🙂😂🙎😃😋😛
306-
307-
😡😫😑😨😗🙃😇😴😅😶😹🙌😏🙃🙈😘🙄😷😭😽😟😚😕😕😙😪🙄😎😊😋😟😁😠😖😴😚🙉😏🙀
308-
309-
😧🙄😋😘🙈😯😯😛😮😦🙈🙇😕😾😑😣😶😦🙉😤😯😗😖😯😗😭😱😕😧😂😗😥🙎😛😘😥😐😣😅
310-
311-
😇😵😔😨😽🙅😑😗😄😕😽😦😣😎😃😝🙁😹😕🙂😰😩😽😹🙉😫🙊😘🙃🙂😰🙍🙌😫😛🙆😗😱😷
312-
313-
😝😘😼😠😧🙋😇😴😏😳😔😔🙁😫😑😇😵😔😨😽🙅😒😑😑😖😣🙅😉😓😵😢😱😁😇😴😙😒😂😿
314-
315-
🙉😸😿😐😈😄😂🙍😡😭😇😃😗🙆🙁😷🙆😨😝🙌😓😖😣🙃😡😗😐😅🙁😲😗😶😊😻😞😼🙍😺😯
316-
317-
😖😣🙄🙌🙆😼😗😌😦😇😳🙉🙈😮🙍😠😨😷😖😳😼😽🙅😩🙂😘😛😖😣🙄🙍😒😯😤😋😸😇😵😅
318-
319-
🙄😚😑😞😠😙😖😄😆😳😯😗😣😿🙊😕😭😤😱😷😄😕😉😃😙😪😡🙀🙆😴😱😲😯😖😣🙅😉😓😸
320-
321-
😅😓😼😇😴😏😳😕😰🙆😤😯😇😴😙😒😂😿🙉😋😽😕😂😼😦🙎😟🙂🙋😦😘😳🙀😴🙇🙊🙌😆🙊
322-
323-
😗🙁😹😔🙄😸😰😼😬😇😳🙅😂😽😦😖🙆😒😕🙁😹😊😚😛🙈😖😿😖😴😻😓😴🙁🙄😥😵😕🙂😯
324-
325-
😑😥🙅🙄😠🙃😙😋😄😜😥🙊😐😸😍😕😽😦😒🙁😥😬😥😽😕😱😽😌😠😔😞😥😑😕🙁😸🙃🙎😂
326-
327-
🙀🙍😉😖😣🙃😡😔🙊😪😇😂😘🙃🙂🙁😓🙄😮😲😷😘😨😉😾🙁😗😪🙍😮😗😶😊😉😏😫😆🙈😪
328-
329-
😘😨😈😚😟🙁🙄😸😠😇😵😅🙄😘😫😩😖😡😘😨😺😰😸😖😺😬🙀😘😸😊😑🙁😟😏😰😖😘😨😉
330-
331-
😱😅😩😷😆😄😕😭😤😱😲😝😠😓😺😗😅🙉😇😾😐🙌😢😡😕🙁😫😾🙆😰🙀😎😻😕🙂🙄😔😞😯
332-
333-
😧😐😅😃😌😪😟😑😚😋😡😂😘🙃🙂😧🙋😙😋😜😤😇😴😏😳😔😋😃😧😭😖😳😼🙇😿😸😐😭😚
334-
335-
😙🙅🙌😃😢😸😷😝😩😘🙈😜😆😃😿🙉😫😸😘🙃🙂😬😪😤😵🙎🙃😗😥🙎😉😫😿😬😄😠😇😴😻
336-
337-
😠🙀😽😝😐🙅😗🙆🙍😖🙁😿😆😵😒😇😵😅🙄😘😫😩😖😲😕😽😦😒🙁😥😬😥😽😖😤😊😘😏😧
338-
339-
🙍😾🙋😘😨😉🙇🙁🙂😛🙉😖😇😵😅🙄😘😫😩😖😯😖😣🙄🙎😸😲😂😊😜😕😁😱😗🙋😺🙀🙊😛
340-
341-
😗😑😳😮😜😞😽😋😅😕😂😼😦🙎😝😲😳😖😕😭😤😜🙃🙊😪😨😤😖😴😣😓🙃🙋😓😆😋😕😂😱
342-
343-
😡😏😗😕🙂😦😇😴😶😣😄😔😛🙍😇😊😆😈😶😟😅🙀😞😁😇😲😔😗😬😆😇😋😈😖😣😱😛😃😿
344-
345-
😑😱😒😙😚😏🙆😘😣😇😽🙆😙😥🙈😍😃😕😕😩😩😇😴😻😠😶😿🙄😎😤😕🙁😺😝😥😙😵😘😏
346-
347-
😕😂😕😠🙆🙀😹🙀😘😘🙃🙂😭🙍😾😨😩😏😗😶😩😋😮😴🙄😟🙈😕🙍😨😜😐😴😇😵😉😕🙂😢
348-
349-
😈😃🙆😴🙎😇😕😒🙋😸🙀🙋😪😳🙁😘😈😆😄🙅😔😫😗😉😇😴😏😳😔😋😃😝😹😕😼😈🙌😫🙍
350-
351-
😓😅🙍😕😾😑😣😸😺😹😚🙇😗😑😳😮😜😞😽😋😍😕🙂😰😰😌😎😔😙😣😘😨😺😰😸😖😺😬🙀
352-
353-
😇😴😊😧😯😞😄😎😵😃😅😓🐻🐻🐻🐻🐻🐻
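
A note on the numbers in the removed README example: the sketch below checks, under the parameters quoted in that example, that a 7:9 ratio is enough to carry base-256 data in base 78, and counts the code points in the quoted emoticon range. The variable names are illustrative only and are not part of `basest`. Counted inclusively, `0x1F601` through `0x1F64F` appears to contain 79 code points, so the example's 78-symbol table would use the first 78 of them.

```python
# Parameters as quoted in the removed README example.
input_base, output_base = 256, 78
input_ratio, output_ratio = 7, 9

# Nine base-78 symbols must be able to represent any group of seven bytes.
fits = output_base ** output_ratio >= input_base ** input_ratio

# Eight base-78 symbols would not be enough for seven bytes.
too_small = output_base ** (output_ratio - 1) < input_base ** input_ratio

# Inclusive size of the quoted emoticon range.
emoticon_count = 0x1F64F - 0x1F601 + 1

# The symbol table built the same way as in the README example.
symbol_table = [chr(0x1F601 + o) for o in range(78)]

print(fits, too_small, emoticon_count, len(symbol_table))
```

The last symbol in that table is `chr(0x1F64E)`, one short of the end of the quoted range.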

basest/core/decode.py

Lines changed: 12 additions & 6 deletions
@@ -18,21 +18,24 @@ def decode_raw(input_base, output_base, input_ratio, output_ratio, input_data):
     base64 input would be in the range 0-63).
     """
     # create a 'workon' copy of the input data so we don't end up changing it
-    before = list(input_data)
+    input_workon = list(input_data)
     # count number of padding symbols
-    padding_length = before.count(input_base)
+    padding_length = input_workon.count(input_base)
     # now, replace all padding symbols with the maximmum symbol
     '''
     Explanation: This solution is for bases that don't match up exactly, given
     their chosen ratios. It was inspired by the same technique that is used in
     base85/ascii85 decoding and does not negatively impact 'perfect' aligning
     bases such as base64.
     '''
-    before = [(s if s != input_base else input_base - 1) for s in before]
+    input_workon = [
+        (s if s != input_base else input_base - 1) for s in input_workon
+    ]
     # use the encode_raw function to convert the data
     output_data = encode_raw(
         input_base=input_base, output_base=output_base,
-        input_ratio=input_ratio, output_ratio=output_ratio, input_data=before
+        input_ratio=input_ratio, output_ratio=output_ratio,
+        input_data=input_workon
     )
     # strip off the unnecessary padding symbols if there was padding
     [output_data.pop() for _ in range(padding_length)]
@@ -53,11 +56,14 @@ def decode(
     """
     # create workon copy of input data and convert symbols to raw ints
     # NOTE: input symbol table here includes the padding character
-    before = symbols_to_ints(input_data, input_symbol_table + [input_padding])
+    input_workon = symbols_to_ints(
+        input_data, input_symbol_table + [input_padding]
+    )
     # use decode_raw() to decode the data
     output_data = decode_raw(
         input_base=input_base, output_base=output_base,
-        input_ratio=input_ratio, output_ratio=output_ratio, input_data=before
+        input_ratio=input_ratio, output_ratio=output_ratio,
+        input_data=input_workon
     )
     # convert raw output data back to symbols using output symbol table
     return ints_to_symbols(output_data, output_symbol_table)
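
The padding handling in `decode_raw` can be looked at in isolation. The following is a minimal sketch, not the library's code: `replace_padding` and `strip_padding` are hypothetical helper names introduced here, and the sketch assumes (as the diff suggests) that the padding symbol is represented by the integer equal to the input base.

```python
def replace_padding(symbols, base):
    """Count padding symbols (value == base) and swap them for base - 1.

    Hypothetical helper mirroring the two steps shown in decode_raw().
    """
    padding_length = symbols.count(base)
    workon = [(s if s != base else base - 1) for s in symbols]
    return workon, padding_length


def strip_padding(output_data, padding_length):
    """Return a copy of output_data without its last padding_length items.

    Slicing from an explicit start index keeps the padding_length == 0
    case safe (a bare [-0:] slice would select the whole list).
    """
    stripped = list(output_data)
    del stripped[len(stripped) - padding_length:]
    return stripped


# Two trailing padding symbols (64) in a base-64 group of four.
workon, padding = replace_padding([3, 10, 64, 64], 64)
print(workon, padding)
print(strip_padding([1, 2, 3], padding))
```

Using a slice deletion instead of a list comprehension of `pop()` calls is only one possible design choice; it avoids building a throwaway list but is not what the commit does.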

basest/core/encode.py

Lines changed: 27 additions & 26 deletions
@@ -7,6 +7,20 @@
 from .utils import ints_to_symbols, symbols_to_ints
 
 
+def _nearest_length(input_length, input_ratio):
+    """
+    Returns the nearest data length from the input data that is divisible by
+    the input ratio, using overlap if there is any.
+    """
+    # calculate the amount of overlap (if any)
+    overlap = input_length % input_ratio
+    # calculate the nearest input length that can contain our length
+    return (
+        input_length if overlap == 0
+        else ((((input_length - overlap) // input_ratio) + 1) * input_ratio)
+    )
+
+
 def encode_raw(input_base, output_base, input_ratio, output_ratio, input_data):
     """
     Given an input base, an output base, input ratio, output ratio and input
@@ -17,39 +31,25 @@ def encode_raw(input_base, output_base, input_ratio, output_ratio, input_data):
     output would be in the range 0-63).
     """
     # create a 'workon' copy of the input data so we don't end up changing it
-    before = list(input_data)
+    input_workon = list(input_data)
     # store length of input data for future reference
-    input_length = len(before)
-    # calculate the amount of overlap (if any)
-    overlap = input_length % input_ratio
-    '''
-    get the nearest data length from the input data that is divisible by
-    the input ratio, using overlap if there is any
-    '''
-    input_nearest_length = (
-        input_length if overlap == 0
-        else (
-            (
-                (
-                    (input_length - overlap) // input_ratio
-                ) + 1
-            ) * input_ratio
-        )
-    )
+    input_length = len(input_workon)
+    # get nearest data length that the input data fits in
+    input_nearest_length = _nearest_length(input_length, input_ratio)
     # calculate the amount of padding needed
     padding_length = (input_nearest_length - input_length)
     # get the output length, based on nearest divisible input length
     output_length = (input_nearest_length // input_ratio) * output_ratio
     # create a new list for the output data
     output_data = [0] * output_length
     # extend the input_data to the nearest divisible length (for padding)
-    before.extend([0] * padding_length)
+    input_workon.extend([0] * padding_length)
     # encode the data - store each group of input_ratio symbols in a number
     for i in range(0, input_nearest_length, input_ratio):
         store = 0
         for j in range(0, input_ratio):
             # store value of symbol
-            symbol = before[i + j]
+            symbol = input_workon[i + j]
             # upscale it if neccessary, in a little-endian manner
             symbol *= (input_base ** (input_ratio - j - 1))
             # add to store
@@ -58,15 +58,15 @@ def encode_raw(input_base, output_base, input_ratio, output_ratio, input_data):
         now that store contains the value of a number of symbols, separate this
         out to the output symbols
         '''
-        for j in range(0, output_ratio):
+        for k in range(0, output_ratio):
             # convert output array index
-            index = ((i // input_ratio) * output_ratio) + j
+            index = ((i // input_ratio) * output_ratio) + k
             # re-interpret the number in terms of output base
-            symbol = store // (output_base ** (output_ratio - j - 1))
+            symbol = store // (output_base ** (output_ratio - k - 1))
             # store at the calculated position
             output_data[index] = symbol
             # decrement the store variable, having now encoded part of it
-            store -= (symbol * (output_base ** (output_ratio - j - 1)))
+            store -= (symbol * (output_base ** (output_ratio - k - 1)))
     # set padding bytes to padding symbol, if needed
     for i in range(output_length - padding_length, output_length):
         output_data[i] = output_base
@@ -86,11 +86,12 @@ def encode(
     symbol.
     """
     # create workon copy of input data and convert symbols to raw ints
-    before = symbols_to_ints(input_data, input_symbol_table)
+    input_workon = symbols_to_ints(input_data, input_symbol_table)
     # use encode_raw() to encode the data
     output_data = encode_raw(
         input_base=input_base, output_base=output_base,
-        input_ratio=input_ratio, output_ratio=output_ratio, input_data=before
+        input_ratio=input_ratio, output_ratio=output_ratio,
+        input_data=input_workon
     )
     # convert raw output data back to symbols using output symbol table
     # NOTE: output symbol table here includes the padding character
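
Two parts of this hunk lend themselves to a small standalone illustration. The sketch below is not the library's code: `nearest_length_ceil` and `encode_group` are hypothetical names introduced here. The first shows that the rounding done by the new `_nearest_length` helper appears equivalent to a ceiling division; the second re-creates the inner loops of `encode_raw` for a single group, where the first symbol of the group receives the largest weight (most-significant-first ordering, despite the "little-endian" wording in the source comment).

```python
def nearest_length_ceil(input_length, input_ratio):
    """Round input_length up to a multiple of input_ratio (ceiling division).

    Hypothetical alternative to the _nearest_length helper in the diff.
    """
    return -(-input_length // input_ratio) * input_ratio


def encode_group(group, input_base, output_base, output_ratio):
    """Convert one group of input symbols to output_ratio output symbols.

    Mirrors the 'store' technique in encode_raw() for a single group.
    """
    input_ratio = len(group)
    store = 0
    # accumulate the group into one integer, first symbol weighted highest
    for j, symbol in enumerate(group):
        store += symbol * input_base ** (input_ratio - j - 1)
    output = []
    # peel off output symbols, again from the highest weight downwards
    for k in range(output_ratio):
        weight = output_base ** (output_ratio - k - 1)
        symbol = store // weight
        output.append(symbol)
        store -= symbol * weight
    return output


print(nearest_length_ceil(10, 7))                 # 14
print(encode_group([77, 97, 110], 256, 64, 4))    # [19, 22, 5, 46]
```

The second call uses the bytes of the ASCII string `Man` with a 3:4 ratio, which gives the familiar base64 indices for `TWFu`.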

setup.py

Lines changed: 1 addition & 1 deletion
@@ -30,7 +30,7 @@ def retrieve_deps(filepath):
 
 setup(
     name='basest',
-    version='0.6.1',
+    version='0.6.2',
     description=(
         'Converts symbols from any number base to any other number base'
     ),
