| 1 | +# Adding a Codec |
| 2 | + |
| 3 | +1. Categorize the codec; categories are the folder names in `src/codext` (further folder references are relative to this path). When a codec does not fit in any of these folders, put it in `others` by default. |
| 4 | + |
| 5 | +2. Add a `.py` file, named after the short name of the new codec, to the relevant category folder. |
| 6 | + |
| 7 | +3. Follow the typical structure of a codec's `.py` file as shown in the template below (enclosures in double braces, `{{...}}`, indicate codec parameters; enclosures in double angle brackets, `<<...>>`, indicate instructions that may refer to further steps of this guideline): |
| 8 | + |
| 9 | + ```python |
| 10 | + # -*- coding: UTF-8 -*- |
| 11 | + """{{codec_long_name}} Codec - {{codec_short_name}} content encoding. |
| 12 | + |
| 13 | + {{codec_description}} |
| 14 | + |
| 15 | + This codec: |
| 16 | + - en/decodes strings from str to str |
| 17 | + - en/decodes strings from bytes to bytes |
| 18 | + - decodes file content to str (read) |
| 19 | + - encodes file content from str to bytes (write) |
| 20 | + |
| 21 | + Reference: {{codec_source_hyperlink}} |
| 22 | + """ |
| 23 | + from ..__common__ import * |
| 24 | + |
| 25 | + |
| 26 | + __examples__ = {<<dictionary of examples with, as keys, a special format detailed hereafter and, as values, a dictionary mapping source to destination values (see 7.)>>} |
| 27 | + <<optional list of valid codec names to be used with the guessing mode (see 8.), in format "__guess__ = [...]">> |
| 28 | + |
| 29 | + |
| 30 | + <<constants here, including ENCMAP if the codec is a simple mapping (see 6.)>> |
| 31 | + <<functions here, if the codec requires some additional logic, i.e. when it is not a mapping (see 6.)>> |
| 32 | + |
| 33 | + |
| 34 | + <<put the right add function (see 4.) here with its relevant parameters (see 5.)>> |
| 35 | + ``` |
| 36 | + |
| 37 | +4. Choose the right add function |
| 38 | + |
| 39 | + If the codec is a simple mapping, use the `add_map` function. |
| 40 | + |
| 41 | + Examples: `languages/braille`, `languages/morse`, `languages/southpark` |
| 42 | + |
| 43 | + In some cases, an algorithm is equivalent to one or more mappings; `ENCMAP` can then be generated dynamically. |
| 44 | + |
| 45 | + Examples: `stegano/resistor`, `crypto/barbie` |
| 46 | + |
| 47 | + When the codec is more complex than a mapping, use the `add` function. |
| 48 | + |
| 49 | +5. Configure the add function |
| 50 | + |
| 51 | + Refer to the relevant function signature in `__common__.py`. |
| 52 | + |
| 53 | +6. Write the codec logic |
| 54 | + |
| 55 | + If the codec is a mapping, at least `ENCMAP` should be defined and passed as a parameter to the `add_map` function. |
| 56 | + |
| 57 | + Examples: `stegano/rick`, `stegano/klopf` |
| 58 | + |
| 59 | + If the codec is not a mapping, the logic can be written in the following order: the encoding function first, then the decoding function. |
| 60 | + |
| 61 | + Examples: `stegano/whitespace`, `crypto/railfence` |
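Note on step 6: the two styles of codec logic can be sketched in plain Python as follows. This is only an illustration under stated assumptions. It does not call `add_map` or `add`, whose parameters should be taken from `__common__.py`, and the names `DECMAP`, `map_encode`, `map_decode`, `reverse_encode` and `reverse_decode`, as well as the toy alphabet, are hypothetical.

```python
# Style 1: the codec is a simple mapping (hypothetical toy alphabet).
ENCMAP = {"a": "1", "b": "2", "c": "3"}
# The decoding map is the inverse of the encoding map.
DECMAP = {v: k for k, v in ENCMAP.items()}


def map_encode(text):
    # Characters missing from the mapping are replaced with "?".
    return "".join(ENCMAP.get(c, "?") for c in text)


def map_decode(text):
    # Same fallback for symbols that have no inverse.
    return "".join(DECMAP.get(c, "?") for c in text)


# Style 2: the codec needs logic that a static mapping cannot express.
# Encoding function first, then the decoding function.
def reverse_encode(text):
    # Toy algorithm: the output depends on character positions.
    return text[::-1]


def reverse_decode(text):
    # Reversing twice restores the original string.
    return text[::-1]
```

In the first style only the mapping carries information, which is why a mapping-based codec needs little more than `ENCMAP`; in the second style the encoding and decoding functions are written explicitly.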
| 62 | + |
| 63 | +7. Write some examples |
| 64 | + |
| 65 | + Examples are used for automated test generation. They should therefore be written carefully and cover some edge cases. A set of 3 to 8 examples is generally expected. |
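Note on step 7: a minimal sketch of how a small set of examples can cover edge cases. The toy codec, `TOY_MAP`, `toy_encode` and `TOY_EXAMPLES` are hypothetical, and the plain source-to-destination dictionary used here is an assumption for illustration; it is not the key format expected by `__examples__`.

```python
# Hypothetical toy codec used only to illustrate example selection.
TOY_MAP = {"a": "1", "b": "2", "c": "3"}


def toy_encode(text):
    # Unmapped characters become "?".
    return "".join(TOY_MAP.get(c, "?") for c in text)


# Source -> expected destination pairs, including edge cases.
TOY_EXAMPLES = {
    "": "",        # empty input
    "abc": "123",  # every mapped character
    "aaa": "111",  # repeated character
    "abz": "12?",  # character outside the mapping
}

# Collect the sources whose encoding does not match the expectation.
failures = [src for src, dst in TOY_EXAMPLES.items() if toy_encode(src) != dst]
```

An empty `failures` list means every example holds; the empty string and the unmapped character are the kind of edge cases that are easy to forget.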
| 66 | + |
| 67 | +8. Specify the names to be used with the guessing mode |
| 68 | + |
| 69 | + The `__guess__` list of codec names limits the possibilities explored by the tree search of the guessing mode. This matters especially when the codec is dynamic and may have a large (or even infinite) number of names: the list should then contain a limited number of them, at most 16 as a best practice. When relevant, this list should be defined with care. |