
Commit 48c7dc2

committed
initial commit
0 parents  commit 48c7dc2

4 files changed

Lines changed: 592 additions & 0 deletions

.gitignore

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
.vscode/*
encoder.exe
encoder.ilk
encoder.obj
encoder.pdb
vc140.pdb
open-vscode-workspace.bat

README.md

Lines changed: 52 additions & 0 deletions
@@ -0,0 +1,52 @@
# jRLE
## A Simple Run-Length Encoder
A simple run-length encoder written in C++ for an employment test.

## How To Download And Use
Download the latest release from [the releases tab](https://github.com/Trimatix/cpp-run-length-encoder/releases).

To use, invoke jRLE.exe from the command line with the function as the first argument and the file path as the second argument.
- Function must be either `-e` for encoding or `-d` for decoding.
- File must have the extension ".txt" and use ASCII encoding.
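
The argument rules above could be checked along the following lines. This is only a sketch, not the code shipped in jRLE.exe: the names `Mode`, `Args`, `endsWith` and `parseArgs` are hypothetical, the arguments are assumed to be passed without the program name, and the ASCII requirement is not checked. It needs C++17 for `std::optional`.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical names for illustration; not taken from the jRLE source.
enum class Mode { Encode, Decode };

struct Args {
    Mode mode;
    std::string path;
};

// Returns true if `s` ends with `suffix`.
static bool endsWith(const std::string& s, const std::string& suffix) {
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Expects exactly two arguments (program name excluded):
// the function flag, then the file path.
std::optional<Args> parseArgs(const std::vector<std::string>& args) {
    if (args.size() != 2) return std::nullopt;

    Mode mode;
    if (args[0] == "-e") {
        mode = Mode::Encode;
    } else if (args[0] == "-d") {
        mode = Mode::Decode;
    } else {
        return std::nullopt;  // unknown function flag
    }

    // Only the ".txt" extension is checked here, not the file's encoding.
    if (!endsWith(args[1], ".txt")) return std::nullopt;

    return Args{mode, args[1]};
}
```

Returning an empty `std::optional` for any invalid combination lets the caller print a single usage message instead of handling each failure separately.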

## Terminology
### Tokens
In this program, a "token" describes a run of one or more repetitions of a single character.
Note: Although unencoded and encoded tokens are generally distinguished, it is valid to encode an already encoded string. However, a string must be encoded at least once before it can be decoded.

+ An example of an unencoded token ("dToken"): aaa
+ An example of an encoded token ("eToken"): 3a

### Encoding Format
- For character sequences of length 9 or less, the encoding is the character count followed by the character.

E.g: "aaa" -> "3a"

- For character sequences of length more than 9, the encoding is prefixed by a '#' character.

E.g: "aaaaaaaaaa" -> "#10a"

- For sequences of digit characters, the encoding is postfixed by a '#' character.

E.g: "111" -> "31#"

- For sequences of '#' characters, the encoding is postfixed by a '#' character.

E.g: "###" -> "3##"
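
The four rules above can be sketched as a small encoder. This is an illustration under assumptions, not the implementation in this repository: `encodeRun` and `encode` are hypothetical names, and applying both the '#' prefix and the '#' postfix to a long run of digits or '#' characters is an assumed reading of how the rules combine.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Encodes one run of `count` copies of `c` (hypothetical helper name).
std::string encodeRun(char c, std::size_t count) {
    std::string token;
    if (count > 9) token += '#';  // long sequence: '#' prefix
    token += std::to_string(count);
    token += c;
    // Runs of digits or '#' characters get a '#' postfix.
    // Assumption: this also applies when the run is a long sequence.
    if (c == '#' || std::isdigit(static_cast<unsigned char>(c))) token += '#';
    return token;
}

// Splits `input` into runs of identical characters and encodes each run.
std::string encode(const std::string& input) {
    std::string out;
    std::size_t i = 0;
    while (i < input.size()) {
        std::size_t j = i;
        while (j < input.size() && input[j] == input[i]) ++j;  // find end of run
        out += encodeRun(input[i], j - i);
        i = j;
    }
    return out;
}
```

With this sketch, `encode("aaa")` yields `"3a"` and `encode("###")` yields `"3##"`, matching the examples above.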

### Long Sequences
A "long sequence" is a string containing 10 or more of a single character.

+ An example of an unencoded long sequence: aaaaaaaaaa
+ An example of an encoded long sequence: #10a

### #-Cases
1. "#-case a" refers to the special case where a token is a long sequence.
2. "#-case b" refers to the case where a token consists of '#' characters.
3. "#-case c" refers to the case where a token consists of digit characters.

#-case c is an issue because the token must be separated from the next one; otherwise it could be confused with the next token's character count.
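
As a rough illustration of the three cases, the sketch below reports which of them apply to a given run. `HashCases` and `classifyRun` are hypothetical names, not identifiers from the jRLE source. Going by the definitions above, the cases are not mutually exclusive: a run of twelve '1' characters is both a long sequence and a digit run.

```cpp
#include <cctype>
#include <cstddef>

// Hypothetical struct: which #-cases apply to a run of identical characters.
struct HashCases {
    bool a;  // #-case a: long sequence (10 or more characters)
    bool b;  // #-case b: run of '#' characters
    bool c;  // #-case c: run of digit characters
};

// Classifies a run of `count` copies of `ch`.
HashCases classifyRun(char ch, std::size_t count) {
    return HashCases{
        count >= 10,
        ch == '#',
        std::isdigit(static_cast<unsigned char>(ch)) != 0
    };
}
```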

Written by Jasper Law 2020

https://github.com/Trimatix/cpp-run-length-encoder
