
Commit 930cb2d

Add docs, fix minor issues found while writing docs
1 parent 0d97899 commit 930cb2d

5 files changed

Lines changed: 182 additions & 10 deletions


README.md

Lines changed: 170 additions & 2 deletions
@@ -1,2 +1,170 @@
1-
# PyXIE
2-
A lightweight tracking pixel service written in Python
1+
## PyXIE
2+
### About
3+
A lightweight [Tracking Pixel](https://en.wikipedia.org/wiki/Tracking_Pixel?wprov=srpw1_0) service written in Python.
4+
5+
## Installation
6+
<!-- ### A quick note about the Docker image and port numbers
7+
The Docker container runs PyXIE using the WSGI server Gunicorn. If you are running PyXIE as a Docker container:
8+
- Do:
9+
- Configure the environment variable `LISTEN_PORT` (default is `8000`)
10+
- You can also configure the environment variable `LISTEN_IP` but it is unlikely you will want to change this (default is `0.0.0.0`)
11+
- Not recommended:
12+
- Setting `LISTEN_PORT` in `config.yaml` (if you set it, leave it as the default value, `5000`)
13+
- Setting `LISTEN_IP` in `config.yaml` -->
14+
15+
### Quickstart using Docker
16+
#### Pull the image from Docker Hub
17+
```bash
18+
user@shell> docker pull devdull/pyxie:latest
19+
latest: Pulling from devdull/pyxie
20+
21+
> snip <
22+
23+
Status: Downloaded newer image for devdull/pyxie:latest
24+
docker.io/devdull/pyxie:latest
25+
```
26+
27+
#### Create a directory to store PyXIE's data
28+
```bash
29+
user@shell> mkdir data
30+
```
31+
32+
#### Create your configuration file
33+
When running PyXIE in a Docker container, set the `DATABASE_FILE` value in `config.yaml` to a path inside the mounted data directory so that data persists between container restarts. Below is a minimal example.
34+
35+
`config.yaml`:
36+
```yaml
37+
DATABASE_FILE: /app/data/uadb.json
38+
API_KEYS:
39+
- your-api-key-here
40+
- a-different-api-key-here
41+
- Another API key with spaces and a comma, but this might be hard to use later.
42+
```
43+
44+
#### Run the image, mounting the data path and configuration file:
45+
```bash
46+
user@shell> docker run -d --mount type=bind,src="./config.yaml",dst="/app/config.yaml" --mount type=bind,src="./data",dst="/app/data" -p 5000:5000 devdull/pyxie:latest
47+
```
48+
49+
#### Test the instance
50+
```bash
51+
user@shell> curl -X POST -H 'X-Api-Key: your-api-key-here' -d 'id=foo' 'http://localhost:5000/register'
52+
Success
53+
user@shell> ls -l data/ # Confirm the data file exists in the bound directory
54+
total 8
55+
-rw-r--r-- 1 user staff 2043 Jul 8 11:57 uadb.json
56+
```
57+
58+
#### Stuff the average user can ignore
59+
The service inside the container is run using Gunicorn. To configure the bind IP and port, set the environment variables `LISTEN_IP` and `LISTEN_PORT`. These should not be confused with the configuration items of the same names used by Flask, which can be defined in `config.yaml`.
60+
61+
### Manual install using Flask (or Gunicorn)
62+
#### Install the app requirements
63+
```bash
64+
user@shell> python3 -m venv .venv
65+
user@shell> source .venv/bin/activate
66+
user@shell> pip3 install -r requirements.txt
67+
```
68+
69+
You should now be able to start PyXIE using Flask with the command `python3 pyxie.py` (listens on `127.0.0.1:5000`) or using Gunicorn with `gunicorn pyxie:pyxie` (listens on Gunicorn's default bind address, `127.0.0.1:8000`).
70+
71+
## Usage
72+
### Configuration
73+
Below is a minimal configuration file that lists API keys. These keys should be long and difficult to guess.
74+
75+
`config.yaml`:
76+
```yaml
77+
API_KEYS:
78+
- your-api-key-here
79+
- a-different-api-key-here
80+
- Another API key with spaces and a comma, but this might be hard to use later.
81+
```
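Since the keys should be long and difficult to guess, one way to produce them is with Python's standard `secrets` module. This is only a sketch and not part of PyXIE: the helper name `generate_api_key` and the 32-byte default are assumptions chosen for illustration.

```python
import secrets


def generate_api_key(num_bytes: int = 32) -> str:
    """Return a URL-safe random token built from num_bytes of randomness.

    Hypothetical helper for illustration; PyXIE does not ship it.
    """
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    # Print a snippet that can be pasted into config.yaml.
    # The key is quoted so YAML always reads it as a string.
    print("API_KEYS:")
    print(f'  - "{generate_api_key()}"')
```

Keys produced this way contain only letters, digits, `-`, and `_`, so they can be passed in an HTTP header without escaping.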
82+
83+
Below is a complete list of user configurable settings:
84+
|Configuration item|Default value|Details|
85+
|---|---|---|
86+
|`LISTEN_IP`|`127.0.0.1`|The IP address to listen on when running with Flask (omit for Docker, Gunicorn)|
87+
|`LISTEN_PORT`|`5000`|The port number to listen on when running with Flask (omit for Docker, Gunicorn)|
88+
|`API_KEYS`|`[]` (empty list)|A list of API keys that should be considered valid by PyXIE|
89+
|`LOG_LEVEL`|`WARNING`|The logging level. Valid values are `CRITICAL`, `ERROR`, `WARNING`, `INFO`, and `DEBUG`|
90+
|`DATABASE_FILE`|`uadb.json`|The file that stores all pixel tracking data|
91+
|`RRD_MAX_SIZE`|`10000`|Planned to be deprecated! The maximum number of records to keep for each `id`|
92+
93+
### Register a new `id`
94+
The purpose of an `id` is to let you differentiate between the various places a tracking pixel has been embedded. For example, you would want one `id` for tracking whether a user opened an email and a different `id` for a pixel embedded in a specific webpage.
95+
96+
Make a `POST` request to the `/register` endpoint with your new `id` as a form parameter, and pass an API key from your configuration as the value of the `X-Api-Key` header.
97+
98+
Here is an example that registers an `id` of `testing` for the service when it is running locally:
99+
```bash
100+
user@shell> curl -Ss -X POST -H 'X-Api-Key: your-api-key-here' -d 'id=testing' 'http://127.0.0.1:5000/register'
101+
Success
102+
```
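The same request as the `curl` call above can be made from Python using only the standard library. This is a minimal sketch: the function names `build_register_request` and `register` are hypothetical, and the base URL and API key are the placeholder values used elsewhere in this README.

```python
import urllib.parse
import urllib.request


def build_register_request(base_url: str, api_key: str, new_id: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for the /register endpoint."""
    # Form-encode the id, matching curl's -d 'id=...'.
    body = urllib.parse.urlencode({"id": new_id}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/register",
        data=body,
        headers={"X-Api-Key": api_key},
        method="POST",
    )


def register(base_url: str, api_key: str, new_id: str, timeout: float = 5.0) -> str:
    """Send the request and return the response body as text.

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    """
    req = build_register_request(base_url, api_key, new_id)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

With a local instance running, `register("http://127.0.0.1:5000", "your-api-key-here", "testing")` would send the request. Building the request in a separate function makes it easy to inspect the URL, headers, and body without a server.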
103+
104+
If no `Success` message appears, nothing was registered. Double check your API key, your URL, and your port number.
105+
106+
Using your registered `id` as a `GET` parameter, you should now be able to navigate to the tracking pixel in your browser. For the `id` of `testing` like in the above call, the URL to the tracking pixel would be `http://127.0.0.1:5000/?id=testing`. Any unregistered IDs will result in a "Not Found" message and a `404` status code.
107+
108+
### Embed your tracking pixel
109+
How you embed your pixel will depend on the document format, but here's an example for an HTML page:
110+
```html
111+
<img src="http://127.0.0.1:5000/?id=testing" width="1" height="1" />
112+
```
113+
114+
Because the image is a transparent PNG a single pixel in size, it is unlikely to interfere significantly with the formatting of any website, but placing it at the bottom of a page should minimize any potential formatting issues. Specifying the width and height (as in the example, or using CSS) should reduce the likelihood of a visible broken image icon on your page should PyXIE go offline or the `id` be unregistered.
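If you generate pages or emails from Python, the tag can be built programmatically so that the `id` is always URL-encoded. The helper below is a sketch, not part of PyXIE: `pixel_img_tag` is a hypothetical name, and the empty `alt` attribute is an addition of this example (it marks the image as decorative), not something the HTML example above includes.

```python
import html
import urllib.parse


def pixel_img_tag(base_url: str, pixel_id: str) -> str:
    """Return an <img> tag pointing at the tracking pixel for pixel_id."""
    # URL-encode the id so spaces and '&' cannot break the query string.
    query = urllib.parse.urlencode({"id": pixel_id})
    src = f"{base_url.rstrip('/')}/?{query}"
    # Escape the URL for use inside a double-quoted HTML attribute.
    return f'<img src="{html.escape(src, quote=True)}" width="1" height="1" alt="" />'


print(pixel_img_tag("http://127.0.0.1:5000", "testing"))
# <img src="http://127.0.0.1:5000/?id=testing" width="1" height="1" alt="" />
```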
115+
116+
### View or collect stats
117+
Statistics are only viewable to individuals who have a valid API key, and can be accessed using the `/stats` endpoint.
118+
119+
For example:
120+
```bash
121+
user@shell> curl -Ss -H 'X-Api-Key: your-api-key-here' 'http://127.0.0.1:5000/stats' | jq
122+
{
123+
"browser_family_counts": {
124+
"foo": {
125+
"192.168.1.99": {
126+
"Firefox": 1,
127+
"curl": 1
128+
}
129+
},
130+
"testing": {
131+
"127.0.0.1": {
132+
"Firefox": 3
133+
}
134+
}
135+
},
136+
"os_family_counts": {
137+
"foo": {
138+
"192.168.1.99": {
139+
"Mac OS X": 1,
140+
"Unknown": 1
141+
}
142+
},
143+
"testing": {
144+
"127.0.0.1": {
145+
"Mac OS X": 3
146+
}
147+
}
148+
},
149+
"referrer_counts": {
150+
"foo": {
151+
"192.168.1.99": {
152+
"Unknown": 2
153+
}
154+
},
155+
"testing": {
156+
"127.0.0.1": {
157+
"Unknown": 3
158+
}
159+
}
160+
}
161+
}
162+
```
163+
164+
The data is structured in the following format (examples are taken from the first block of the output above):
165+
- Name of the data (e.g. `browser_family_counts`)
166+
- An `id` you registered (e.g. `foo`)
167+
- The IP address of the individual who viewed the tracking pixel (e.g. `192.168.1.99`)
168+
- The value of the viewer data and the number of times that value has been seen (`Firefox` has been seen `1` time and `curl` has been seen `1` time)
169+
170+
To put all of that together: one or more users at the IP address `192.168.1.99` saw a tracking pixel with an `id` of `foo`, once with a "browser family" of `Firefox` and once with `curl`.
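The nesting described above (data name, then `id`, then IP address, then value counts) can be walked with a few lines of Python. The sketch below sums the counts across IP addresses for each `id`; the function name `totals_per_id` is hypothetical, and the sample data is copied from the first block of the `/stats` output above.

```python
from collections import Counter


def totals_per_id(stats: dict, metric: str) -> dict:
    """Sum the counts for one metric across all IP addresses, per id."""
    totals = {}
    for pixel_id, by_ip in stats.get(metric, {}).items():
        counter = Counter()
        for counts in by_ip.values():
            counter.update(counts)  # adds counts for matching keys
        totals[pixel_id] = dict(counter)
    return totals


sample = {
    "browser_family_counts": {
        "foo": {"192.168.1.99": {"Firefox": 1, "curl": 1}},
        "testing": {"127.0.0.1": {"Firefox": 3}},
    }
}
print(totals_per_id(sample, "browser_family_counts"))
# {'foo': {'Firefox': 1, 'curl': 1}, 'testing': {'Firefox': 3}}
```

The same function works for `os_family_counts` and `referrer_counts`, since all three share this shape in the example output.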

constfig.py

Lines changed: 2 additions & 0 deletions
@@ -13,6 +13,8 @@ def __init__(self):
1313
self.LISTEN_PORT = 5000
1414
self.API_KEYS = []
1515
self.LOG_LEVEL = "WARNING"
16+
self.DATABASE_FILE = "uadb.json"
17+
self.RRD_MAX_SIZE = 10000 # Maximum number of records in the database
1618

1719
# Load user config (override defaults above)
1820
self.load_config()

ddb.py

Lines changed: 3 additions & 3 deletions
@@ -150,7 +150,7 @@ def _get_id(self):
150150
return request.args.get("id")
151151

152152
def register(self):
153-
id = self._get_id()
153+
id = request.form.get("id")
154154
if id in self:
155155
raise KeyError(f"ID {id} already registered")
156156
super().__setitem__(id, _DDB(max_size=self._max_size))
@@ -181,13 +181,13 @@ def _cleanup(self):
181181
for v in self.values():
182182
v._cleanup()
183183

184-
def dump(self, filename="uadb.json"):
184+
def dump(self, filename=C.DATABASE_FILE):
185185
with open(filename, "w") as fout:
186186
json.dump(self, fout, indent=2)
187187
fout.flush()
188188
fout.truncate()
189189

190-
def load(self, filename="uadb.json"):
190+
def load(self, filename=C.DATABASE_FILE):
191191
try:
192192
with open(filename, "r") as fin:
193193
data = json.load(fin)

pyxie.py

Lines changed: 6 additions & 4 deletions
@@ -10,9 +10,7 @@
1010

1111
def _validate_api_key():
1212
api_key = request.headers.get(C.HTTP_HEADER_X_API_KEY)
13-
if api_key in C.API_KEYS:
14-
return True
15-
return False
13+
return api_key in C.API_KEYS
1614

1715

1816
@pyxie.route("/register", methods=[C.HTTP_METHOD_POST])
@@ -52,7 +50,11 @@ def metrics():
5250

5351
@pyxie.route("/", methods=[C.HTTP_METHOD_GET])
5452
def root():
55-
_data()
53+
try:
54+
_data()
55+
except KeyError as e:
56+
return "Not Found", 404
57+
5658
return Response(C.ONE_BY_ONE, mimetype=C.HTTP_MIME_TYPE_PNG)
5759

5860

run.sh

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ if [ -z "$LISTEN_IP" ]; then
55
fi
66

77
if [ -z "$LISTEN_PORT" ]; then
8-
export LISTEN_PORT=8000
8+
export LISTEN_PORT=5000 # Set to 5000 to match Flask's default and avoid confusion in the docs
99
fi
1010

1111
gunicorn --bind $LISTEN_IP:$LISTEN_PORT pyxie:pyxie
