|
1 | | -# PyXIE |
2 | | -A lightweight tracking pixel service written in Python |
| 1 | +## PyXIE |
| 2 | +### About |
| 3 | +A lightweight [Tracking Pixel](https://en.wikipedia.org/wiki/Tracking_Pixel?wprov=srpw1_0) service written in Python. |
| 4 | + |
| 5 | +## Installation |
| 6 | +<!-- ### A quick note about the Docker image and port numbers |
| 7 | +The Docker container runs PyXIE using the WSGI tool Gunicorn. If you are running PyXIE as a Docker container: |
| 8 | +- Do: |
| 9 | + - Configure the environment variable `LISTEN_PORT` (default is `8000`) |
| 10 | + - You can also configure the environment variable `LISTEN_IP` but it is unlikely you will want to change this (default is `0.0.0.0`) |
| 11 | +- Not recommended: |
| 12 | + - Setting `LISTEN_PORT` in `config.yaml` (if you set it, leave it as the default value, `5000`) |
| 13 | + - Setting `LISTEN_IP` in `config.yaml` --> |
| 14 | + |
| 15 | +### Quickstart using Docker |
| 16 | +#### Pull the image from Docker Hub |
| 17 | +```bash |
| 18 | +user@shell> docker pull devdull/pyxie:latest |
| 19 | +latest: Pulling from devdull/pyxie |
| 20 | + |
| 21 | +> snip < |
| 22 | + |
| 23 | +Status: Downloaded newer image for devdull/pyxie:latest |
| 24 | +docker.io/devdull/pyxie:latest |
| 25 | +``` |
| 26 | + |
| 27 | +#### Create a directory to store PyXIE's data |
| 28 | +```bash |
| 29 | +user@shell> mkdir data |
| 30 | +``` |
| 31 | + |
| 32 | +#### Create your configuration file |
| 33 | +When running PyXIE in a Docker container, set the `DATABASE_FILE` value in `config.yaml` to a path inside the mounted data directory so that data persists between container restarts. Below is a minimal example. |
| 34 | + |
| 35 | +`config.yaml`: |
| 36 | +```yaml |
| 37 | +DATABASE_FILE: /app/data/uadb.json |
| 38 | +API_KEYS: |
| 39 | + - your-api-key-here |
| 40 | + - a-different-api-key-here |
| 41 | + - Another API key with spaces and a comma, but this might be hard to use later. |
| 42 | +``` |
| 43 | + |
| 44 | +#### Run the image, mounting the data path and configuration file |
| 45 | +```bash |
| 46 | +user@shell> docker run -d --mount type=bind,src="./config.yaml",dst="/app/config.yaml" --mount type=bind,src="./data",dst="/app/data" -p 5000:5000 devdull/pyxie:latest |
| 47 | +``` |
| 48 | + |
| 49 | +#### Test the instance |
| 50 | +```bash |
| 51 | +user@shell> curl -X POST -H 'X-Api-Key: your-api-key-here' -d 'id=foo' 'http://localhost:5000/register' |
| 52 | +Success |
| 53 | +user@shell> ls -l data/ # Confirm the data file exists in the bound directory |
| 54 | +total 8 |
| 55 | +-rw-r--r-- 1 user staff 2043 Jul 8 11:57 uadb.json |
| 56 | +``` |
| 57 | + |
| 58 | +#### Stuff the average user can ignore |
| 59 | +The service inside the container runs under Gunicorn. To configure the bind IP and port, set the environment variables `LISTEN_IP` and `LISTEN_PORT`. These should not be confused with the configuration items of the same names used by Flask, which are defined in `config.yaml`. |
| 60 | + |
| 61 | +### Manual install using Flask (or Gunicorn) |
| 62 | +#### Install the app requirements |
| 63 | +```bash |
| 64 | +user@shell> python3 -m venv .venv |
| 65 | +user@shell> source .venv/bin/activate |
| 66 | +user@shell> pip3 install -r requirements.txt |
| 67 | +``` |
| 68 | + |
| 69 | +You should now be able to start PyXIE using Flask with the command `python3 pyxie.py` (listens on `127.0.0.1:5000`) or using Gunicorn with `gunicorn pyxie:pyxie` (listens on `0.0.0.0:8000`). |
| 70 | + |
| 71 | +## Usage |
| 72 | +### Configuration |
| 73 | +Below is a minimal configuration file that lists API keys. These keys should be long and difficult to guess. |
| 74 | + |
| 75 | +`config.yaml`: |
| 76 | +```yaml |
| 77 | +API_KEYS: |
| 78 | + - your-api-key-here |
| 79 | + - a-different-api-key-here |
| 80 | + - Another API key with spaces and a comma, but this might be hard to use later. |
| 81 | +``` |
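
One way to produce keys that are long and difficult to guess is Python's standard `secrets` module. The sketch below is only a suggestion: the helper name `generate_api_key` and the 32-byte length are assumptions, not part of PyXIE.

```python
import secrets


def generate_api_key(num_bytes: int = 32) -> str:
    """Return a random URL-safe token to paste into API_KEYS.

    Hypothetical helper: PyXIE does not ship this function.
    32 random bytes encode to a 43-character string.
    """
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    # Print one key; copy it into the API_KEYS list in config.yaml.
    print(generate_api_key())
```

Tokens generated this way contain only letters, digits, `-`, and `_`, so they need no quoting in a shell or an HTTP header.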
| 82 | + |
| 83 | +Below is a complete list of user configurable settings: |
| 84 | +|Configuration item|Default value|Details| |
| 85 | +|---|---|---| |
| 86 | +|`LISTEN_IP`|`127.0.0.1`|The IP address to listen on when running with Flask (omit for Docker, Gunicorn)| |
| 87 | +|`LISTEN_PORT`|`5000`|The port number to listen on when running with Flask (omit for Docker, Gunicorn)| |
| 88 | +|`API_KEYS`|`[]` (empty list)|A list of API keys that should be considered valid by PyXIE| |
| 89 | +|`LOG_LEVEL`|`WARNING`|The logging level. Valid values are `CRITICAL`, `ERROR`, `WARNING`, `INFO`, and `DEBUG`| |
| 90 | +|`DATABASE_FILE`|`uadb.json`|The file that stores all pixel tracking data| |
| 91 | +|`RRD_MAX_SIZE`|`10000`|Planned to be deprecated! The maximum number of records to keep for each `id`| |
| 92 | + |
| 93 | +### Register a new `id` |
| 94 | +An `id` lets you differentiate between the various places a tracking pixel has been embedded. For example, you would want one `id` for a pixel embedded in an email and a different `id` for a pixel embedded in a specific webpage. |
| 95 | + |
| 96 | +Make a `POST` request to the `/register` endpoint with your new `id` as a parameter, and pass an API key from your configuration as the value of an `X-Api-Key` header. |
| 97 | + |
| 98 | +Here is an example that registers an `id` of `testing` for the service when it is running locally: |
| 99 | +```bash |
| 100 | +user@shell> curl -Ss -X POST -H 'X-Api-Key: your-api-key-here' -d 'id=testing' 'http://127.0.0.1:5000/register' |
| 101 | +Success |
| 102 | +``` |
| 103 | + |
| 104 | +If no `Success` message appears, nothing was registered. Double check your API key, your URL, and your port number. |
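
The same request can be built from Python with only the standard library. This is a sketch mirroring the `curl` call above, not an official client; the function name `build_register_request` is an assumption made for illustration.

```python
import urllib.parse
import urllib.request


def build_register_request(
    base_url: str, api_key: str, pixel_id: str
) -> urllib.request.Request:
    """Build (but do not send) the POST request that registers an id.

    Hypothetical helper: mirrors the curl example, where the id is sent
    as form data and the API key as an X-Api-Key header.
    """
    form_data = urllib.parse.urlencode({"id": pixel_id}).encode("ascii")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/register",
        data=form_data,
        headers={"X-Api-Key": api_key},
        method="POST",
    )


# Build the request for a locally running instance.
request = build_register_request(
    "http://127.0.0.1:5000", "your-api-key-here", "testing"
)
```

Passing `request` to `urllib.request.urlopen` sends it. An error status raises `urllib.error.HTTPError`, which is worth catching when checking the API key, URL, and port.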
| 105 | + |
| 106 | +Using your registered `id` as a `GET` parameter, you should now be able to navigate to the tracking pixel in your browser. For the `id` `testing` registered above, the URL of the tracking pixel is `http://127.0.0.1:5000/?id=testing`. A request for an unregistered `id` returns a "Not Found" message and a `404` status code. |
| 107 | + |
| 108 | +### Embed your tracking pixel |
| 109 | +How you embed your pixel will depend on the document format, but here's an example for an HTML page: |
| 110 | +```html |
| 111 | +<img src="http://127.0.0.1:5000/?id=testing" width="1" height="1" /> |
| 112 | +``` |
| 113 | + |
| 114 | +Because the image is a transparent PNG that is a single pixel in size, it is unlikely to interfere noticeably with the formatting of any website, but placing it at the bottom of a page should minimize any potential formatting issues. Specifying the width and height (as in the example, or using CSS) should reduce the likelihood of a broken image icon appearing on your page if PyXIE goes offline or the `id` is unregistered. |
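
If you generate pages or emails from Python, the tag can be built programmatically so that each `id` is URL-encoded correctly. The helper below is a sketch: `pixel_img_tag` is a hypothetical name, and it simply reproduces the markup from the HTML example above.

```python
from html import escape
from urllib.parse import urlencode


def pixel_img_tag(base_url: str, pixel_id: str) -> str:
    """Return an <img> tag pointing at the tracking pixel for pixel_id.

    Hypothetical helper: the id is URL-encoded into the query string and
    the resulting URL is HTML-escaped before being placed in the attribute.
    """
    src = f"{base_url.rstrip('/')}/?{urlencode({'id': pixel_id})}"
    return f'<img src="{escape(src, quote=True)}" width="1" height="1" />'


if __name__ == "__main__":
    print(pixel_img_tag("http://127.0.0.1:5000", "testing"))
```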
| 115 | + |
| 116 | +### View or collect stats |
| 117 | +Statistics are only viewable to individuals who have a valid API key, and can be accessed using the `/stats` endpoint. |
| 118 | + |
| 119 | +For example: |
| 120 | +```bash |
| 121 | +user@shell> curl -Ss -H 'X-Api-Key: your-api-key-here' 'http://127.0.0.1:5000/stats' | jq |
| 122 | +{ |
| 123 | + "browser_family_counts": { |
| 124 | + "foo": { |
| 125 | + "192.168.1.99": { |
| 126 | + "Firefox": 1, |
| 127 | + "curl": 1 |
| 128 | + } |
| 129 | + }, |
| 130 | + "testing": { |
| 131 | + "127.0.0.1": { |
| 132 | + "Firefox": 3 |
| 133 | + } |
| 134 | + } |
| 135 | + }, |
| 136 | + "os_family_counts": { |
| 137 | + "foo": { |
| 138 | + "192.168.1.99": { |
| 139 | + "Mac OS X": 1, |
| 140 | + "Unknown": 1 |
| 141 | + } |
| 142 | + }, |
| 143 | + "testing": { |
| 144 | + "127.0.0.1": { |
| 145 | + "Mac OS X": 3 |
| 146 | + } |
| 147 | + } |
| 148 | + }, |
| 149 | + "referrer_counts": { |
| 150 | + "foo": { |
| 151 | + "192.168.1.99": { |
| 152 | + "Unknown": 2 |
| 153 | + } |
| 154 | + }, |
| 155 | + "testing": { |
| 156 | + "127.0.0.1": { |
| 157 | + "Unknown": 3 |
| 158 | + } |
| 159 | + } |
| 160 | + } |
| 161 | +} |
| 162 | +``` |
| 163 | + |
| 164 | +The data is structured in the following format (examples are taken from the first block of the output above): |
| 165 | +- Name of the data (e.g. `browser_family_counts`) |
| 166 | + - An `id` you registered (e.g. `foo`) |
| 167 | + - The IP address of the individual who viewed the tracking pixel (e.g. `192.168.1.99`) |
| 168 | + - The value of the viewer data and the number of times that value has been seen (`Firefox` has been seen `1` time and `curl` has been seen `1` time) |
| 169 | + |
| 170 | +To put all of that together: one or more users at the IP address `192.168.1.99` saw the tracking pixel with an `id` of `foo` twice, once with a "browser family" of `Firefox` and once with `curl`. |
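
Because the nesting is the same in every section (data name, then `id`, then IP address, then value and count), a short script can summarize it. The sketch below assumes the JSON layout shown in the example output; `totals_per_id` is a hypothetical helper, not part of PyXIE.

```python
from collections import Counter


def totals_per_id(section: dict) -> dict:
    """Sum the counts across all IP addresses for each id in one section.

    `section` is one top-level entry of the /stats response,
    e.g. the value of "browser_family_counts".
    """
    totals = {}
    for pixel_id, counts_by_ip in section.items():
        combined = Counter()
        for counts in counts_by_ip.values():
            combined.update(counts)  # adds the per-IP counts together
        totals[pixel_id] = dict(combined)
    return totals


# Sample data copied from the example /stats output.
stats = {
    "browser_family_counts": {
        "foo": {"192.168.1.99": {"Firefox": 1, "curl": 1}},
        "testing": {"127.0.0.1": {"Firefox": 3}},
    }
}

if __name__ == "__main__":
    print(totals_per_id(stats["browser_family_counts"]))
```

The same function works unchanged on `os_family_counts` and `referrer_counts`, since all three sections share this shape.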