An overview of all the components in this repo, and related components needed to run this service.
- `downloader`: Takes `download` items from the queue, downloads an issue's details from the GitHub API, stores them on disk, and then adds an `index` item to the queue.
- `indexer`: Takes `index` items from the queue and imports JSON files from disk into ElasticSearch.
- `elasticsearch`: Indexed storage that contains all issue information and provides APIs to query and consume the data.
- `kibana`: Web UI for accessing ElasticSearch data, allowing quick and easy querying and building data visualizations.
- `queue-cli`: Small helper tool that allows manual modifications to the queue, like manually queuing issues for download, or checking the current queue state.
- `redis`: Small key-value storage, responsible for holding queue items.
- `reverse-proxy`: nginx image with a pre-defined config, responsible for making sure that the right requests end up at the right components. For example, serving Kibana at `/kibana/`, or exposing the `webhook-receiver` on `/webhooks/`.
- `webhook-receiver`: HTTP server that receives GitHub webhook notifications. When a webhook is received, the receiver adds a `download` queue item.
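
The flow between `webhook-receiver`, `downloader`, and `indexer` can be sketched roughly as follows. This is a minimal illustration, not the code in this repo: the queue is an in-memory `deque` standing in for Redis, the index is a plain dict standing in for ElasticSearch, and all function names, the item layout, and `fetch_issue` (a placeholder for the GitHub API call) are assumptions made for this example.

```python
import json
import tempfile
from collections import deque
from pathlib import Path

# In-memory stand-in for the Redis-backed queue (assumption for illustration).
queue = deque()


def receive_webhook(issue_number):
    """webhook-receiver: turn a webhook notification into a download item."""
    queue.append({"type": "download", "issue": issue_number})


def fetch_issue(issue_number):
    """Hypothetical placeholder for the GitHub API call; returns made-up data."""
    return {"number": issue_number, "title": f"Example issue {issue_number}"}


def run_downloader(item, data_dir):
    """downloader: store the issue details on disk, then queue an index item."""
    path = data_dir / f"{item['issue']}.json"
    path.write_text(json.dumps(fetch_issue(item["issue"])))
    queue.append({"type": "index", "issue": item["issue"]})


def run_indexer(item, data_dir, index):
    """indexer: import the stored JSON file into the stand-in index."""
    path = data_dir / f"{item['issue']}.json"
    index[item["issue"]] = json.loads(path.read_text())


def drain(data_dir, index):
    """Process queue items in order until the queue is empty."""
    while queue:
        item = queue.popleft()
        if item["type"] == "download":
            run_downloader(item, data_dir)
        elif item["type"] == "index":
            run_indexer(item, data_dir, index)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        index = {}
        receive_webhook(42)
        drain(Path(tmp), index)
        print(sorted(index))
```

The point of the sketch is the hand-off: the downloader never writes to the index itself, it only leaves a file on disk and an `index` item on the queue, so the two steps can run and fail independently.
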

Related components you need to provide yourself:

- External reverse proxy/load balancer: The docker-compose stack does not listen on public ports; it only listens on `127.0.0.1:8080` with an HTTP-only connection. You are responsible for making sure the application is available from the internet. An example configuration for nginx can be found in `/_docs/additional-files/external-nginx.conf`; adapt it as needed.
- Cronjobs/systemd timers: This repository contains two scripts in `/src/scripts/`: `make-snapshot` and `remove-old-snapshots`. These scripts are responsible for creating daily snapshots of web-bug data, and for deleting snapshots older than one week. It is your responsibility to make sure those scripts are called once per day.
- Backups: You absolutely should back up the data generated by this app to external backup storage. `/_docs/backups.md` contains additional details and hints.
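
To make the one-week retention rule for snapshots concrete, here is a minimal sketch of how the snapshots to delete could be selected. It is not the logic of `remove-old-snapshots`, which may work quite differently: the function name, the snapshot names, and the assumption that each snapshot has a known creation time are all hypothetical.

```python
from datetime import datetime, timedelta

# "Older than one week", as described above.
RETENTION = timedelta(days=7)


def snapshots_to_remove(snapshots, now):
    """Return the names of snapshots created more than RETENTION before `now`.

    `snapshots` maps a (hypothetical) snapshot name to its creation time.
    A snapshot that is exactly one week old is kept.
    """
    cutoff = now - RETENTION
    return sorted(name for name, created in snapshots.items() if created < cutoff)


if __name__ == "__main__":
    now = datetime(2024, 1, 10, 3, 0)
    snapshots = {
        "snapshot-2024-01-01": datetime(2024, 1, 1, 3, 0),
        "snapshot-2024-01-03": datetime(2024, 1, 3, 3, 0),
        "snapshot-2024-01-09": datetime(2024, 1, 9, 3, 0),
    }
    print(snapshots_to_remove(snapshots, now))  # ['snapshot-2024-01-01']
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the selection easy to test regardless of when the daily job actually runs.
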