Production observability is centered on Sentry plus the existing uptime and host health monitors.
Set these variables in /home/ciembor/polish-open-source-rank/.env.local:
- SENTRY_DSN
- SENTRY_ENVIRONMENT=production
- SENTRY_RELEASE=<git-sha-or-release-id>
- SENTRY_TRACES_SAMPLE_RATE=0.05
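A minimal sketch of how these variables might be read and validated at boot, assuming they reach the process through its environment. The helper name sentry_settings and the fallback values are assumptions for illustration, not part of the project.

```ruby
# Hypothetical helper: collects the Sentry variables listed above from an
# environment-like object (ENV or a plain Hash) and validates the sample rate.
def sentry_settings(env)
  # Float() raises ArgumentError on non-numeric input such as "abc".
  rate = Float(env.fetch("SENTRY_TRACES_SAMPLE_RATE", "0.0"))
  unless (0.0..1.0).cover?(rate)
    raise ArgumentError, "SENTRY_TRACES_SAMPLE_RATE must be between 0 and 1"
  end

  {
    dsn: env.fetch("SENTRY_DSN"),                                # required; KeyError if missing
    environment: env.fetch("SENTRY_ENVIRONMENT", "production"),  # fallback is an assumption
    release: env["SENTRY_RELEASE"],                              # nil when no release id is set
    traces_sample_rate: rate
  }
end
```

Calling sentry_settings(ENV) fails fast when SENTRY_DSN is missing or the sample rate is out of range, which surfaces a bad .env.local before the first request.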
Configure Sentry alerts for:
- new or regressed exceptions,
- HTTP 5xx growth,
- p95 transaction latency growth,
- failed or missed monthly-rankings check-ins,
- failed or missed package-rankings check-ins,
- custom events tagged with monitor=production-alert.
The host alert timer also reads these optional thresholds from
/home/ciembor/polish-open-source-rank/.env.local:
- PRODUCTION_ALERT_JOB_STALE_MINUTES=30
- PRODUCTION_ALERT_LOG_WINDOW_MINUTES=10
- PRODUCTION_ALERT_HTTP_5XX_THRESHOLD=5
- PRODUCTION_ALERT_HTTP_MIN_REQUESTS=20
- PRODUCTION_ALERT_P95_LATENCY_MS_THRESHOLD=1000
- PRODUCTION_ALERT_SQLITE_RETRY_THRESHOLD=10
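A sketch of how a timer script could load these optional thresholds. Treating the values shown above as fallbacks is an assumption (the text only says the variables are optional), and alert_thresholds is a hypothetical helper name.

```ruby
# Values shown in this runbook, used here as fallbacks (an assumption).
ALERT_DEFAULTS = {
  "PRODUCTION_ALERT_JOB_STALE_MINUTES" => 30,
  "PRODUCTION_ALERT_LOG_WINDOW_MINUTES" => 10,
  "PRODUCTION_ALERT_HTTP_5XX_THRESHOLD" => 5,
  "PRODUCTION_ALERT_HTTP_MIN_REQUESTS" => 20,
  "PRODUCTION_ALERT_P95_LATENCY_MS_THRESHOLD" => 1000,
  "PRODUCTION_ALERT_SQLITE_RETRY_THRESHOLD" => 10
}.freeze

# Hypothetical helper: returns integer thresholds, preferring values from env.
# Integer(..., 10) rejects non-decimal input instead of silently returning 0.
def alert_thresholds(env)
  ALERT_DEFAULTS.to_h do |name, fallback|
    [name, Integer(env.fetch(name, fallback.to_s), 10)]
  end
end
```

With this shape, alert_thresholds(ENV) returns all six thresholds even when .env.local sets only some of them.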
For Cloudflare cache purges, set these variables in /home/ciembor/polish-open-source-rank/.env.local:
- CLOUDFLARE_ZONE_ID
- CLOUDFLARE_API_TOKEN
The token must be scoped to the polish-open-source.pl zone with
Zone -> Cache Purge -> Purge. Keep it out of commits and rotate it if it is
ever pasted into chat, logs, or a shell history that other people can read.
Verify a token without printing it:

    curl -fsS -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
      https://api.cloudflare.com/client/v4/user/tokens/verify

Verify purge permission with a narrow URL purge before relying on automatic monthly purges:

    curl -fsS -X POST "https://api.cloudflare.com/client/v4/zones/$CLOUDFLARE_ZONE_ID/purge_cache" \
      -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
      -H "Content-Type: application/json" \
      --data '{"files":["https://polish-open-source.pl/healthz"]}'

bin/publish_snapshot uses purge_everything after successful monthly publish
and rollback because those actions update public pages and badges across many
routes. If a public badge or page is stale, first compare Cloudflare and origin:

    curl -I https://polish-open-source.pl/badges/repositories/github/ciembor/agent-rules-books.svg
    ssh ciembor@maciej-ciemborowicz.eu \
      'curl -sS -I http://127.0.0.1:9293/badges/repositories/github/ciembor/agent-rules-books.svg'

If origin is correct and Cloudflare is stale, purge Cloudflare. If origin is wrong, inspect publication data before purging CDN cache.
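When a purge is warranted, the narrow URL purge from earlier in this section could also be issued from Ruby. This is a minimal sketch using only the standard library. The helper names build_purge_request and purge_urls are hypothetical and are not claimed to match what bin/publish_snapshot does internally.

```ruby
require "json"
require "net/http"
require "uri"

# Builds (but does not send) a Cloudflare purge request for specific URLs.
# Mirrors the curl example: POST, bearer token, JSON body with a "files" array.
def build_purge_request(zone_id, token, urls)
  uri = URI("https://api.cloudflare.com/client/v4/zones/#{zone_id}/purge_cache")
  request = Net::HTTP::Post.new(uri)
  request["Authorization"] = "Bearer #{token}"
  request["Content-Type"] = "application/json"
  request.body = JSON.generate(files: urls)
  request
end

# Sends the request; returns true when the API answers with a 2xx status.
def purge_urls(zone_id, token, urls)
  request = build_purge_request(zone_id, token, urls)
  uri = request.uri
  response = Net::HTTP.start(uri.hostname, uri.port, use_ssl: true) do |http|
    http.request(request)
  end
  response.is_a?(Net::HTTPSuccess)
end
```

Keeping request construction separate from sending makes it possible to inspect the exact URL list before anything is purged, and the token is never printed.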
SESSION_SECRET in /home/ciembor/polish-open-source-rank/.env.local must be
at least 64 characters. Generate a value with:

    ruby -rsecurerandom -e 'puts SecureRandom.hex(32)'

/internal/* is protected by application Basic Auth. Set these variables in
/home/ciembor/polish-open-source-rank/.env.local:
- INTERNAL_BASIC_AUTH_USERNAME
- INTERNAL_BASIC_AUTH_PASSWORD
INTERNAL_BASIC_AUTH_PASSWORD must be at least 32 characters. Generate a value
with:

    ruby -rsecurerandom -e 'puts SecureRandom.hex(24)'

To deploy:
- Push master.
- Confirm GitHub Actions quality passes.
- Confirm the Deploy or rollback step finishes successfully in GitHub Actions.
- The workflow waits for built-in smoke checks: local http://127.0.0.1:9293/healthz, public /healthz, /latest, and /en/latest.
- Manually smoke test one user profile and one badge URL after the workflow finishes.
- Check Sentry for new deploy errors and latency regressions.
To roll back:
- Open the CI and deploy workflow with Run workflow.
- Choose action = rollback.
- The workflow swaps only latest and previous; it does not roll back farther than one version.
- Confirm the built-in smoke checks pass, then manually smoke test the same user profile and badge URL as during deploy.
- Keep the Sentry incident open until errors and latency return to baseline.
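The one-version limit follows from the swap itself: running rollback a second time returns to the starting state instead of reaching an older version. This is a toy model, assuming the two slots behave like simple named pointers. The storage mechanism is not described here, and Slots, rollback, and the version labels are hypothetical.

```ruby
# Toy model of the two snapshot slots; not the project's actual implementation.
Slots = Struct.new(:latest, :previous)

# Swaps the two slots, mirroring "swaps only latest and previous".
def rollback(slots)
  Slots.new(slots.previous, slots.latest)
end

current = Slots.new("v42", "v41") # hypothetical version labels
once = rollback(current)          # latest is now "v41"
twice = rollback(once)            # latest is "v42" again; no older version is reachable
```

In this model a second rollback undoes the first, so recovering anything older than previous needs a different procedure, such as a restore from backup.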
Use these commands on the server:

    sudo systemctl restart polish-open-source-rank.service
    sudo systemctl restart polish-open-source-rank-discord-bot.service
    sudo systemctl restart polish-open-source-rank-crawl-resume.service

Monthly and package jobs are long-running one-shot units. Do not restart them before checking whether they are actively writing:

    systemctl status polish-open-source-rank-monthly.service --no-pager
    systemctl status polish-open-source-rank-packages.service --no-pager
    curl -fsS -u "$INTERNAL_BASIC_AUTH_USERNAME" https://polish-open-source.pl/internal/jobs

Internal operations pages must require application Basic Auth. A request without credentials should fail with the app-owned challenge:

    curl -fsS -o /dev/null -w '%{http_code}\n' https://polish-open-source.pl/internal/jobs

Expected status: 401.
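The same check could be scripted. This sketch assumes the challenge is a standard HTTP Basic Auth response, meaning status 401 with a WWW-Authenticate header that uses the Basic scheme. Both helper names are hypothetical, and the sketch does not try to tell an app-owned challenge apart from one issued by a proxy.

```ruby
require "net/http"
require "uri"

# True when the response looks like an HTTP Basic Auth challenge:
# status 401 plus a WWW-Authenticate header using the Basic scheme.
def basic_auth_challenge?(status, www_authenticate)
  status.to_i == 401 && www_authenticate.to_s.match?(/\ABasic\b/i)
end

# Performs an unauthenticated GET and applies the check above.
def requires_basic_auth?(url)
  response = Net::HTTP.get_response(URI(url))
  basic_auth_challenge?(response.code, response["WWW-Authenticate"])
end
```

For example, requires_basic_auth?("https://polish-open-source.pl/internal/jobs") should return true when the protection is in place.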
If a monthly or package job looks stuck:
- Check /internal/jobs with the application Basic Auth credentials for the active section and last heartbeat.
- Check Sentry for the matching monthly-rankings or package-rankings check-in.
- Inspect the host alert timer with journalctl -u polish-open-source-rank-alerts.service -n 50 --no-pager.
- Inspect job logs with journalctl -u polish-open-source-rank-monthly.service -n 200 --no-pager or the packages unit.
- If the job is stale and no process is still doing useful work, stop the unit and run bin/resume_crawls through polish-open-source-rank-crawl-resume.service.
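The staleness decision in the last step can be made explicit. This is a minimal sketch, assuming "stale" means the last heartbeat is older than PRODUCTION_ALERT_JOB_STALE_MINUTES (30 in the example values). That reading and the helper name job_stale? are assumptions.

```ruby
# Hypothetical helper: a job counts as stale when its last heartbeat is older
# than the configured number of minutes. Subtracting two Time values yields seconds.
def job_stale?(last_heartbeat, now: Time.now, stale_minutes: 30)
  (now - last_heartbeat) > stale_minutes * 60
end
```

A stale heartbeat alone does not show that the job is dead. The process and log checks in the steps above still apply before stopping the unit.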
To restore from a SQLite backup:
- Stop the web, Discord bot, monthly, and packages units.
- Copy the selected SQLite backup to a temporary path.
- Run an integrity check on the copy.
- Replace the active public snapshot or working database only after the integrity check passes.
- Start web first, smoke test public pages, then restart background units.
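The copy and integrity-check steps above could be scripted along these lines. The sketch assumes the sqlite3 command-line tool is installed on the server. The helper names are hypothetical, and the backup path is left to the caller.

```ruby
require "fileutils"
require "open3"
require "tmpdir"

# PRAGMA integrity_check prints a single "ok" row for a healthy database.
def integrity_ok?(output)
  output.strip == "ok"
end

# Copies the backup to a temporary path and checks the copy, never the original.
# Returns the path of the verified copy, or raises if the check fails.
def verified_copy(backup_path)
  copy_path = File.join(Dir.mktmpdir("restore-check"), File.basename(backup_path))
  FileUtils.cp(backup_path, copy_path)
  output, status = Open3.capture2e("sqlite3", copy_path, "PRAGMA integrity_check;")
  unless status.success? && integrity_ok?(output)
    raise "integrity check failed: #{output}"
  end
  copy_path
end
```

Only a path returned by verified_copy would then replace the active snapshot or working database, matching the order of the steps above.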