Problem
The tercet_missing_codes.csv file in this repository is updated over time with newly discovered postal codes that are missing from TERCET data, along with their estimated NUTS mappings. However, the running service only imports this file when explicitly triggered via scripts/import_estimates.py. If the CSV is updated (e.g. via automated monitor contributions), the deployed service remains unaware of the new entries until a manual re-import and redeploy.
Proposal
The service should periodically fetch the latest tercet_missing_codes.csv directly from this GitHub repository and re-import estimates automatically. This could be implemented as:
- Periodic fetch from GitHub: On a configurable interval (e.g. daily), fetch the raw CSV from https://raw.githubusercontent.com/bk86a/PostalCode2NUTS/main/tercet_missing_codes.csv, compare against the currently loaded data (e.g. by hash or row count), and re-run the import logic if changed.
- Startup + schedule: Always fetch and import on startup, plus schedule a periodic refresh (e.g. via an asyncio background task in the FastAPI lifespan).
This would make deployed instances self-updating as new missing postal codes are discovered and merged into the repository, without requiring a redeploy.
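A minimal sketch of how the two options above could fit together, using only the standard library. The names `EstimatesRefresher`, `fetch_csv`, and the `import_estimates` callable are hypothetical and not part of the current codebase; the callable stands in for whatever logic scripts/import_estimates.py runs today. Change detection here uses a SHA-256 hash of the downloaded bytes, which is one of the two comparison options mentioned above.

```python
import asyncio
import hashlib
import urllib.request
from typing import Callable, Optional

CSV_URL = (
    "https://raw.githubusercontent.com/bk86a/PostalCode2NUTS/"
    "main/tercet_missing_codes.csv"
)


def fetch_csv(url: str = CSV_URL, timeout: float = 30.0) -> bytes:
    """Blocking download of the raw CSV (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


class EstimatesRefresher:
    """Re-imports the estimates only when the CSV content has changed."""

    def __init__(
        self,
        fetch: Callable[[], bytes],
        import_estimates: Callable[[bytes], None],
    ) -> None:
        self._fetch = fetch
        # Assumption: a callable wrapping the existing import logic.
        self._import = import_estimates
        self._last_digest: Optional[str] = None

    def refresh_once(self) -> bool:
        """Fetch, compare by hash, and import if changed. Returns True on import."""
        data = self._fetch()
        digest = hashlib.sha256(data).hexdigest()
        if digest == self._last_digest:
            return False
        self._import(data)
        # Record the digest only after a successful import, so that a
        # failed import is retried on the next cycle.
        self._last_digest = digest
        return True

    async def run_forever(self, interval_seconds: float) -> None:
        """Refresh immediately (startup), then once per interval."""
        while True:
            try:
                # Run the blocking fetch/import off the event loop thread.
                await asyncio.to_thread(self.refresh_once)
            except Exception as exc:
                # A failed fetch must not stop the service; keep the
                # currently loaded data and try again next cycle.
                print(f"estimate refresh failed: {exc!r}")
            await asyncio.sleep(interval_seconds)
```

In a FastAPI lifespan handler, this could be started with `asyncio.create_task(refresher.run_forever(86400))` and cancelled on shutdown. Because the first iteration runs before the first sleep, the same loop covers both the startup import and the periodic refresh. Note that `asyncio.to_thread` requires Python 3.9 or later.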
Alternatives considered
- Local file watcher: Only works if the CSV is mounted or updated on the host, so it doesn't help containerised deployments.
- Webhook-based reload: More responsive but adds complexity (needs a webhook endpoint and GitHub webhook configuration).
- Keep current manual import: Simpler, but means the service can go stale between deploys.