|
1 | | -# openpois |
| 1 | +# OpenPOIs |
2 | 2 |
|
3 | | -A Python library for modeling POI (Point of Interest) stability over time using historical OpenStreetMap data, with utilities for downloading current POI snapshots from multiple sources. |
| 3 | +A unified, confidence-scored open dataset of U.S. points of interest, built |
| 4 | +from [OpenStreetMap](https://www.openstreetmap.org) and |
| 5 | +[Overture Maps](https://overturemaps.org). |
4 | 6 |
|
5 | | -## Setup |
| 7 | + |
6 | 8 |
|
7 | | -```bash |
8 | | -make build_env # Create conda environment from environment.yml |
9 | | -make install_package # Install openpois in editable mode |
10 | | -``` |
| 9 | +[](LICENSE) |
| 10 | +[](https://opendatacommons.org/licenses/odbl/1-0/) |
| 11 | +[](pyproject.toml) |
| 12 | +[](https://github.com/henryspatialanalysis/openpois/actions/workflows/deploy-site.yml) |
11 | 13 |
|
12 | | -## POI Snapshot Downloads |
| 14 | +- 🌐 **Live map:** <https://openpois.org> |
| 15 | +- 📘 **Python API docs:** <https://openpois.org/docs/> |
| 16 | +- 🗄️ **Dataset on Source Cooperative:** <https://source.coop/henryspatialanalysis/openpois> |
13 | 17 |
|
14 | | -Two exploratory scripts download current US-wide POI snapshots from different sources. Both output GeoParquet to `~/data/`. |
| 18 | +## What is OpenPOIs? |
15 | 19 |
|
16 | | -### OpenStreetMap |
| 20 | +OpenPOIs conflates points of interest from OpenStreetMap and Overture Maps |
| 21 | +into a single unified dataset, then attaches a per-POI confidence score |
| 22 | +estimating the probability that the place still exists. Confidence comes from |
| 23 | +a Bayesian turnover model fit on OSM tag-edit history. The published dataset |
| 24 | +covers the United States and Puerto Rico and is refreshed periodically. |
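
For intuition about what such a score means, here is a minimal sketch that assumes change events follow a constant-rate Poisson process. The function names and the idea of averaging over sampled rates are illustrative assumptions, not the library's API; the actual turnover model in this repository may take a different form.

```python
import math


def survival_probability(change_rate_per_year: float, years_elapsed: float) -> float:
    """Probability of zero change events in `years_elapsed` years,
    assuming events arrive as a Poisson process with a constant rate."""
    if change_rate_per_year < 0 or years_elapsed < 0:
        raise ValueError("rate and elapsed time must be non-negative")
    return math.exp(-change_rate_per_year * years_elapsed)


def mean_confidence(rate_samples: list[float], years_elapsed: float) -> float:
    """Average the survival probability over a set of sampled rates
    (hypothetical stand-ins for posterior draws from a Bayesian fit)."""
    if not rate_samples:
        raise ValueError("need at least one rate sample")
    total = sum(survival_probability(r, years_elapsed) for r in rate_samples)
    return total / len(rate_samples)
```

Under this assumption, confidence decays with time since a POI was last verified, and faster for categories with higher turnover.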
17 | 25 |
|
18 | | -Downloads the Geofabrik US extract (~11 GB), filters to POI-relevant tags with osmium-tool, and parses with pyosmium. |
| 26 | +This repository contains the Python library used to produce the data, the |
| 27 | +end-to-end pipelines that download and conflate sources, and the Vue |
| 28 | +front-end that powers the live map. |
19 | 29 |
|
20 | | -```bash |
21 | | -python exploratory/osm_snapshot/download.py |
| 30 | +## Quickstart — read the data |
| 31 | + |
| 32 | +You do not need to install this library to read the data; only `pyarrow` is
| 33 | +required. The dataset is hosted on Source Cooperative with anonymous access, so you can read it straight from S3:
| 34 | + |
| 35 | +```python |
| 36 | +import pyarrow.dataset as ds |
| 37 | +import pyarrow.fs as pafs |
| 38 | + |
| 39 | +BASE = "us-west-2.opendata.source.coop/henryspatialanalysis/openpois" |
| 40 | +VERSION = "latest" # or pin a dated folder, e.g. "2026-04-23-v0" |
| 41 | + |
| 42 | +fs = pafs.S3FileSystem(anonymous = True, region = "us-west-2") |
| 43 | +pois = ds.dataset( |
| 44 | + f"{BASE}/{VERSION}/conflated-parquet/", |
| 45 | + filesystem = fs, |
| 46 | + format = "parquet", |
| 47 | + partitioning = "hive", |
| 48 | +) |
| 49 | +print(pois.schema) |
| 50 | +print(f"{pois.count_rows():,} POIs") |
22 | 51 | ``` |
23 | 52 |
|
24 | | -Output: `~/data/openpois/snapshots/osm/<VERSION>/osm_snapshot.parquet` (~7.8M POIs) |
| 53 | +GeoPandas, DuckDB, and PMTiles examples live in the |
| 54 | +[dataset README on Source Cooperative](https://source.coop/henryspatialanalysis/openpois). |
25 | 55 |
|
26 | | -### Overture Maps |
| 56 | +## Install the Python library |
27 | 57 |
|
28 | | -Queries the public Overture Maps S3 bucket directly via DuckDB. No authentication required. |
| 58 | +The package is source-install only for now (not yet on PyPI): |
29 | 59 |
|
30 | 60 | ```bash |
31 | | -python exploratory/overture/download.py |
| 61 | +git clone https://github.com/henryspatialanalysis/openpois.git |
| 62 | +cd openpois |
| 63 | +make build_env # conda env from environment.yml |
| 64 | +conda activate openpois |
| 65 | +make install_package # pip install -e . |
32 | 66 | ``` |
33 | 67 |
|
34 | | -Output: `~/data/openpois/snapshots/overture/<VERSION>/overture_snapshot.parquet` (~13M POIs) |
| 68 | +## Library example |
35 | 69 |
|
36 | | -### Configuration |
| 70 | +Load a single category from the published conflated parquet and inspect the |
| 71 | +highest-confidence rows: |
37 | 72 |
|
38 | | -All download settings (bounding boxes, category filters, release dates, output paths) are in `config.yaml`. Set `release_date: null` under any source to auto-detect the latest available snapshot. |
| 73 | +```python |
| 74 | +import geopandas as gpd |
| 75 | +import pyarrow.fs as pafs |
39 | 76 |
|
40 | | ---- |
| 77 | +BASE = "us-west-2.opendata.source.coop/henryspatialanalysis/openpois" |
| 78 | +VERSION = "latest" |
41 | 79 |
|
42 | | -## Web Map |
| 80 | +fs = pafs.S3FileSystem(anonymous = True, region = "us-west-2") |
| 81 | +cafes = gpd.read_parquet( |
| 82 | + f"{BASE}/{VERSION}/conflated-parquet/shared_label=Cafe/part-0.parquet", |
| 83 | + filesystem = fs, |
| 84 | +) |
| 85 | +print(cafes.sort_values("conf_mean", ascending = False).head()) |
| 86 | +``` |
43 | 87 |
|
44 | | -`site/` contains a full-screen interactive web map for exploring the POI snapshots. It shows OpenStreetMap and Overture Maps data with confidence-based coloring (red → yellow → green), address search, and click-to-inspect popups. |
| 88 | +The full library API — I/O adapters, the turnover model, conflation |
| 89 | +primitives — is documented at <https://openpois.org/docs/>. |
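
The path in the example above follows a Hive-style layout, where the partition column and its value appear as a `key=value` directory. A small sketch of building such paths is shown here; `partition_file` is a hypothetical helper, not part of the library, and it assumes the label contains no characters that need escaping and that a `part-0.parquet` file exists for the category.

```python
def partition_file(base: str, version: str, shared_label: str, part: int = 0) -> str:
    """Build the path to one parquet part file inside a Hive-style
    `shared_label=<value>` partition directory."""
    return (
        f"{base}/{version}/conflated-parquet/"
        f"shared_label={shared_label}/part-{part}.parquet"
    )


BASE = "us-west-2.opendata.source.coop/henryspatialanalysis/openpois"
cafe_path = partition_file(BASE, "latest", "Cafe")
```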
45 | 90 |
|
46 | | -```bash |
47 | | -make site_dev # Serve locally with hot reload (http://localhost:5173) |
48 | | -make site_build # Build for production (output: site/dist/) |
49 | | -``` |
| 91 | +## Reproduce the dataset yourself |
| 92 | + |
| 93 | +The data is produced by four pipelines under [scripts/](scripts/), each |
| 94 | +driven by [config.yaml](config.yaml): |
50 | 95 |
|
51 | | -The site is automatically deployed to GitHub Pages via GitHub Actions on every push to `main` that touches `site/**`. The deployment workflow is at `.github/workflows/deploy-site.yml`. |
| 96 | +1. Snapshot downloads (OSM + Overture) |
| 97 | +2. OSM history download and Bayesian turnover-model fit |
| 98 | +3. Apply model to OSM snapshot to get per-POI confidence |
| 99 | +4. Conflate OSM × Overture, partition, publish to Source Cooperative |
52 | 100 |
|
53 | | ---- |
| 101 | +Each pipeline and its scripts are documented in the workflows reference at |
| 102 | +<https://openpois.org/docs/workflows.html>. |
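
To give a feel for the conflation step, here is a minimal sketch that pairs records from two sources by great-circle distance. It is an illustration under stated assumptions, not the pipeline's actual algorithm: the function names, the 50 m threshold, and the greedy one-to-one strategy are all hypothetical, and a real conflation would likely also compare names and categories.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def match_pois(osm: dict, overture: dict, max_dist_m: float = 50.0) -> list:
    """Greedy one-to-one matching: each OSM POI takes the nearest
    not-yet-used Overture POI within `max_dist_m`, if there is one.
    Both inputs map an ID to a (lat, lon) tuple."""
    used = set()
    matches = []
    for osm_id, (olat, olon) in osm.items():
        best_id, best_dist = None, max_dist_m
        for ov_id, (vlat, vlon) in overture.items():
            if ov_id in used:
                continue
            dist = haversine_m(olat, olon, vlat, vlon)
            if dist <= best_dist:
                best_id, best_dist = ov_id, dist
        if best_id is not None:
            used.add(best_id)
            matches.append((osm_id, best_id, best_dist))
    return matches


# Toy inputs: only the first OSM point has an Overture neighbour within 50 m.
osm_pts = {"osm/1": (47.6062, -122.3321), "osm/2": (40.0, -100.0)}
overture_pts = {"ov/a": (47.60625, -122.3321), "ov/b": (34.0, -118.0)}
pairs = match_pois(osm_pts, overture_pts)
```

The nested loop is quadratic in the number of POIs; at national scale a spatial index would be needed, which this sketch omits for brevity.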
54 | 103 |
|
55 | | -## Historical OSM Change-Rate Modeling |
| 104 | +## Repository layout |
56 | 105 |
|
57 | | -The core workflow models how long POI tags remain stable over time using historical OSM data. |
| 106 | +| Path | Purpose | |
| 107 | +|---|---| |
| 108 | +| [src/openpois/](src/openpois/) | Library source: I/O, models, conflation, publishing | |
| 109 | +| [scripts/](scripts/) | End-to-end pipelines using `config.yaml` | |
| 110 | +| [site/](site/) | Vue 3 + Vite frontend powering openpois.org | |
| 111 | +| [docs/](docs/) | Sphinx documentation source | |
| 112 | +| [tests/](tests/) | Unit tests | |
| 113 | + |
| 114 | +## Web map |
| 115 | + |
| 116 | +The interactive map at <https://openpois.org> is a Vue 3 + Vite app rendering |
| 117 | +PMTiles archives over MapLibre GL. To run it locally: |
58 | 118 |
|
59 | 119 | ```bash |
60 | | -python exploratory/osm_data/download.py # Download OSM history for a bounding box |
61 | | -python exploratory/osm_data/format_tabular.py # Format into observation records |
62 | | -python scripts/models/osm_turnover.py # Fit Poisson change-rate model (JAX) |
| 120 | +make site_dev # http://localhost:5173, hot reload |
| 121 | +make site_build # production build to site/dist/ |
63 | 122 | ``` |
64 | 123 |
|
65 | | ---- |
| 124 | +The site auto-deploys to GitHub Pages via |
| 125 | +[.github/workflows/deploy-site.yml](.github/workflows/deploy-site.yml) on |
| 126 | +every push to `main` that touches `site/`, `src/`, `docs/`, or `scripts/`. |
66 | 127 |
|
67 | 128 | ## Development |
68 | 129 |
|
69 | 130 | ```bash |
70 | | -pytest # Run tests |
71 | | -make export_env # Export conda environment after adding dependencies |
| 131 | +pytest # run the test suite |
| 132 | +make lint # flake8 + pylint |
| 133 | +make export_env # rewrite environment.yml after adding deps |
72 | 134 | ``` |
| 135 | + |
| 136 | +## Licensing |
| 137 | + |
| 138 | +OpenPOIs licenses its code and its data separately:
| 139 | + |
| 140 | +- **Code** — [MIT License](LICENSE). You can use, modify, and redistribute the |
| 141 | + Python package, scripts, and front-end freely. |
| 142 | +- **Data** — [Open Database License (ODbL) v1.0](https://opendatacommons.org/licenses/odbl/1-0/). |
| 143 | + The published parquet and PMTiles archives are derivative works of |
| 144 | + OpenStreetMap and Overture Maps and inherit ODbL terms. Any public use must |
| 145 | + attribute OpenPOIs, [OpenStreetMap contributors](https://www.openstreetmap.org/copyright), |
| 146 | + and the [Overture Maps Foundation](https://docs.overturemaps.org/attribution/). |
| 147 | + Derivative databases must be released under the same license. |
| 148 | + |
| 149 | +## Citation |
| 150 | + |
| 151 | +If you use OpenPOIs in research or a public product, please cite: |
| 152 | + |
| 153 | +> Henry, N. (2026). *OpenPOIs: a unified, confidence-scored dataset of U.S. points of interest.* Henry Spatial Analysis. <https://openpois.org> |
| 154 | +
|
| 155 | +A machine-readable citation is provided in [CITATION.cff](CITATION.cff); |
| 156 | +GitHub renders it as a "Cite this repository" button on the repo home page. |
| 157 | + |
| 158 | +## Contact |
| 159 | + |
| 160 | +Bug reports, feature requests, and contributions are welcome via |
| 161 | +[GitHub issues](https://github.com/henryspatialanalysis/openpois/issues). |