 This module downloads OpenStreetMap data for a given area of interest.

 It is broken into the following functions:
+
 - build_query_string: Builds an Overpass query string for a given date and bbox.
 - build_date_range: Creates a list of dates for querying.
 - collect_element_ids: Queries Overpass for element IDs across a date range.

 and pyosmium parsing.

 It is broken into the following functions:
+
 - download_pbf: Downloads a PBF file from a URL via streaming HTTP.
 - filter_pbf: Runs osmium tags-filter to produce a reduced POI-only PBF.
 - parse_pbf_to_geodataframe: Parses the filtered PBF with pyosmium into a
@@ -362,6 +363,7 @@ def download_osm_snapshot(
     GeoParquet.

     Steps:
+
     1. download_pbf — streams the Geofabrik US extract (~11 GB) to
        raw_pbf_path.
     2. filter_pbf — runs osmium tags-filter to produce a small POI-only PBF.
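
The streaming step described for download_pbf can be sketched with the standard library alone. This is a minimal illustration, not the module's code: the name `stream_to_file`, its parameters, and the chunk size are hypothetical, and the real implementation (not shown in this hunk) may use a different HTTP client.

```python
import shutil
import urllib.request


def stream_to_file(url: str, dest_path: str, chunk_size: int = 1024 * 1024) -> None:
    """Copy the body at `url` to `dest_path` in fixed-size chunks (hypothetical helper)."""
    # urlopen returns a file-like response; copyfileobj reads it chunk by
    # chunk, so a multi-gigabyte extract never has to fit in memory.
    with urllib.request.urlopen(url) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out, length=chunk_size)
```

Writing in chunks keeps memory use bounded by `chunk_size`, which matters for an extract of roughly 11 GB.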

 given bounding box and set of taxonomy categories.

 It is broken into the following functions:
+
 - get_latest_release_date: Finds the most recent Overture release by listing S3.
 - build_overture_s3_path: Constructs the S3 glob path for a given release.
 - download_overture_snapshot: Queries S3 via DuckDB, filters by bbox and
@@ -90,9 +91,8 @@ def build_overture_s3_path(
         bucket: The S3 bucket name.

     Returns:
-        S3 path string suitable for DuckDB read_parquet(), e.g.:
-        's3://overturemaps-us-west-2/release/2026-02-18.0/
-        theme=places/type=place/*.parquet'
+        S3 path string suitable for DuckDB ``read_parquet()``, e.g.
+        ``s3://overturemaps-us-west-2/release/2026-02-18.0/theme=places/type=place/``
     """
     return (
         f"s3://{bucket}/release/{release_date}"
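
For reference, the path construction this hunk documents could look roughly like the sketch below. The function name `sketch_overture_s3_path` and the default bucket are assumptions for illustration; the `theme=places/type=place/*.parquet` suffix follows the example in the removed docstring lines, since the rest of the real return expression is cut off by the hunk.

```python
def sketch_overture_s3_path(
    release_date: str,
    bucket: str = "overturemaps-us-west-2",  # assumed default, from the docstring example
) -> str:
    """Build an S3 glob for the places theme of one release (hypothetical helper)."""
    # The trailing glob lets a Parquet reader pick up every file under the
    # type=place partition; this suffix is an assumption, not confirmed code.
    return f"s3://{bucket}/release/{release_date}/theme=places/type=place/*.parquet"


print(sketch_overture_s3_path("2026-02-18.0"))
```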

@@ -23,12 +23,13 @@ def change_plot_reshape_data(
     Reshape data for the change plot. The data comes in with one row per POI-tag, and
     is reshaped by elapsed days since the POI-tag was added. For each elapsed day, there
     are four possibilities:
-    1. Confirmed unchanged: The tag was observed unchanged on or *after* this day
-    2. Confirmed changed: The tag was last observed changed on or *before* this day
-    2. Unsure: The tag was last observed unchanged *before* this day, but has not yet
-       been observed changed
-    4. Aged out: The maximum time elapsed between when the tag was added and our data
-       download is *before* this day, so we should drop it from the plot
+
+    1. Confirmed unchanged: The tag was observed unchanged on or *after* this day
+    2. Confirmed changed: The tag was last observed changed on or *before* this day
+    3. Unsure: The tag was last observed unchanged *before* this day, but has not yet
+       been observed changed
+    4. Aged out: The maximum time elapsed between when the tag was added and our data
+       download is *before* this day, so we should drop it from the plot

     Args:
         observations: DataFrame with observations. Each row is an iteration of a
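
The four possibilities listed in the docstring can be read as a per-day classification. The sketch below is one possible reading, not the function's actual logic: the name `classify_day`, its parameters, the label strings, and the order in which the cases are checked (aged out first) are all assumptions.

```python
from typing import Optional


def classify_day(
    day: int,
    last_unchanged_day: Optional[int],
    changed_day: Optional[int],
    max_elapsed_days: int,
) -> str:
    """Label one elapsed day for one POI-tag (hypothetical helper).

    All day values count elapsed days since the tag was added.
    """
    # 4. Aged out: the data download happened before this elapsed day.
    if day > max_elapsed_days:
        return "aged_out"
    # 1. Confirmed unchanged: observed unchanged on or after this day.
    if last_unchanged_day is not None and last_unchanged_day >= day:
        return "confirmed_unchanged"
    # 2. Confirmed changed: observed changed on or before this day.
    if changed_day is not None and changed_day <= day:
        return "confirmed_changed"
    # 3. Unsure: last seen unchanged before this day, no change seen yet.
    return "unsure"
```

Checking the aged-out case first means a tag is dropped for any day past the download window even if it had already been seen changed; the real function may resolve that overlap differently.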