.claude/TODO.md (2 additions, 0 deletions)
@@ -6,6 +6,8 @@ Short running list of in-progress / upcoming work. Edit freely; trim older compl
## Upcoming
+- [ ] Trim peak memory in `scripts/conflation/conflate.py` so the OSM (8.7M) × Overture (13M) run fits inside 16 GB + 4 GB swap. The widened Overture allowlist (6.29M → 13.05M) overflowed WSL2 and rebooted the VM on 2026-04-17. Tactics: cut `osm_gdf`/`overture_gdf` down to the minimal column set `build_merge_parts` needs before the scoring pass, free the normalized name arrays right after `compute_name_scores`, and narrow `osm_idx`/`overture_idx` to int32. See [src/openpois/conflation/match.py](../src/openpois/conflation/match.py) and [scripts/conflation/conflate.py](../scripts/conflation/conflate.py). Added 2026-04-17.
+- [ ] Instrument the setup / matching-reload phase of `scripts/conflation/conflate.py` to pin down the ~17 GB VmHWM spike observed on the 2026-04-17 chunked run (checkpoints reloaded, merge phase bounded, so the spike is upstream of both). Likely culprits: `pd.concat` of 128 chunk parquets, name/brand array construction holding dual refs, or a taxonomy crosswalk transient. Add `psutil` RSS logging at each phase boundary so we can see exactly which step jumps. Added 2026-04-17.
- [ ] Watch for a DuckDB release that fixes the WSL2 httpfs "Information loss on integer cast" crash (issue #21669, fix PR #21395). Once a tagged release ships with the fix and a full `scripts/overture/download.py` run on WSL2 completes, we can unpin from `duckdb==1.4.1` and revert the per-part download to a single-query DuckDB scan. Added 2026-04-17.
- [ ] Auto-check taxonomy changes whenever we switch to a new Overture Maps version (detect new/removed L0/L1/L2 categories vs. `taxonomy_crosswalk_overture_maps.csv` and flag gaps). Added 2026-04-16.
- [ ] Watch for Overture L0/L1 → flat `basic_category` migration (~June 2026). Crosswalk CSV + `assign_overture_shared_label` will need updating. See [docs/taxonomy-setup.md](docs/taxonomy-setup.md).
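
Sketch for the `psutil` RSS logging item above. This is a minimal illustration only, not the code in `conflate.py`: the `phase` helper, the `conflate.mem` logger name, and the `load_chunks` phase label are all hypothetical. It assumes `psutil` is installed and degrades to logging `None` when it is not. Note that sampling at phase boundaries can miss a transient peak that rises and falls inside a single phase, so this may narrow down the VmHWM spike without capturing it exactly.

```python
import logging
import time
from contextlib import contextmanager

try:
    import psutil  # third-party; named in the TODO item
except ImportError:  # keep the sketch importable without it
    psutil = None

log = logging.getLogger("conflate.mem")  # hypothetical logger name


def rss_mib():
    """Current resident set size of this process in MiB (None without psutil)."""
    if psutil is None:
        return None
    return psutil.Process().memory_info().rss / 2**20


@contextmanager
def phase(name, records=None):
    """Log RSS on entry and exit of a named phase (hypothetical helper)."""
    before = rss_mib()
    start = time.perf_counter()
    try:
        yield
    finally:
        after = rss_mib()
        elapsed = time.perf_counter() - start
        delta = None if before is None or after is None else after - before
        if records is not None:
            # Keep a machine-readable copy alongside the log line.
            records.append((name, before, after, delta))
        log.info(
            "phase=%s rss_before=%s rss_after=%s delta=%s elapsed=%.1fs",
            name, before, after, delta, elapsed,
        )


# Usage: wrap each phase boundary; the allocation below is a stand-in
# for a real step such as reloading chunk parquets.
records = []
with phase("load_chunks", records):
    chunks = [bytearray(1_000_000) for _ in range(8)]
```

Wrapping each candidate step (chunk concat, name/brand array construction, crosswalk) in its own `phase(...)` block would show which boundary the RSS jump falls on.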