
Commit 2918fdc

Add standalone raster resampling (#1152) (#1172)
* Add design spec for polygonize simplification (#1151)

* Add resample function for resolution change without reprojection (#1152)

  Implements raster resampling with 8 methods (nearest, bilinear, cubic, average, min, max, median, mode) across all 4 backends. Uses map_coordinates with global coordinate mapping for chunk-consistent dask results and numba-accelerated block aggregation kernels.

* Add test suite for resample function (#1152)

  51 tests covering API validation, output geometry, correctness against known values, NaN handling, edge cases, and backend parity for dask and cupy paths.

* Add resample docs and README entry (#1152)

* Add resample user guide notebook (#1152)

* Renumber resample notebook to 50 to avoid conflicts (#1152)

  Notebooks 48 and 49 were taken by Sieve Filter, Visibility Analysis, and KDE after the rebase.
1 parent f94a0b7 commit 2918fdc

File tree

8 files changed, +1460 −1 lines changed


README.md

Lines changed: 2 additions & 1 deletion

```diff
@@ -223,10 +223,11 @@ ds.xrs.open_geotiff('large_dem.tif') # read windowed to Dataset
 **Consistency:** 100% pixel-exact match vs rioxarray on all tested files (Landsat 8, Copernicus DEM, USGS 1-arc-second, USGS 1-meter).

 -----------
-### **Reproject / Merge**
+### **Reproject / Merge / Resample**

 | Name | Description | Source | NumPy xr.DataArray | Dask xr.DataArray | CuPy GPU xr.DataArray | Dask GPU xr.DataArray |
 |:----------:|:------------|:------:|:----------------------:|:--------------------:|:-------------------:|:------:|
+| [Resample](xrspatial/resample.py) | Changes raster resolution (cell size) without reprojection. Nearest, bilinear, cubic, average, mode, min, max, median methods | Standard (interpolation / block aggregation) | ✅️ | ✅️ | ✅️ | ✅️ |
 | [Reproject](xrspatial/reproject/__init__.py) | Reprojects a raster to a new CRS with Numba JIT / CUDA coordinate transforms and resampling. Supports vertical datums (EGM96, EGM2008) and horizontal datum shifts (NAD27, OSGB36, etc.) | Standard (inverse mapping) | ✅️ | ✅️ | ✅️ | ✅️ |
 | [Merge](xrspatial/reproject/__init__.py) | Merges multiple rasters into a single mosaic with configurable overlap strategy | Standard (mosaic) | ✅️ | ✅️ | 🔄 | 🔄 |
```

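To make the new table row concrete, here is a minimal numpy sketch of what a nearest-neighbour resolution change (no reprojection) does. This is illustrative only and independent of xrspatial's actual implementation, which additionally rebuilds coordinates and handles dask and GPU arrays; the function name here is hypothetical.

```python
import numpy as np

def nearest_resample(arr, scale_factor):
    """Nearest-neighbour resample sketch: map each output cell centre
    back into source index space and pick the closest input cell."""
    in_h, in_w = arr.shape
    out_h = max(1, int(round(in_h * scale_factor)))
    out_w = max(1, int(round(in_w * scale_factor)))
    # Centres of output cells expressed in input-cell index space
    rows = ((np.arange(out_h) + 0.5) / scale_factor - 0.5).round().astype(int)
    cols = ((np.arange(out_w) + 0.5) / scale_factor - 0.5).round().astype(int)
    rows = np.clip(rows, 0, in_h - 1)
    cols = np.clip(cols, 0, in_w - 1)
    return arr[np.ix_(rows, cols)]

src = np.arange(16, dtype=float).reshape(4, 4)
down = nearest_resample(src, 0.5)   # 4x4 -> 2x2
up = nearest_resample(src, 2.0)     # 4x4 -> 8x8
```

The same index-mapping idea generalizes to the interpolating methods (bilinear, cubic) by sampling fractional coordinates instead of rounding them.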
docs/source/reference/index.rst

Lines changed: 1 addition & 0 deletions

```diff
@@ -22,6 +22,7 @@ Reference
     multispectral
     pathfinding
     proximity
+    resample
     surface
     terrain_metrics
     utilities
```


docs/source/reference/resample.rst

Lines changed: 14 additions & 0 deletions

```rst
.. _reference.resample:

********
Resample
********

Change raster resolution (cell size) without changing its CRS.

Resample
========
.. autosummary::
   :toctree: _autosummary

   xrspatial.resample.resample
```

Lines changed: 138 additions & 0 deletions

# Polygonize Geometry Simplification

**Issue:** #1151
**Date:** 2026-04-06

## Problem

`polygonize()` produces exact pixel-boundary polygons. On high-resolution rasters this creates dense geometries with thousands of vertices per polygon, making output slow to render, large on disk, and unwieldy for spatial joins.

The current workaround chains GDAL's `gdal_polygonize.py` with `ogr2ogr -simplify`, adding an external dependency and an intermediate file.

## API

Two new parameters on `polygonize()`:

```python
def polygonize(
    raster, mask=None, connectivity=4, transform=None,
    column_name="DN", return_type="numpy",
    simplify_tolerance=None,               # float, coordinate units
    simplify_method="douglas-peucker",     # str
):
```

- `simplify_tolerance=None` or `0.0`: no simplification (backward compatible).
- `simplify_tolerance > 0`: apply Douglas-Peucker with the given tolerance.
- `simplify_method="visvalingam-whyatt"`: raises `NotImplementedError`.
- A negative tolerance raises `ValueError`.
## Algorithm: Shared-Edge Douglas-Peucker

Topology-preserving simplification via shared-edge decomposition, the same approach used by TopoJSON and GRASS `v.generalize`.

### Pipeline position

Simplification runs between boundary tracing and output conversion:

```
CCL -> boundary tracing -> [simplification] -> output conversion
```

For dask backends, simplification runs after chunk merging.

### Steps

1. **Find junctions.** Scan all ring vertices. A junction is any coordinate that appears as a vertex in 3 or more distinct rings. These points are pinned and never removed by simplification.

2. **Split rings into edge chains.** Walk each ring and split at junction vertices. Each resulting chain connects two junctions (or forms a closed loop when the ring contains no junctions). Each chain is shared by at most 2 adjacent polygons.

3. **Deduplicate chains.** Normalize each chain by its sorted endpoint pair so shared edges between adjacent polygons are identified and simplified only once.

4. **Simplify each chain.** Apply Douglas-Peucker to each unique chain. Junction endpoints are fixed. The DP implementation is numba-compiled (`@ngjit`) for performance on large coordinate arrays.

5. **Reassemble rings.** Replace each ring's chain segments with their simplified versions and rebuild the ring coordinate arrays.
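Step 4 is classic Douglas-Peucker: keep a vertex only if it deviates from the chord between the kept endpoints by more than the tolerance. The following plain-numpy sketch shows the recursion (here iterative, via an explicit stack) with endpoints always pinned, matching the junction behavior above; the real `_douglas_peucker` is `@ngjit`-compiled and this version trades speed for clarity.

```python
import numpy as np

def douglas_peucker(coords, tolerance):
    """Simplify an (N, 2) open chain; endpoints are always kept."""
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    keep = np.zeros(n, dtype=bool)
    keep[0] = keep[-1] = True
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j <= i + 1:
            continue
        dx, dy = coords[j] - coords[i]
        seg_len = np.hypot(dx, dy)
        pts = coords[i + 1:j]
        if seg_len == 0.0:
            # Degenerate chord: use distance to the shared endpoint
            d = np.hypot(pts[:, 0] - coords[i, 0], pts[:, 1] - coords[i, 1])
        else:
            # Perpendicular distance of interior points to the chord i -> j
            d = np.abs(dx * (pts[:, 1] - coords[i, 1])
                       - dy * (pts[:, 0] - coords[i, 0])) / seg_len
        k = int(np.argmax(d))
        if d[k] > tolerance:
            keep[i + 1 + k] = True          # farthest point survives
            stack.extend([(i, i + 1 + k), (i + 1 + k, j)])
    return coords[keep]
```

A spike taller than the tolerance is retained while near-collinear jitter collapses to the two endpoints.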
### Why this preserves topology

Adjacent polygons reference the same physical edge chain. Simplifying each chain once means both neighbors get identical simplified boundaries. No gaps or overlaps can arise because there is no independent simplification of shared geometry.
## Implementation

All new code lives in `xrspatial/polygonize.py` as internal functions.

### New functions

| Function | Decorator | Purpose |
|---|---|---|
| `_find_junctions(all_rings)` | pure Python | Scan rings, return set of junction coords |
| `_split_ring_at_junctions(ring, junctions)` | pure Python | Break one ring into chains at junctions |
| `_normalize_chain(chain)` | pure Python | Canonical key for deduplication |
| `_douglas_peucker(coords, tolerance)` | `@ngjit` | DP simplification on Nx2 array |
| `_simplify_polygons(polygon_points, tolerance)` | pure Python | Orchestrator: junctions -> split -> DP -> reassemble |
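The two bookkeeping helpers in the table are straightforward; a sketch under the spec's definitions (a junction is a vertex appearing in 3+ distinct rings, and a chain's canonical key is orientation-independent). These are illustrative stand-ins, not the actual `_find_junctions` / `_normalize_chain` bodies.

```python
from collections import Counter

def find_junctions(all_rings):
    """Return the set of coordinates that appear in 3 or more rings."""
    counts = Counter()
    for ring in all_rings:
        # Count each vertex once per ring, so a ring revisiting a
        # vertex does not inflate its junction count
        for xy in set(map(tuple, ring)):
            counts[xy] += 1
    return {xy for xy, c in counts.items() if c >= 3}

def normalize_chain(chain):
    """Canonical key so the chain shared by two adjacent polygons
    dedupes to one entry regardless of traversal direction."""
    chain = [tuple(p) for p in chain]
    return tuple(chain) if chain[0] <= chain[-1] else tuple(reversed(chain))
```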
### Integration point

In `polygonize()`, after the `mapper(raster)(...)` call returns `(column, polygon_points)` and before the return-type conversion block:

```python
if simplify_tolerance and simplify_tolerance > 0:
    polygon_points = _simplify_polygons(polygon_points, simplify_tolerance)
```
### Backend behavior

- **NumPy / CuPy:** simplification runs on CPU-side coordinate arrays returned by boundary tracing (CuPy already transfers to CPU for tracing).
- **Dask:** simplification runs after `_merge_chunk_polygons()`, on the fully merged result.
- No GPU-side simplification. Boundary tracing is already CPU-bound; simplification follows the same pattern.
## Constraints

- No Visvalingam-Whyatt yet. The `simplify_method` parameter is present in the API for forward compatibility; passing `"visvalingam-whyatt"` raises `NotImplementedError`.
- No streaming simplification. The full polygon set must fit in memory, the same constraint as existing boundary tracing.
- Minimum ring size after simplification: exterior rings keep at least 4 vertices (3 unique + closing). Degenerate rings (area below tolerance squared) are dropped.
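The last constraint amounts to a validity check on each simplified ring: a closed ring needs at least 4 points (3 unique plus the closing vertex), and its shoelace area must be at least `tolerance**2`. A sketch of that check, with a hypothetical helper name the spec does not define:

```python
def keep_ring(ring, tolerance):
    """Return True if a closed ring survives the spec's constraints:
    >= 4 points and absolute area >= tolerance squared."""
    if len(ring) < 4:
        return False
    # Shoelace formula over consecutive vertex pairs of the closed ring
    area = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:]):
        area += x0 * y1 - x1 * y0
    return abs(area) / 2.0 >= tolerance * tolerance

square = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]     # area 1.0
sliver = [(0, 0), (0.01, 0), (0, 0.01), (0, 0)]       # near-zero area
```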
## Testing

- Correctness: known 4x4 raster; verify simplified polygon areas match the originals (simplification must not change topology, only vertex count).
- Vertex reduction: verify output has fewer vertices than unsimplified output.
- Topology: verify no gaps between adjacent polygons (union of simplified polygons equals union of originals, within floating-point tolerance).
- Edge cases: tolerance=0, tolerance=None, negative tolerance, single-pixel raster, raster with one uniform value.
- Backend parity: numpy and dask produce the same results.
- Return types: simplification works with all five return types.
## Out of scope

- Visvalingam-Whyatt implementation (future PR).
- GPU-accelerated simplification.
- Per-chunk simplification for dask (simplification is post-merge only).
- Area-weighted simplification or other adaptive tolerance schemes.
Lines changed: 210 additions & 0 deletions

```json
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Raster Resampling\n",
    "\n",
    "The `resample` function changes a raster's resolution (cell size) without\n",
    "changing its CRS. This is the operation you'd reach for when you need to\n",
    "match two rasters to a common grid or reduce a raster's memory footprint\n",
    "before analysis.\n",
    "\n",
    "**Methods**:\n",
    "\n",
    "| Method | Direction | Best for |\n",
    "|--------|-----------|----------|\n",
    "| `nearest` | up/down | Categorical data, fast preview |\n",
    "| `bilinear` | up/down | Smooth continuous surfaces |\n",
    "| `cubic` | up/down | High-quality continuous surfaces |\n",
    "| `average` | down only | Aggregating high-res to low-res |\n",
    "| `min`, `max` | down only | Extremes within each output cell |\n",
    "| `median` | down only | Robust centre, ignores outliers |\n",
    "| `mode` | down only | Majority class in categorical rasters |"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import xarray as xr\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "from xrspatial import resample\n",
    "from xrspatial.terrain import generate_terrain"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Generate synthetic terrain"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "dem = generate_terrain(width=200, height=200)\n",
    "# Assign a regular coordinate grid\n",
    "dem = dem.assign_coords(\n",
    "    y=np.linspace(100, 0, dem.sizes['y']),\n",
    "    x=np.linspace(0, 100, dem.sizes['x']),\n",
    ")\n",
    "dem.attrs['res'] = (0.5, 0.5)\n",
    "\n",
    "fig, ax = plt.subplots(figsize=(6, 5))\n",
    "dem.plot(ax=ax, cmap='terrain')\n",
    "ax.set_title(f'Original DEM ({dem.shape[0]}x{dem.shape[1]}, res={dem.attrs[\"res\"][0]:.1f}m)')\n",
    "plt.tight_layout()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Downsample with `scale_factor`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "down = resample(dem, scale_factor=0.25, method='bilinear')\n",
    "\n",
    "fig, axes = plt.subplots(1, 2, figsize=(12, 5))\n",
    "dem.plot(ax=axes[0], cmap='terrain')\n",
    "axes[0].set_title(f'Original ({dem.shape[0]}x{dem.shape[1]})')\n",
    "down.plot(ax=axes[1], cmap='terrain')\n",
    "axes[1].set_title(f'Downsampled 4x ({down.shape[0]}x{down.shape[1]})')\n",
    "plt.tight_layout()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Upsample with `target_resolution`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "up = resample(down, target_resolution=0.5, method='cubic')\n",
    "\n",
    "fig, axes = plt.subplots(1, 2, figsize=(12, 5))\n",
    "down.plot(ax=axes[0], cmap='terrain')\n",
    "axes[0].set_title(f'Coarse ({down.shape[0]}x{down.shape[1]})')\n",
    "up.plot(ax=axes[1], cmap='terrain')\n",
    "axes[1].set_title(f'Upsampled to 0.5m ({up.shape[0]}x{up.shape[1]})')\n",
    "plt.tight_layout()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Compare resampling methods"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "methods = ['nearest', 'bilinear', 'cubic', 'average']\n",
    "fig, axes = plt.subplots(1, 4, figsize=(16, 4))\n",
    "\n",
    "for ax, method in zip(axes, methods):\n",
    "    out = resample(dem, scale_factor=0.1, method=method)\n",
    "    out.plot(ax=ax, cmap='terrain', add_colorbar=False)\n",
    "    ax.set_title(method)\n",
    "    ax.set_aspect('equal')\n",
    "\n",
    "plt.suptitle('Downsample 10x with different methods', y=1.02)\n",
    "plt.tight_layout()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Categorical raster with `mode`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from xrspatial import equal_interval\n",
    "\n",
    "# Classify elevation into 5 zones\n",
    "classes = equal_interval(dem, k=5)\n",
    "classes.attrs = dem.attrs.copy()\n",
    "classes = classes.assign_coords(dem.coords)\n",
    "\n",
    "# Downsample: mode preserves class boundaries\n",
    "classes_down = resample(classes.astype('float32'),\n",
    "                        scale_factor=0.2, method='mode')\n",
    "\n",
    "fig, axes = plt.subplots(1, 2, figsize=(12, 5))\n",
    "classes.plot(ax=axes[0], cmap='Set2')\n",
    "axes[0].set_title(f'Classes ({classes.shape[0]}x{classes.shape[1]})')\n",
    "classes_down.plot(ax=axes[1], cmap='Set2')\n",
    "axes[1].set_title(f'Mode downsample ({classes_down.shape[0]}x{classes_down.shape[1]})')\n",
    "plt.tight_layout()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Works with Dask"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import dask.array as da\n",
    "\n",
    "dask_dem = dem.copy()\n",
    "dask_dem.data = da.from_array(dem.values, chunks=(100, 100))\n",
    "\n",
    "result = resample(dask_dem, scale_factor=0.5, method='bilinear')\n",
    "print(f'Input: {dask_dem.shape} (dask, chunks={dask_dem.data.chunksize})')\n",
    "print(f'Output: {result.shape} (dask, chunks={result.data.chunksize})')\n",
    "print(f'Computed shape: {result.compute().shape}')"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "name": "python",
   "version": "3.14.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
```

xrspatial/__init__.py

Lines changed: 1 addition & 0 deletions

```diff
@@ -93,6 +93,7 @@
 from xrspatial.preview import preview # noqa
 from xrspatial.proximity import allocation # noqa
 from xrspatial.rasterize import rasterize # noqa
+from xrspatial.resample import resample # noqa
 from xrspatial.proximity import direction # noqa
 from xrspatial.proximity import euclidean_distance # noqa
 from xrspatial.proximity import great_circle_distance # noqa
```
