Skip to content

Commit 05fc5f0

Browse files
authored
Fixes #875: add out-of-core dask CPU viewshed (#897)
* Fixes #875: add out-of-core dask CPU viewshed Add a dask+numpy backend for viewshed() using a three-tier strategy: A. max_distance specified → extract spatial window, compute, run exact R2 angular sweep on the small numpy window. B. Full R2 fits in memory (280 bytes/cell < 50% available RAM) → compute the full dask array, run R2 directly. C. Otherwise → horizon-profile distance sweep algorithm that processes cells in Chebyshev-ring order with an LRU chunk cache, keeping memory bounded regardless of grid size. Also adds a max_distance parameter to viewshed() for all backends, a memory guard for Tier C, and four new dask-specific tests. * Add max_distance support for all backends and dask+cupy handling - Extract max_distance window logic into _viewshed_windowed() that works for numpy, cupy, dask+numpy, and dask+cupy uniformly - Handle dask+cupy in _viewshed_dask(): Tier B computes to cupy and uses GPU RTX when available; Tier C converts cupy chunks to numpy for the distance-sweep fallback - Add tests: numpy max_distance, max_distance-vs-full comparison (numpy + cupy parametrized) * Update README feature matrix: viewshed now supports all 4 backends * Fix OOM in dask max_distance: build output lazily with _dask_embed_window _viewshed_windowed was allocating np.full((H, W), ...) for the output before wrapping it as dask — instant OOM on a 30TB input even with max_distance set. Now for dask inputs the output is built chunk-by-chunk: overlapping chunks get a concrete numpy block, all others are lazy da.full blocks that consume no memory until materialized. Adds test_viewshed_dask_max_distance_lazy_output which creates a 100k x 100k (80GB) dask raster and verifies the output stays lazy. * Apply circular distance mask to max_distance output The window extraction uses a square (Chebyshev) region but max_distance is a Euclidean radius. Add a circular mask after computing the window result so cells at the corners beyond max_distance are set to INVISIBLE. Update tests to account for the circular boundary.
1 parent 8c2b75a commit 05fc5f0

File tree

3 files changed

+738
-8
lines changed

3 files changed

+738
-8
lines changed

README.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -225,7 +225,7 @@ In the GIS world, rasters are used for representing continuous phenomena (e.g. e
225225
| [Hillshade](xrspatial/hillshade.py) | Simulates terrain illumination from a given sun angle and azimuth | ✅️ | ✅️ | ✅️ | ✅️ |
226226
| [Slope](xrspatial/slope.py) | Computes terrain gradient steepness at each cell in degrees | ✅️ | ✅️ | ✅️ | ✅️ |
227227
| [Terrain Generation](xrspatial/terrain.py) | Generates synthetic terrain elevation using fractal noise | ✅️ | ✅️ | ✅️ | |
228-
| [Viewshed](xrspatial/viewshed.py) | Determines visible cells from a given observer point on terrain | ✅️ | | | |
228+
| [Viewshed](xrspatial/viewshed.py) | Determines visible cells from a given observer point on terrain | ✅️ | ✅️ | ✅️ | ✅️ |
229229
| [Min Observable Height](xrspatial/experimental/min_observable_height.py) | Finds the minimum observer height needed to see each cell *(experimental)* | ✅️ | | | |
230230
| [Perlin Noise](xrspatial/perlin.py) | Generates smooth continuous random noise for procedural textures | ✅️ | ✅️ | ✅️ | |
231231
| [Bump Mapping](xrspatial/bump.py) | Adds randomized bump features to simulate natural terrain variation | ✅️ | | | |

xrspatial/tests/test_viewshed.py

Lines changed: 201 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,4 @@
1+
import dask.array as da
12
import datashader as ds
23
import numpy as np
34
import pandas as pd
@@ -6,6 +7,7 @@
67

78
from xrspatial import viewshed
89
from xrspatial.tests.general_checks import general_output_checks
10+
from xrspatial.viewshed import INVISIBLE
911

1012
from ..gpu_rtx import has_rtx
1113

@@ -159,3 +161,202 @@ def test_viewshed_flat(backend, observer_elev, target_elev):
159161
# Should do better with viewshed gpu output angles.
160162
mask = (v.data < 90)
161163
np.testing.assert_allclose(v.data[mask], angle[mask], atol=0.03)
164+
165+
166+
# -------------------------------------------------------------------
167+
# Dask backend tests
168+
# -------------------------------------------------------------------
169+
170+
@pytest.mark.parametrize("observer_elev", [5, 2])
171+
@pytest.mark.parametrize("target_elev", [0, 1])
172+
def test_viewshed_dask_flat(observer_elev, target_elev):
173+
"""Flat terrain: dask should match analytical formula."""
174+
x, y = 0, 0
175+
ny, nx = 5, 4
176+
arr = da.from_array(np.full((ny, nx), 1.3), chunks=(3, 2))
177+
xs = np.arange(nx) * 0.5
178+
ys = np.arange(ny) * 1.5
179+
xarr = xa.DataArray(arr, coords=dict(x=xs, y=ys), dims=["y", "x"])
180+
v = viewshed(xarr, x=x, y=y,
181+
observer_elev=observer_elev, target_elev=target_elev)
182+
result = v.values
183+
184+
xs2, ys2 = np.meshgrid(xs, ys)
185+
d_vert = observer_elev - target_elev
186+
d_horz = np.sqrt((xs2 - x) ** 2 + (ys2 - y) ** 2)
187+
expected = np.rad2deg(np.arctan2(d_horz, d_vert))
188+
expected[0, 0] = result[0, 0] # observer cell
189+
np.testing.assert_allclose(result, expected)
190+
191+
192+
def test_viewshed_dask_matches_numpy():
193+
"""Dask should closely match numpy R2 sweep on small varied terrain."""
194+
np.random.seed(42)
195+
ny, nx = 12, 10
196+
terrain = np.random.uniform(0, 5, (ny, nx))
197+
xs = np.arange(nx, dtype=float)
198+
ys = np.arange(ny, dtype=float)
199+
200+
# numpy reference
201+
raster_np = xa.DataArray(terrain.copy(), coords=dict(x=xs, y=ys),
202+
dims=["y", "x"])
203+
v_np = viewshed(raster_np, x=5.0, y=6.0, observer_elev=10)
204+
205+
# dask (will use Tier B — full compute + R2 — on this small grid)
206+
raster_da = xa.DataArray(
207+
da.from_array(terrain.copy(), chunks=(4, 5)),
208+
coords=dict(x=xs, y=ys), dims=["y", "x"])
209+
v_da = viewshed(raster_da, x=5.0, y=6.0, observer_elev=10)
210+
211+
np.testing.assert_allclose(v_da.values, v_np.values)
212+
213+
214+
def test_viewshed_dask_max_distance():
215+
"""max_distance should produce partial viewshed within radius."""
216+
ny, nx = 20, 20
217+
arr_np = np.zeros((ny, nx))
218+
xs = np.arange(nx, dtype=float)
219+
ys = np.arange(ny, dtype=float)
220+
221+
raster_da = xa.DataArray(
222+
da.from_array(arr_np, chunks=(10, 10)),
223+
coords=dict(x=xs, y=ys), dims=["y", "x"])
224+
v = viewshed(raster_da, x=10.0, y=10.0,
225+
observer_elev=5, max_distance=5.0)
226+
result = v.values
227+
228+
# Observer cell is visible
229+
assert result[10, 10] == 180.0
230+
231+
# Cells far beyond max_distance should be INVISIBLE
232+
assert result[0, 0] == INVISIBLE
233+
assert result[19, 19] == INVISIBLE
234+
235+
# Cells within max_distance should be visible (flat terrain, observer up)
236+
assert result[10, 12] > INVISIBLE # 2 cells away
237+
assert result[8, 10] > INVISIBLE # 2 cells away
238+
239+
240+
def test_viewshed_dask_distance_sweep():
241+
"""Force the distance-sweep path (Tier C) and verify flat terrain."""
242+
from unittest.mock import patch
243+
244+
ny, nx = 10, 10
245+
arr_np = np.full((ny, nx), 0.0)
246+
xs = np.arange(nx, dtype=float)
247+
ys = np.arange(ny, dtype=float)
248+
249+
raster_da = xa.DataArray(
250+
da.from_array(arr_np, chunks=(5, 5)),
251+
coords=dict(x=xs, y=ys), dims=["y", "x"])
252+
253+
# R2 needs 280*10*10=28000 bytes; Tier B requires <50% of avail.
254+
# Output grid is 10*10*8=800 bytes; memory guard requires <80% of avail.
255+
# 10000 bytes: skips Tier B (28000 > 5000) and passes guard (800 < 8000).
256+
with patch('xrspatial.viewshed._available_memory_bytes',
257+
return_value=10_000):
258+
v = viewshed(raster_da, x=5.0, y=5.0, observer_elev=5)
259+
260+
result = v.values
261+
assert result[5, 5] == 180.0
262+
# All cells on flat terrain should be visible
263+
assert (result > INVISIBLE).all()
264+
265+
266+
def test_viewshed_dask_max_distance_lazy_output():
267+
"""max_distance on a large dask raster should produce a lazy output
268+
without allocating the full grid in memory."""
269+
ny, nx = 100_000, 100_000 # 80 GB if materialized
270+
# Don't actually create the data — just define a lazy dask array
271+
single_chunk = da.zeros((1000, 1000), chunks=(1000, 1000),
272+
dtype=np.float64)
273+
# Tile to 100k x 100k via dask (no memory allocated)
274+
big = da.tile(single_chunk, (100, 100))
275+
xs = np.arange(nx, dtype=float)
276+
ys = np.arange(ny, dtype=float)
277+
raster = xa.DataArray(big, coords=dict(x=xs, y=ys), dims=["y", "x"])
278+
v = viewshed(raster, x=50000.0, y=50000.0,
279+
observer_elev=5, max_distance=10.0)
280+
# Output should be a dask array (lazy), not numpy
281+
assert isinstance(v.data, da.Array)
282+
assert v.shape == (ny, nx)
283+
# Only compute a small slice to verify correctness
284+
# max_distance=10 → radius_cells=10 → window is obs ± 10
285+
center = v.isel(y=slice(49989, 50012), x=slice(49989, 50012)).values
286+
# Observer is at index 11 within this 23-cell slice
287+
assert center[11, 11] == 180.0 # observer cell
288+
# Cells within the circle should be visible (flat terrain, observer up)
289+
assert center[11, 13] > INVISIBLE # 2 cells away
290+
# Corner (49989,49989) is sqrt(11^2+11^2) ≈ 15.6 from observer → outside
291+
assert center[0, 0] == INVISIBLE
292+
293+
294+
def test_viewshed_numpy_max_distance():
295+
"""max_distance should work on plain numpy raster too."""
296+
ny, nx = 20, 20
297+
arr_np = np.zeros((ny, nx))
298+
xs = np.arange(nx, dtype=float)
299+
ys = np.arange(ny, dtype=float)
300+
301+
raster_np = xa.DataArray(arr_np, coords=dict(x=xs, y=ys),
302+
dims=["y", "x"])
303+
v = viewshed(raster_np, x=10.0, y=10.0,
304+
observer_elev=5, max_distance=5.0)
305+
result = v.values
306+
307+
assert result[10, 10] == 180.0
308+
assert result[0, 0] == INVISIBLE
309+
assert result[19, 19] == INVISIBLE
310+
assert result[10, 12] > INVISIBLE
311+
312+
313+
@pytest.mark.parametrize("backend", ["numpy", "cupy"])
314+
def test_viewshed_max_distance_matches_full(backend):
315+
"""max_distance results should match full viewshed within the radius."""
316+
if backend == "cupy":
317+
if not has_rtx():
318+
pytest.skip("rtxpy not available")
319+
else:
320+
import cupy as cp
321+
322+
ny, nx = 10, 8
323+
np.random.seed(123)
324+
arr = np.random.uniform(0, 3, (ny, nx))
325+
xs = np.arange(nx, dtype=float)
326+
ys = np.arange(ny, dtype=float)
327+
if backend == "cupy":
328+
arr_backend = cp.asarray(arr)
329+
else:
330+
arr_backend = arr.copy()
331+
332+
raster = xa.DataArray(arr_backend, coords=dict(x=xs, y=ys),
333+
dims=["y", "x"])
334+
v_full = viewshed(raster, x=4.0, y=5.0, observer_elev=10)
335+
336+
if backend == "cupy":
337+
arr_backend = cp.asarray(arr)
338+
else:
339+
arr_backend = arr.copy()
340+
raster2 = xa.DataArray(arr_backend, coords=dict(x=xs, y=ys),
341+
dims=["y", "x"])
342+
v_dist = viewshed(raster2, x=4.0, y=5.0, observer_elev=10,
343+
max_distance=3.5)
344+
345+
full_vals = v_full.values if isinstance(v_full.data, np.ndarray) \
346+
else v_full.data.get()
347+
dist_vals = v_dist.values if isinstance(v_dist.data, np.ndarray) \
348+
else v_dist.data.get()
349+
350+
# Within the circular radius, results should match
351+
obs_r, obs_c = 5, 4
352+
max_d = 3.5
353+
for r in range(max(0, obs_r - 3), min(ny, obs_r + 4)):
354+
for c in range(max(0, obs_c - 3), min(nx, obs_c + 4)):
355+
dr = (r - obs_r) * 1.0 # ns_res = 1
356+
dc = (c - obs_c) * 1.0 # ew_res = 1
357+
if np.sqrt(dr**2 + dc**2) > max_d:
358+
continue # outside circle — correctly INVISIBLE
359+
np.testing.assert_allclose(
360+
dist_vals[r, c], full_vals[r, c],
361+
atol=0.03,
362+
err_msg=f"Mismatch at ({r},{c})")

0 commit comments

Comments
 (0)