Skip to content

Commit 8efb6d0

Browse files
authored
Add memory guard to erosion dask paths (#1121)
* Add sweep-performance design spec Parallel subagent triage + ralph-loop workflow for auditing all xrspatial modules for performance bottlenecks, OOM risk under 30TB dask workloads, and backend-specific anti-patterns. * Add sweep-performance implementation plan 7 tasks covering command scaffold, module scoring, parallel subagent dispatch, report merging, ralph-loop generation, and smoke tests. * Add sweep-performance slash command * Add memory guard to erosion dask paths (#1120) Particle erosion is inherently global and cannot be chunked. Add _check_erosion_memory that estimates ~3x working set (input + brush scratch + output) and raises MemoryError before .compute() when the array won't fit in available RAM.
1 parent 1d60fbd commit 8efb6d0

File tree

1 file changed

+22
-0
lines changed

1 file changed

+22
-0
lines changed

xrspatial/erosion.py

Lines changed: 22 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -332,12 +332,33 @@ def _erode_numpy(data, random_pos, boy, box, bw, params):
332332
return hm.astype(np.float32)
333333

334334

335+
def _check_erosion_memory(data):
336+
"""Raise MemoryError if the array is too large to materialize."""
337+
estimated = np.prod(data.shape) * data.dtype.itemsize
338+
# The erosion kernel needs ~3x: input copy + brush scratch + output
339+
working = estimated * 3
340+
try:
341+
from xrspatial.zonal import _available_memory_bytes
342+
avail = _available_memory_bytes()
343+
except ImportError:
344+
avail = 2 * 1024**3
345+
if working > 0.8 * avail:
346+
raise MemoryError(
347+
f"erode() needs ~{working / 1e9:.1f} GB to materialize and "
348+
f"process the full grid but only ~{avail / 1e9:.1f} GB "
349+
f"available. Particle erosion is a global operation that "
350+
f"cannot be chunked. Downsample the input or use a machine "
351+
f"with more RAM."
352+
)
353+
354+
335355
def _erode_dask_numpy(data, random_pos, boy, box, bw, params):
336356
"""Run erosion on a dask+numpy array.
337357
338358
Erosion is a global operation (particles traverse the full grid),
339359
so we materialize to numpy, run the CPU kernel, then re-wrap.
340360
"""
361+
_check_erosion_memory(data)
341362
np_data = data.compute()
342363
result = _erode_numpy(np_data, random_pos, boy, box, bw, params)
343364
return da.from_array(result, chunks=data.chunksize)
@@ -349,6 +370,7 @@ def _erode_dask_cupy(data, random_pos, boy, box, bw, params):
349370
Materializes to a single CuPy array, runs the GPU kernel, then
350371
re-wraps as dask.
351372
"""
373+
_check_erosion_memory(data)
352374
cp_data = data.compute() # CuPy ndarray
353375
result = _erode_cupy(cp_data, random_pos, boy, box, bw, params)
354376
return da.from_array(result, chunks=data.chunksize,

0 commit comments

Comments
 (0)