Add memory guard to cost_distance iterative Dijkstra + da.block assembly by brendancol · Pull Request #1119 · xarray-contrib/xarray-spatial

brendancol · 2026-03-31T18:11:43Z

Summary

Add a memory guard before _preprocess_tiles: it estimates memory use at roughly 3x the dataset size (source + friction cache + result) and raises MemoryError if that estimate would exceed 80% of available RAM, with a message suggesting a finite max_cost
Replace the np.concatenate result assembly with da.block so that tile results are no longer merged into one monolithic numpy array
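The guard described above can be sketched as follows. This is a minimal illustration, not the code from the PR: the function name `guard_memory` and its parameters are hypothetical, and only the 3x multiplier, the 80% threshold, the MemoryError, and the hint about max_cost come from the summary.

```python
import math


def guard_memory(shape, itemsize, available_bytes, copies=3, max_fraction=0.8):
    """Raise MemoryError if the estimated working set exceeds the RAM budget.

    Hypothetical helper: `copies=3` stands for source + friction cache + result,
    and `max_fraction=0.8` is the 80% threshold mentioned in the summary.
    """
    # Estimated bytes needed if every tile is held in memory at once.
    estimated = copies * math.prod(shape) * itemsize
    budget = max_fraction * available_bytes
    if estimated > budget:
        raise MemoryError(
            f"Estimated {estimated / 1e9:.1f} GB needed, but only "
            f"{budget / 1e9:.1f} GB is within budget. "
            "Consider passing a finite max_cost."
        )
    return estimated


# A 1000 x 1000 float64 raster fits easily within 64 GiB of available RAM.
small_estimate = guard_memory((1000, 1000), 8, 64 * 2**30)
```

The available-RAM figure is passed in explicitly here to keep the sketch dependency-free. One option is to read it from psutil.virtual_memory().available; whether the PR does so is not stated on this page.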

Context

Found during performance sweep triage (#1118). The iterative tiled Dijkstra path (triggered when max_cost=inf or when the implied radius exceeds the chunk dimensions) caches all tiles in RAM via dask.compute(*blocks). _assemble_result then used np.concatenate to merge everything into a single numpy array. At 30TB, either step is fatal.

The da.block change means the assembled result stays as a dask array with proper chunk structure, avoiding the second full materialization.
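A small sketch of the assembly idea, assuming a 2D grid of already-computed numpy tiles. The helper name `assemble_tiles` and the toy tile sizes are hypothetical; dask.delayed, da.from_delayed, and da.block are existing dask APIs, but the exact way the PR combines them is not shown on this page.

```python
import numpy as np
import dask
import dask.array as da


def assemble_tiles(tile_grid):
    """Build a chunked dask array from a nested list of numpy tiles.

    Hypothetical helper: each tile becomes one chunk of the result, so no
    single full-size numpy array is created during assembly.
    """
    rows = []
    for tile_row in tile_grid:
        lazy_row = [
            # Wrap each tile lazily; shape and dtype must be given up front.
            da.from_delayed(dask.delayed(tile), shape=tile.shape, dtype=tile.dtype)
            for tile in tile_row
        ]
        rows.append(lazy_row)
    # da.block joins the nested list along the last two axes, like np.block.
    return da.block(rows)


# Toy 2 x 2 grid of 2 x 3 tiles, each filled with a distinct value.
tiles = [
    [np.full((2, 3), 0.0), np.full((2, 3), 1.0)],
    [np.full((2, 3), 2.0), np.full((2, 3), 3.0)],
]
result = assemble_tiles(tiles)
```

Here `result` is a dask array whose chunks mirror the tile grid. Calling `result.compute()` would still materialize everything, so the benefit only holds when downstream code keeps working chunk by chunk.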

Test plan

All 44 existing cost_distance tests pass (verified)

…1118)
- Add memory guard before _preprocess_tiles: estimates ~3x dataset (source + friction cache + result) and raises MemoryError if it would exceed 80% of available RAM, suggesting finite max_cost.
- Replace np.concatenate assembly with da.block to avoid building a monolithic numpy array from tile results. Tiles are now wrapped in dask.delayed and assembled lazily.

brendancol added 4 commits March 31, 2026 06:54

Add sweep-performance design spec

09b92b3

Parallel subagent triage + ralph-loop workflow for auditing all xrspatial modules for performance bottlenecks, OOM risk under 30TB dask workloads, and backend-specific anti-patterns.

Add sweep-performance implementation plan

0f243bf

7 tasks covering command scaffold, module scoring, parallel subagent dispatch, report merging, ralph-loop generation, and smoke tests.

Add sweep-performance slash command

4087176

github-actions bot added the performance label (PR touches performance-sensitive code) Mar 31, 2026

brendancol merged commit 1d60fbd into master Mar 31, 2026
11 checks passed

brendancol mentioned this pull request Apr 1, 2026

Add ASV benchmarks for 6 modules changed in v0.9.5 #1137

Closed

brendancol merged 4 commits into master from issue-1118
