Add memory guard for proximity line-sweep dask path #1113
Merged
brendancol merged 2 commits into issue-1110 on Mar 31, 2026
When max_distance >= raster diagonal, the line-sweep algorithm rechunks the entire array into a single chunk. Add a memory estimate check (~35 bytes/pixel working memory) that raises ValueError before the rechunk if the working set would exceed 80% of available RAM.
When max_distance >= raster diagonal and the non-KDTree path is used (GREAT_CIRCLE metric or no scipy), the line-sweep rechunks to a single chunk. The existing memory guard already catches this for the GREAT_CIRCLE case. Add a pre-rechunk estimate (~35 bytes/pixel) to _process_dask for the general case, raising ValueError before the rechunk if working memory would exceed 80% of available RAM. The EUCLIDEAN/MANHATTAN + scipy path uses the memory-safe KDTree and already has its own guards.
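The pre-rechunk check described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the function names, the constant names, and the `available_bytes` parameter are hypothetical, and only the ~35 bytes/pixel estimate, the 80% threshold, and the ValueError come from the commit message. A caller would presumably obtain `available_bytes` from the operating system, for example via `psutil.virtual_memory().available`.

```python
# Hypothetical sketch of a pre-rechunk memory guard; names are illustrative only.
BYTES_PER_PIXEL = 35     # approximate line-sweep working memory per pixel (from the PR text)
MAX_RAM_FRACTION = 0.8   # refuse to rechunk if the estimate exceeds 80% of available RAM


def estimate_working_bytes(height, width, bytes_per_pixel=BYTES_PER_PIXEL):
    """Estimate working memory for a line-sweep over a single full-raster chunk."""
    return int(height) * int(width) * bytes_per_pixel


def check_single_chunk_rechunk(height, width, available_bytes):
    """Raise ValueError before rechunking if the working set would not fit.

    `available_bytes` is passed in so the check stays easy to test; returns
    the estimated working-set size in bytes when the check passes.
    """
    needed = estimate_working_bytes(height, width)
    limit = MAX_RAM_FRACTION * available_bytes
    if needed > limit:
        raise ValueError(
            f"line-sweep would rechunk a {height}x{width} raster into one chunk "
            f"needing ~{needed / 1e9:.1f} GB, above {MAX_RAM_FRACTION:.0%} of "
            f"available memory ({available_bytes / 1e9:.1f} GB)"
        )
    return needed


if __name__ == "__main__":
    gib = 2 ** 30
    # A 10,000 x 10,000 raster needs about 3.5 GB of working memory.
    print(check_single_chunk_rechunk(10_000, 10_000, available_bytes=8 * gib))
    try:
        check_single_chunk_rechunk(10_000, 10_000, available_bytes=4 * gib)
    except ValueError as exc:
        print("guard tripped:", exc)
```

Passing the available memory in as an argument, rather than querying it inside the check, is a design choice made for this sketch so that a test can simulate a tight-memory machine without patching system calls.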
Summary
Add a pre-rechunk memory estimate (~35 bytes/pixel) to _process_dask that raises ValueError before the single-chunk rechunk if working memory would exceed 80% of available RAM
Context
Found during performance sweep triage (#1111). The default max_distance=np.inf with scipy installed routes to the memory-safe KDTree path, so the initial "every dask user OOMs" assessment was overstated. The OOM risk is real for GREAT_CIRCLE or when scipy is missing, and those paths already had guards (lines 1309-1324). This adds an additional guard in _process_dask for the general line-sweep case.
Test plan
test_proximity_dask_inf_distance_memory_guard: verifies that MemoryError is raised for GREAT_CIRCLE with tight memory