Simplify per-chunk interp1d dask path #1
Closed
thodson-usgs wants to merge 2 commits into
Routes ``xarray.interp(method="linear"|"nearest"|"slinear")`` on a dask-chunked core dim through a per-chunk dispatch instead of ``apply_ufunc(..., allow_rechunk=True)``. For each target point, look up the source chunk that contains its coord value and run the interpolator over that chunk plus a size-1 halo. Per-task memory scales with ``source_chunk + halo`` rather than the full interp axis.

The fallback path preserves the existing behavior for cubic, multi-dim interpn, non-monotonic source coord, empty target, and numpy input.

Verified against the existing apply_ufunc path on 200x400 -> 50x100 for several source-chunk layouts (bit-identical), on a 3D time-chunked input (time chunking preserved), and on the memory-constrained 6000x5000 case where the new path beats ``apply_ufunc`` by ~10x.

The per-chunk path materializes 1D source coords (searchsorted-based routing); data stays lazy. ``test_dataset_interp_datetime_dask`` bumped its ``raise_if_dask_computes`` budget to account for this.

Related: :issue:`9907` (already closed; same root cause) and :issue:`10130` (open; partial overlap: single-chunk-source cases still use the existing path, better addressed by the dask-side guard in dask/dask#12360).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
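The routing described in this commit message can be illustrated with a minimal NumPy-only sketch. It is not the PR's code: `interp_linear_per_chunk` and its arguments are hypothetical, `np.interp` stands in for xarray's interpolators, and a plain loop replaces the dask tasks. It also assumes a strictly increasing source coordinate and assigns each target to the chunk that owns its left neighbour, which may differ from the rule the PR uses.

```python
import numpy as np


def interp_linear_per_chunk(x, y, chunks, x_new):
    """Hypothetical sketch: linear interpolation routed one source chunk at a time."""
    n = len(x)
    # Chunk boundaries as index offsets, e.g. chunks=(4, 3, 3) -> [0, 4, 7, 10].
    bounds = np.concatenate([[0], np.cumsum(chunks)])
    # Index of the source point just left of each target, clipped so that
    # out-of-range targets fall into the first or last interval.
    left = (np.searchsorted(x, x_new, side="right") - 1).clip(0, n - 2)
    # Chunk that owns that left neighbour.
    chunk_id = (np.searchsorted(bounds, left, side="right") - 1).clip(0, len(chunks) - 1)

    out = np.empty(x_new.shape, dtype=float)
    for c in np.unique(chunk_id):
        sel = np.flatnonzero(chunk_id == c)
        # The chunk plus a size-1 halo on each side, trimmed at the array ends.
        lo = max(bounds[c] - 1, 0)
        hi = min(bounds[c + 1] + 1, n)
        out[sel] = np.interp(x_new[sel], x[lo:hi], y[lo:hi])
    return out


x = np.linspace(0.0, 9.0, 10)
y = x**2
x_new = np.array([-1.0, 0.5, 3.5, 4.0, 6.9, 8.7, 12.0])
result = interp_linear_per_chunk(x, y, chunks=(4, 3, 3), x_new=x_new)
reference = np.interp(x_new, x, y)  # whole-axis interpolation for comparison
```

Each loop iteration reads only `x[lo:hi]` and `y[lo:hi]`, which is why per-task memory in this sketch depends on the chunk size plus the halo and not on the length of the full axis.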
Reduce cognitive load in `_interp1d_dask_chunked` and its call site:

- Consolidate call-site guards (is_chunked_array, ndim, dim-in-dims) into one eligibility check at the top of the helper, dropping a four-level nested conditional.
- Drop `cast` calls and the `cast` import; they are unnecessary for the type checker here.
- Use `np.flatnonzero`, chained `.clip()`, `np.diff` for monotonicity, and `[slice(None)] * ndim` slicer, which are more pythonic than the longhand equivalents.
- Remove the "no blocks" defensive fallback (unreachable once `new_np.size > 0` and chunk assignments are clipped).
- Remove the unused `axis=axis` default-bind from the per-chunk kernel.
- Trim comments/docstrings to the why, not the what.
- Drop restated docstrings and one-use locals in the new tests.

No behavior change; test_interp.py suite passes unchanged.

Co-authored-by: Claude <noreply@anthropic.com>
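For readers unfamiliar with the NumPy idioms named in the list above, here is a small standalone sketch. The arrays and variable names are made up for illustration and are not taken from `_interp1d_dask_chunked`.

```python
import numpy as np

# np.diff for monotonicity: strictly increasing iff every step is positive.
coord = np.array([0.0, 1.0, 2.5, 4.0])
is_increasing = bool(np.all(np.diff(coord) > 0))

# Chained .clip(): bound the searchsorted result in the same expression.
edges = np.array([0, 4, 7, 10])           # illustrative chunk boundaries
positions = np.array([-3, 0, 5, 9, 12])   # illustrative lookup positions
chunk = (np.searchsorted(edges, positions, side="right") - 1).clip(0, len(edges) - 2)

# np.flatnonzero: integer indices where a boolean mask is true.
idx = np.flatnonzero(chunk == 2)

# [slice(None)] * ndim slicer: take a range along one axis of an N-d array.
data = np.arange(24).reshape(2, 4, 3)
axis = 1
slicer = [slice(None)] * data.ndim
slicer[axis] = slice(1, 3)
sub = data[tuple(slicer)]  # same as data[:, 1:3, :]
```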
729b5b9 to 9f4f458
08ab770 to beb87cf
Simplification squashed into PR pydata#11312 on pydata/xarray. [This is Claude Code on behalf of Tim Hodson]
Side-by-side comparison PR (not for merging upstream). Shows the one-commit simplification of `_interp1d_dask_chunked` on top of the current PR branch.

Summary

- Consolidates call-site guards (`is_chunked_array`, `ndim`, `dim in var.dims`) into one eligibility check at the top of the helper; drops a four-level nested conditional at the call site.
- Drops `cast` calls and the `cast` import; they are unneeded for the type checker here.
- Uses `np.flatnonzero`, chained `.clip()`, `np.diff` for monotonicity, and `[slice(None)] * ndim` slicer, which are more pythonic than the longhand equivalents.
- Removes the "no blocks" defensive fallback (unreachable once `new_np.size > 0` and chunk assignments are clipped).
- Removes the unused `axis=axis` default-bind from the per-chunk kernel.

No behavior change.
Test plan
- `pytest xarray/tests/test_interp.py -n auto` → 224 passed, 24 skipped, 1 xfailed
- `pre-commit run` on touched files passes

[This is Claude Code on behalf of Tim Hodson]