Fix open_geotiff eager out-of-bounds window error (#1634)#1636
Merged
Conversation
`open_geotiff(path, window=...)` on the eager (numpy) path produced an opaque `CoordinateValidationError` when the window extended past the source extent. `read_to_array` clamped the bad window to file bounds and returned a smaller array, but the eager code in `open_geotiff` used the unclamped window indices for y/x coord generation and the windowed transform shift in `_populate_attrs_from_geo_info`, so the coord array length differed from the data and xarray refused to construct the DataArray. The dask path (`read_geotiff_dask`) already validated the window up front since #1561, raising a clear `ValueError` with the format `window=... is outside the source extent (HxW) or has non-positive size.` The two backends therefore disagreed on the contract. Fix: validate `window` up front in `open_geotiff`'s eager branch via `_read_geo_info` (metadata-only read, O(1) memory, no extra pixel cost) using the exact same condition the dask path applies, raising the same `ValueError` message format. Out-of-bounds windows now fail with one consistent error regardless of which backend the user requests. 12 regression tests in `test_window_out_of_bounds_1634.py` cover: negative-start, past-right-edge, past-bottom-edge, past-both-edges, zero-size, inverted-window, full-extent in-bounds, interior subset, edge-aligned, cross-backend parity (eager == dask), message-format parity, and the issue's exact reproducer. All 1286 existing non-network geotiff tests still pass.
Contributor
There was a problem hiding this comment.
Pull request overview
This PR fixes a backend inconsistency in xrspatial.geotiff.open_geotiff(..., window=...): the eager (NumPy) path now rejects out-of-bounds windows with the same clear ValueError message format as the dask path, instead of surfacing an opaque xarray CoordinateValidationError caused by coord/data length mismatches.
Changes:
- Added eager-path window extent validation in
open_geotiffto matchread_geotiff_dask’s contract and error message. - Added a regression test suite covering multiple out-of-bounds window shapes plus eager-vs-dask parity.
- Updated the internal sweep tracking CSV entry for the geotiff accuracy pass.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
xrspatial/geotiff/__init__.py |
Adds eager-path pre-validation for window to align error behavior with the dask backend. |
xrspatial/geotiff/tests/test_window_out_of_bounds_1634.py |
New regression tests ensuring eager and dask paths reject identical invalid windows consistently. |
.claude/sweep-accuracy-state.csv |
Records the issue #1634 sweep/audit status and summary. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Remove unused tempfile import flagged by F401 in test_window_out_of_bounds_1634.py. - Move the eager-path window-bounds check into read_to_array. The previous fix called _read_geo_info() up front in open_geotiff, which for file-like sources read the entire buffer before read_to_array read it again, doubling I/O. read_to_array already parses the IFD and has ifd.height / ifd.width before invoking _read_tiles / _read_strips, so the check piggybacks on the existing parse. Same ValueError message; one metadata pass for both backends. For path-based sources the previous code was cheap (mmap is lazy) but for BytesIO / fsspec sources this drops one full read of the buffer per windowed eager call.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
open_geotiff(path, window=...)on the eager (numpy) path produced an opaqueCoordinateValidationErrorwhen the window extended past the source extent. The dask path already rejected the same input with a clearValueError, so the two backends diverged on the contract.windowup front in the eager branch via_read_geo_info(metadata-only read, no extra pixel cost) using the exact condition the dask path applies, raising the same message format.test_window_out_of_bounds_1634.pylock the eager path to the dask path's contract.Closes #1634.
Test plan
pytest xrspatial/geotiff/tests/test_window_out_of_bounds_1634.py-- 12 new tests, all pass post-fix and verified to fail pre-fix.pytest xrspatial/geotiff/tests/excluding two HTTP-network tests and one pre-existing matplotlib failure unrelated to this change -- 1286 passed, 7 skipped (env-conditional).ValueErrorwith the same message format on the same bad window.