Add Hypothesis property/fuzz tests for geotiff (#1661)#1666
Merged
Conversation
…ps (#1661) Adds xrspatial/geotiff/tests/test_fuzz_hypothesis_1661.py with three property groups: round-trip equality through to_geotiff/open_geotiff, IFD layout permutations via make_minimal_tiff, and single-byte mutation of a small corpus. Hypothesis is gated with pytest.importorskip so the suite still collects without the dependency installed. The byte-mutation property surfaced three real bugs on first run. All three turned into bare IndexError / ZeroDivisionError on a corrupt header byte instead of the documented ValueError / TypeError contract. Fixed: - _read_strips ZeroDivisionError when RowsPerStrip == 0 - _read_strips IndexError when StripByteCounts is shorter than the expected strip count (placed after the existing dimension safety check so a crafted huge-header still raises the safety-limit error first) - IFD.sample_format IndexError when SampleFormat has count=0; falls back to the default sample_format = 1 (unsigned int) Three targeted regression tests lock in the fixes alongside the property tests that surfaced them.
brendancol
added a commit
that referenced
this pull request
May 12, 2026
Merge origin/main into the SSRF-hardening branch and address the Copilot review comments. Redirect handling (review #1, #2): - urllib3 path: stop delegating redirect-follow to urllib3. _HTTPSource now GETs with redirect=False and walks the 3xx chain in Python, re-validating each Location against _validate_http_url. Caps the chain at _HTTP_MAX_REDIRECTS (5). A public URL can no longer 3xx into a loopback or private IP. - stdlib path: install _ValidatingRedirectHandler on a module-level opener so the stdlib fallback gets the same per-hop re-validation and redirect cap. read_range / read_all both go through it now. Scheme allow-list (review #3): - Drop XRSPATIAL_GEOTIFF_ALLOWED_SCHEMES. _HTTPSource is a urllib3 / urllib Range-GET implementation; widening the validator without widening the source just moved the failure to connect time. fsspec handles every other scheme:// via _open_source's _CloudSource branch. Env-var clamp (review #4): - _max_tile_bytes_from_env now falls back to the default for zero or negative values, matching the timeout helpers. The previous max(1, val) clamp silently rejected every tile on a typoed XRSPATIAL_COG_MAX_TILE_BYTES=-1. Docs and comments (review #5, #6): - "Security and I/O limits" docs: explain that non-http(s) schemes dispatch via fsspec and that the redirect cap re-validates each hop. Drop the misleading XRSPATIAL_GEOTIFF_ALLOWED_SCHEMES example. - test_read_to_array_rejects_file_url: comment corrected to match _open_source routing file:// through fsspec, and the expected- exception list now includes ImportError. Merge conflict resolution against origin/main: - __init__.py and test_features.py: keep both GeoTIFFFallbackWarning (#1662) and UnsafeURLError in __all__ / TestPublicAPI.expected. - _reader.py _read_strips: byte-cap (this PR) and RowsPerStrip<=0 guard (#1666) both kept; cap runs first. - docs/geotiff.rst: "Security and I/O limits" and "Strict mode" live as sibling sections. xrspatial/geotiff/tests/: 1611 passed (3 pre-existing matplotlib palette failures unchanged).
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #1661.
Summary
xrspatial/geotiff/tests/test_fuzz_hypothesis_1661.pywith three property groups: round-trip equality, IFD layout permutations, and single-byte mutation.hypothesisis gated viapytest.importorskipso it stays optional.ValueError/TypeErrorparser contract.Bugs surfaced by Hypothesis
The fuzzer ran an exhaustive sweep over the 198-byte
le_strip_f32corpus member (4x4 float32 stripped, no geo). Three offsets, all flipped to0x00, produced non-typed exceptions:ZeroDivisionErrorin_read_strips: r0 // rpsrps <= 0->ValueErrorIndexError: tuple index out of rangein_read_stripsIndexError: tuple index out of rangeinIFD.sample_formatsample_format = 1when the tuple is emptyAfter the fixes, an exhaustive sweep across all four corpus members (
le_strip_f32,be_strip_u16,le_tiled_f32,le_geo_f32) produces zero non-typed exceptions across every offset/byte combination.Why this matters
These are the same bug shape as the recent commit log cited in the issue: a header field is read without a guard, a downstream operation assumes structure, and a corrupt or unusual input crashes with the wrong exception type. Hand-written tests cover the shapes the author thinks of; Hypothesis exercises the rest of the byte space cheaply.
Followups (not in this PR)
hypothesistosetup.cfg [tests]once the harness has run in CI for a release cycle without flaking.Test plan
pytest xrspatial/geotiff/tests/test_fuzz_hypothesis_1661.py -v- 10 passed in 2.77spytest xrspatial/geotiff/tests/- 1476 passed, 7 skipped, 0 new failures (3 deselected pre-existing matplotlib RecursionErrors unrelated to this PR)