geotiff: bundle remaining ambiguous-metadata fail-closed checks (#1987)#2030
Open
brendancol wants to merge 1 commit into
Open
geotiff: bundle remaining ambiguous-metadata fail-closed checks (#1987)#2030brendancol wants to merge 1 commit into
brendancol wants to merge 1 commit into
Conversation
Bundles four of the five remaining #1987 slices (PRs 2, 3, 4, 7 in the issue's numbering). PR 6 (``ConflictingCRSError``) landed earlier as this branch's parent. PR 5 (``MixedBandMetadataError``) is intentionally deferred to a follow-up that also migrates ~35 existing VRT test fixtures via the new ``band_nodata='first'`` opt-out kwarg. Active checks added in this PR ------------------------------ * ``UnparseableCRSError`` (#1987 PR 2): - Write: typed the existing ``_validate_crs_fallback`` raise from plain ``ValueError`` to ``UnparseableCRSError``. No behaviour change (subclass relationship). - Read: new ``_check_read_unparseable_crs`` that runs ``pyproj.CRS. from_user_input`` on ``crs_wkt`` and raises if pyproj cannot parse. Tolerates pyproj-parseable placeholders like ``"EPSG:4326"`` that the GDAL VRT ``<SRS>`` convention stashes into ``crs_wkt``. - Opt-out: ``allow_unparseable_crs=True`` kwarg, threaded through ``open_geotiff`` / ``read_geotiff_dask`` / ``read_geotiff_gpu`` / ``read_vrt``. * ``RotatedTransformError`` (#1987 PR 3): - Read: new ``_check_read_rotated_transform`` that rejects affine transforms with non-zero b / d terms (rasterio Affine ordering). Downstream xrspatial ops (slope, aspect, hillshade, proximity, zonal) assume axis-aligned grids; a rotated transform silently produced wrong results. - Opt-out: ``allow_rotated=True`` kwarg, same threading as ``allow_unparseable_crs``. * ``NonUniformCoordsError`` (#1987 PR 4): - Write: new ``_check_write_non_uniform_coords`` that diffs the ``y`` and ``x`` coord arrays against the first step and rejects when relative drift exceeds 1e-5 (mirrors the existing #1720 coord-regularity tolerance). The int-dtype sentinel from #1969 is exempted (the no-georef fallback uses 0..N-1 ints which the writer treats specially). - No new kwarg: the fix is to resample, not to opt out. * ``ConflictingNodataError`` (#1987 PR 7): - Write: new ``_check_write_conflicting_nodata`` that refuses when ``attrs['nodata']`` disagrees with every concrete entry in ``attrs['nodatavals']``. ``None`` and NaN entries in the rioxarray tuple are skipped (same convention as ``_resolve_nodata_attr``). NaN as the canonical value paired with a concrete numeric in the tuple also raises -- "NaN is the sentinel" and "X is the sentinel" contradict. - Opt-out: explicit ``nodata=`` writer kwarg overrides both attrs and bypasses the check. Shared infrastructure --------------------- * ``_attrs.py`` gains ``_validate_read_geo_info(geo_info, *, window, allow_rotated, allow_unparseable_crs)`` -- one helper called from the four read backends so the check site is uniform. * Each writer entry point (``to_geotiff``, ``_write_vrt_tiled``, ``write_geotiff_gpu``) now builds a context dict carrying ``crs_kwarg`` / ``attrs_crs`` / ``attrs_crs_wkt`` / ``nodata_kwarg`` / ``attrs_nodata`` / ``attrs_nodatavals`` / ``coord_y`` / ``coord_x`` and feeds it to ``validate_write_metadata``. Deferred: MixedBandMetadataError (#1987 PR 5) --------------------------------------------- The mixed-band check function and its constants are defined in ``_validation.py`` but the registration call is commented out. The ``band_nodata=`` kwarg is threaded through both ``read_vrt`` and ``_read_vrt_chunked`` so the follow-up PR is a one-line registration plus the test-fixture migration. About 35 existing VRT tests would need to opt in to ``band_nodata='first'`` to keep their legacy assertions, which is its own commit. Test updates ------------ * ``test_attrs_contract_aliases_1984.py::test_canonical_nodata_wins_over_aliases`` split into a resolver-layer test (still asserts canonical wins) and a write-layer test (now asserts ``ConflictingNodataError``). * ``test_nodata_attr_aliases_1582.py::test_explicit_nodata_attr_wins_over_aliases`` updated to expect ``ConflictingNodataError`` on the disagreement and to show the ``nodata=`` kwarg opt-out. * ``test_reader_kwarg_order_1935.py`` updated: the new ``allow_rotated`` / ``allow_unparseable_crs`` kwargs join the canonical order; ``band_nodata`` goes in ``read_vrt``'s ``allowed_tail`` since it is VRT-specific; ``read_geotiff_gpu``'s deprecated ``gpu`` alias moved back to the tail. * ``test_ambiguous_metadata_hooks_1987.py`` framework tests now use an opaque ``_dispatch_probe`` payload key instead of ``"crs_wkt": "EPSG:4326"`` / ``"MALFORMED"`` placeholders that the newly registered ``_check_read_unparseable_crs`` would refuse. * ``test_remaining_fail_closed_1987.py`` -- 19 new tests covering each active check plus the round-trip read-write contract. Verification ------------ - ``pytest xrspatial/geotiff/tests/ -k 'not gpu and not cuda'`` -- 3013 passed (was 2994 pre-PR, +19 new). - ``pytest xrspatial/geotiff/tests/test_remaining_fail_closed_1987.py -v`` -- 19 passed. Refs #1987.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Bundles four of the five remaining #1987 slices:
UnparseableCRSError,RotatedTransformError,NonUniformCoordsError, andConflictingNodataError.ConflictingCRSErroralready merged via #2021.MixedBandMetadataErroris deferred to a follow-up so the ~35 existing VRT test fixtures can migrate to the newband_nodata='first'opt-out in their own commit.Builds on
This branch was authored on top of 1987-pr1-conflicting-crs-write so the two stay clean together. Easiest if #2021 lands first.
Active checks added
UnparseableCRSError(#1987 PR 2)_validate_crs_fallbackraise from plainValueErrortoUnparseableCRSError(subclass; no behaviour change)._check_read_unparseable_crsthat runspyproj.CRS.from_user_inputoncrs_wkt. ToleratesEPSG:4326-style placeholders (the GDAL<SRS>convention).allow_unparseable_crs=Truekwarg, threaded throughopen_geotiff/read_geotiff_dask/read_geotiff_gpu/read_vrt.RotatedTransformError(#1987 PR 3)allow_rotated=True, same threading.NonUniformCoordsError(#1987 PR 4)ConflictingNodataError(#1987 PR 7)attrs['nodata']disagrees with every concrete entry inattrs['nodatavals'].NoneandNaNentries in the rioxarray tuple are skipped. NaN canonical + concrete tuple entry also raises (contradictory sentinels).nodata=writer kwarg overrides both attrs.Shared infrastructure
_attrs.pygains_validate_read_geo_info(...), a one-call wrapper used from all four read backends so each backend has a single check call site.crs_kwarg/attrs_crs/attrs_crs_wkt/nodata_kwarg/attrs_nodata/attrs_nodatavals/coord_y/coord_xand feeds it tovalidate_write_metadata.Deferred:
MixedBandMetadataError(#1987 PR 5)The check function and constants are in
_validation.pybut the registration is commented out. Theband_nodata=kwarg is already threaded throughread_vrt/_read_vrt_chunkedso the follow-up is a one-line registration plus the test sweep over ~35 fixtures.Test updates
test_attrs_contract_aliases_1984.py::test_canonical_nodata_wins_over_aliasesnodata=kwarg opt-out).test_nodata_attr_aliases_1582.py::test_explicit_nodata_attr_wins_over_aliasesConflictingNodataErrorplus the kwarg-bypass path.test_reader_kwarg_order_1935.pyband_nodataallowed inread_vrt's tail;gpualias moved back to last onread_geotiff_gpu.test_ambiguous_metadata_hooks_1987.py_dispatch_probe) instead of thecrs_wkt/MALFORMEDplaceholders the new check would refuse.test_remaining_fail_closed_1987.pyTest plan
pytest xrspatial/geotiff/tests/ -k "not gpu and not cuda"— 3013 passed (+19 new tests vs pre-PR baseline).pytest xrspatial/geotiff/tests/test_remaining_fail_closed_1987.py -v— 19 passed.Refs #1987.