Commit 2ba24b1

geotiff: bundle remaining ambiguous-metadata fail-closed checks (#1987) (#2030) (#2031)
* geotiff: bundle remaining ambiguous-metadata fail-closed checks (#1987)

Bundles four of the five remaining #1987 slices (PRs 2, 3, 4, 7 in the
issue's numbering). PR 6 (``ConflictingCRSError``) landed earlier as this
branch's parent. PR 5 (``MixedBandMetadataError``) is intentionally
deferred to a follow-up that also migrates ~35 existing VRT test fixtures
via the new ``band_nodata='first'`` opt-out kwarg.

Active checks added in this PR
------------------------------

* ``UnparseableCRSError`` (#1987 PR 2):
  - Write: typed the existing ``_validate_crs_fallback`` raise from plain
    ``ValueError`` to ``UnparseableCRSError``. No behaviour change
    (subclass relationship).
  - Read: new ``_check_read_unparseable_crs`` that runs
    ``pyproj.CRS.from_user_input`` on ``crs_wkt`` and raises if pyproj
    cannot parse. Tolerates pyproj-parseable placeholders like
    ``"EPSG:4326"`` that the GDAL VRT ``<SRS>`` convention stashes into
    ``crs_wkt``.
  - Opt-out: ``allow_unparseable_crs=True`` kwarg, threaded through
    ``open_geotiff`` / ``read_geotiff_dask`` / ``read_geotiff_gpu`` /
    ``read_vrt``.
* ``RotatedTransformError`` (#1987 PR 3):
  - Read: new ``_check_read_rotated_transform`` that rejects affine
    transforms with non-zero b / d terms (rasterio Affine ordering).
    Downstream xrspatial ops (slope, aspect, hillshade, proximity, zonal)
    assume axis-aligned grids; a rotated transform silently produced
    wrong results.
  - Opt-out: ``allow_rotated=True`` kwarg, same threading as
    ``allow_unparseable_crs``.
* ``NonUniformCoordsError`` (#1987 PR 4):
  - Write: new ``_check_write_non_uniform_coords`` that diffs the ``y``
    and ``x`` coord arrays against the first step and rejects when
    relative drift exceeds 1e-5 (mirrors the existing #1720
    coord-regularity tolerance). The int-dtype sentinel from #1969 is
    exempted (the no-georef fallback uses 0..N-1 ints which the writer
    treats specially).
  - No new kwarg: the fix is to resample, not to opt out.
* ``ConflictingNodataError`` (#1987 PR 7):
  - Write: new ``_check_write_conflicting_nodata`` that refuses when
    ``attrs['nodata']`` disagrees with every concrete entry in
    ``attrs['nodatavals']``. ``None`` and NaN entries in the rioxarray
    tuple are skipped (same convention as ``_resolve_nodata_attr``). NaN
    as the canonical value paired with a concrete numeric in the tuple
    also raises -- "NaN is the sentinel" and "X is the sentinel"
    contradict.
  - Opt-out: explicit ``nodata=`` writer kwarg overrides both attrs and
    bypasses the check.

Shared infrastructure
---------------------

* ``_attrs.py`` gains ``_validate_read_geo_info(geo_info, *, window,
  allow_rotated, allow_unparseable_crs)`` -- one helper called from the
  four read backends so the check site is uniform.
* Each writer entry point (``to_geotiff``, ``_write_vrt_tiled``,
  ``write_geotiff_gpu``) now builds a context dict carrying ``crs_kwarg``
  / ``attrs_crs`` / ``attrs_crs_wkt`` / ``nodata_kwarg`` /
  ``attrs_nodata`` / ``attrs_nodatavals`` / ``coord_y`` / ``coord_x`` and
  feeds it to ``validate_write_metadata``.

Deferred: MixedBandMetadataError (#1987 PR 5)
---------------------------------------------

The mixed-band check function and its constants are defined in
``_validation.py`` but the registration call is commented out. The
``band_nodata=`` kwarg is threaded through both ``read_vrt`` and
``_read_vrt_chunked`` so the follow-up PR is a one-line registration plus
the test-fixture migration. About 35 existing VRT tests would need to opt
in to ``band_nodata='first'`` to keep their legacy assertions, which is
its own commit.

Test updates
------------

* ``test_attrs_contract_aliases_1984.py::test_canonical_nodata_wins_over_aliases``
  split into a resolver-layer test (still asserts canonical wins) and a
  write-layer test (now asserts ``ConflictingNodataError``).
* ``test_nodata_attr_aliases_1582.py::test_explicit_nodata_attr_wins_over_aliases``
  updated to expect ``ConflictingNodataError`` on the disagreement and to
  show the ``nodata=`` kwarg opt-out.
* ``test_reader_kwarg_order_1935.py`` updated: the new ``allow_rotated``
  / ``allow_unparseable_crs`` kwargs join the canonical order;
  ``band_nodata`` goes in ``read_vrt``'s ``allowed_tail`` since it is
  VRT-specific; ``read_geotiff_gpu``'s deprecated ``gpu`` alias moved
  back to the tail.
* ``test_ambiguous_metadata_hooks_1987.py`` framework tests now use an
  opaque ``_dispatch_probe`` payload key instead of
  ``"crs_wkt": "EPSG:4326"`` / ``"MALFORMED"`` placeholders that the
  newly registered ``_check_read_unparseable_crs`` would refuse.
* ``test_remaining_fail_closed_1987.py`` -- 19 new tests covering each
  active check plus the round-trip read-write contract.

Verification
------------

- ``pytest xrspatial/geotiff/tests/ -k 'not gpu and not cuda'`` -- 3013
  passed (was 2994 pre-PR, +19 new).
- ``pytest xrspatial/geotiff/tests/test_remaining_fail_closed_1987.py -v``
  -- 19 passed.

Refs #1987.

* geotiff: address review suggestions and nits on #1987 fail-closed bundle

- vrt.py: eager read_vrt now parses the VRT XML and runs the #1987
  read-side validators *before* _read_vrt_internal materialises the
  mosaic, so a rejected file fails fast instead of loading the full array
  into host memory first. The parsed VRTDataset is threaded through via
  the existing `parsed=` kwarg so the XML is parsed only once.
- _validation.py: extracted `_gdal_geotransform_to_affine_tuple` so the
  eager and chunked VRT call sites share one GDAL->rasterio reorder
  helper instead of duplicating the index shuffle.
- _attrs.py: documented in `_validate_read_geo_info` that the built
  transform is always axis-aligned (rotated TIFFs raise
  NotImplementedError upstream in `_geotags`), so the rotated check fires
  only on the VRT path.
- tests: added direct coverage for the row-axis rotation branch (GDAL
  GT[4] non-zero) and for the zero-step / constant-float-coord branch in
  `_check_write_non_uniform_coords`. Refactored the rotated-VRT fixture
  to take a `geo_transform` kwarg so both rotation axes share the same
  builder.
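The write-side coordinate check described in the message (diff each axis against its first step, reject relative drift above 1e-5, exempt integer coords) can be sketched roughly as follows. This is an illustration only, not the code in `_validation.py`: the function name `check_uniform_coords`, the stand-in exception class, and the choice to raise on a zero first step are all assumptions.

```python
class NonUniformCoordsError(ValueError):
    """Stand-in for the error class named in the commit message."""


def check_uniform_coords(coords, *, axis_name, rtol=1e-5):
    """Raise if ``coords`` is not evenly spaced within ``rtol`` (sketch)."""
    values = list(coords)
    # Assumption: all-integer coords are the 0..N-1 no-georef sentinel
    # and are exempt from the check.
    if all(isinstance(v, int) for v in values):
        return
    if len(values) < 2:
        return  # no step to measure
    steps = [b - a for a, b in zip(values, values[1:])]
    first = steps[0]
    if first == 0:
        # Assumption: a constant float axis yields no usable pixel size.
        raise NonUniformCoordsError(
            f"{axis_name}: first coordinate step is zero"
        )
    # Largest deviation from the first step, relative to that step.
    drift = max(abs(s - first) for s in steps) / abs(first)
    if drift > rtol:
        raise NonUniformCoordsError(
            f"{axis_name}: relative step drift {drift:.3g} exceeds {rtol}"
        )
```

A caller would run this once per axis before deriving the pixel size from the first two coordinate values.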
1 parent 8bc981d commit 2ba24b1

15 files changed

Lines changed: 1097 additions & 36 deletions

CHANGELOG.md

Lines changed: 4 additions & 0 deletions
@@ -13,6 +13,10 @@
 - Default internal `_vrt.read_vrt` `missing_sources` to `'raise'` so an unreadable VRT source no longer produces a silent zero-fill hole on integer rasters; pass `missing_sources='warn'` to opt back into the previous lenient behaviour (#1843)
 - Deprecate read-side emission of matplotlib colormap-derived attrs (cmap, colormap_rgba) on palette TIFFs; the writer cannot set Photometric=3 so they do not round-trip. Construct ListedColormap from attrs['colormap'] in caller code. These attrs still emit for now but trigger a DeprecationWarning. Removal planned for a future release. (#1984)
 - Reject writes whose `attrs['crs']` and `attrs['crs_wkt']` resolve to different CRSes (after pyproj canonicalisation) instead of silently emitting the EPSG and dropping the WKT. The new `ConflictingCRSError` (subclass of `ValueError` and `GeoTIFFAmbiguousMetadataError`) names the offending attrs; pass `crs=` explicitly to override both attrs and bypass the check. Read-back DataArrays carrying both attrs continue to round-trip because the reader's two attrs derive from the same on-disk CRS. (#1987)
+- Reject reads whose CRS string cannot be parsed by pyproj instead of emitting it verbatim in `attrs['crs_wkt']` and letting downstream code crash on first use. Raises `UnparseableCRSError` (subclass of `ValueError`); pass `allow_unparseable_crs=True` to `open_geotiff` / `read_geotiff_dask` / `read_geotiff_gpu` / `read_vrt` to keep the legacy behaviour. The existing write-side raise from `_validate_crs_fallback` was retyped from plain `ValueError` to the new `UnparseableCRSError` subclass (no behaviour change). (#1987)
+- Reject reads whose affine transform has non-zero rotation/shear terms instead of returning an axis-misaligned grid that downstream xrspatial ops (slope, aspect, hillshade, proximity, zonal) silently compute wrong results on. Raises `RotatedTransformError`; pass `allow_rotated=True` to `open_geotiff` / `read_geotiff_dask` / `read_geotiff_gpu` / `read_vrt` to read the pixel grid without the axis-aligned-grid assumption. (#1987)
+- Reject writes whose `coords['y']` or `coords['x']` are not uniformly spaced instead of silently using the first two values as the pixel size and misrepresenting the rest of the axis. Raises `NonUniformCoordsError`; the existing int-dtype sentinel convention from #1969 (used by the no-georef coord fallback) is exempted. (#1987)
+- Reject writes whose `attrs['nodata']` disagrees with every concrete entry in `attrs['nodatavals']` instead of silently picking the canonical scalar and dropping the rioxarray tuple. Raises `ConflictingNodataError`; pass `nodata=` explicitly to the writer to override both attrs and bypass the check. `_FillValue` continues to be deprecated in priority per the existing resolver convention. (#1987)
 
 
 ### Version 0.9.9 - 2026-05-05
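The `ConflictingNodataError` rule in the changelog entry above (canonical scalar must agree with at least one concrete tuple entry; `None` and NaN entries are skipped; NaN canonical against a concrete number conflicts; an explicit `nodata=` kwarg bypasses the check) might look something like the sketch below. The function name, the stand-in exception class, and the use of exact float equality are assumptions for illustration, not the project's implementation.

```python
import math


class ConflictingNodataError(ValueError):
    """Stand-in for the error class named in the changelog."""


def _is_nan(value):
    return isinstance(value, float) and math.isnan(value)


def check_conflicting_nodata(attrs, *, nodata_kwarg=None):
    """Raise when attrs['nodata'] contradicts attrs['nodatavals'] (sketch)."""
    if nodata_kwarg is not None:
        return  # explicit writer kwarg overrides both attrs
    canonical = attrs.get("nodata")
    vals = attrs.get("nodatavals")
    if canonical is None or vals is None:
        return  # only one source present, nothing to contradict
    # Skip None / NaN placeholders in the rioxarray-style tuple.
    concrete = [v for v in vals if v is not None and not _is_nan(v)]
    if not concrete:
        return
    if _is_nan(canonical):
        raise ConflictingNodataError(
            f"attrs['nodata'] is NaN but attrs['nodatavals'] has {concrete!r}"
        )
    # Assumption: agreement means exact equality after float conversion.
    if all(float(v) != float(canonical) for v in concrete):
        raise ConflictingNodataError(
            f"attrs['nodata']={canonical!r} matches none of {concrete!r}"
        )
```

Under these assumptions, `{"nodata": 0, "nodatavals": (0, 255)}` passes because one concrete entry agrees, while `{"nodata": 0, "nodatavals": (255,)}` raises.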

xrspatial/geotiff/__init__.py

Lines changed: 20 additions & 2 deletions
@@ -79,6 +79,7 @@
     _extent_to_window,
     _extract_rich_tags,
     _populate_attrs_from_geo_info,
+    _validate_read_geo_info,
     _resolve_nodata_attr,
     _set_nodata_attrs,
 )
@@ -251,6 +252,8 @@ def open_geotiff(source: str | BinaryIO, *,
                  max_cloud_bytes=_MAX_CLOUD_BYTES_SENTINEL,
                  on_gpu_failure: str = _ON_GPU_FAILURE_SENTINEL,
                  missing_sources: str = _MISSING_SOURCES_SENTINEL,
+                 allow_rotated: bool = False,
+                 allow_unparseable_crs: bool = False,
                  ) -> xr.DataArray:
     """Read a GeoTIFF, COG, or VRT file into an xarray.DataArray.
 
@@ -441,7 +444,10 @@ def open_geotiff(source: str | BinaryIO, *,
             vrt_kwargs['missing_sources'] = missing_sources
         return read_vrt(source, dtype=dtype, window=window, band=band,
                         name=name, chunks=chunks, gpu=gpu,
-                        max_pixels=max_pixels, **vrt_kwargs)
+                        max_pixels=max_pixels,
+                        allow_rotated=allow_rotated,
+                        allow_unparseable_crs=allow_unparseable_crs,
+                        **vrt_kwargs)
 
     # File-like buffers don't support the GPU or dask code paths because
     # those re-open the source by path from worker tasks or device-side
@@ -466,14 +472,18 @@ def open_geotiff(source: str | BinaryIO, *,
                                 window=window, band=band,
                                 name=name, chunks=chunks,
                                 max_pixels=max_pixels,
+                                allow_rotated=allow_rotated,
+                                allow_unparseable_crs=allow_unparseable_crs,
                                 **gpu_kwargs)
 
     # Dask path (CPU)
     if chunks is not None:
         return read_geotiff_dask(source, dtype=dtype, chunks=chunks,
                                  overview_level=overview_level,
                                  window=window, band=band,
-                                 max_pixels=max_pixels, name=name)
+                                 max_pixels=max_pixels, name=name,
+                                 allow_rotated=allow_rotated,
+                                 allow_unparseable_crs=allow_unparseable_crs)
 
     kwargs = {}
     if max_pixels is not None:
@@ -514,6 +524,14 @@ def open_geotiff(source: str | BinaryIO, *,
         import os
         name = os.path.splitext(os.path.basename(source))[0]
 
+    # Issue #1987 ambiguous-metadata checks. Run before attrs population
+    # so a rejected file does not leak a partly-populated attrs dict.
+    _validate_read_geo_info(
+        geo_info, window=window,
+        allow_rotated=allow_rotated,
+        allow_unparseable_crs=allow_unparseable_crs,
+    )
+
     attrs = {}
     _populate_attrs_from_geo_info(attrs, geo_info, window=window)
 

xrspatial/geotiff/_attrs.py

Lines changed: 51 additions & 0 deletions
@@ -467,6 +467,57 @@ def _set_nodata_attrs(attrs: dict, nodata, *, array_dtype) -> None:
     attrs['masked_nodata'] = bool(np.dtype(array_dtype).kind == 'f')
 
 
+def _validate_read_geo_info(
+    geo_info,
+    *,
+    window=None,
+    allow_rotated: bool = False,
+    allow_unparseable_crs: bool = False,
+) -> None:
+    """Run issue #1987 read-side ambiguous-metadata checks against ``geo_info``.
+
+    Centralised helper so the eager numpy, dask, GPU, and VRT read
+    paths run the same checks before constructing the returned
+    DataArray. Forwards ``allow_rotated`` / ``allow_unparseable_crs``
+    to the registered checks (``_check_read_rotated_transform`` and
+    ``_check_read_unparseable_crs`` today; sibling checks attach via
+    the registry).
+
+    Raises whichever ``GeoTIFFAmbiguousMetadataError`` subclass a
+    registered check picks. The hook is a no-op when no check is
+    registered, so callers can use this helper unconditionally without
+    coupling each backend to the current check list.
+
+    Note: the transform tuple built here is always axis-aligned
+    (``b == 0`` / ``d == 0``) because ``_transform_tuple_from_pixel_geometry``
+    only carries origin + pixel size, and the upstream TIFF reader
+    rejects rotated ``ModelTransformationTag`` entries with
+    ``NotImplementedError`` in ``_geotags._extract_transform_and_georef``
+    before we reach this helper. The rotated-transform check therefore
+    fires only on the VRT path, which builds its context from the GDAL
+    ``geo_transform`` via ``_gdal_geotransform_to_affine_tuple``.
+    """
+    from ._validation import validate_read_metadata
+    transform_for_check = (
+        _transform_tuple_from_pixel_geometry(
+            geo_info.transform.origin_x,
+            geo_info.transform.origin_y,
+            geo_info.transform.pixel_width,
+            geo_info.transform.pixel_height,
+            window=window,
+        )
+        if (geo_info.transform is not None
+            and getattr(geo_info, 'has_georef', True))
+        else None
+    )
+    validate_read_metadata({
+        'allow_rotated': allow_rotated,
+        'allow_unparseable_crs': allow_unparseable_crs,
+        'transform': transform_for_check,
+        'crs_wkt': geo_info.crs_wkt,
+    })
+
+
 def _populate_attrs_from_geo_info(attrs: dict, geo_info, *, window=None) -> None:
     """Populate ``attrs`` with all GeoTIFF metadata from ``geo_info``.
 
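The docstring above describes a registry of checks that each receive a context dict and read their own opt-out flag from it. `_validation.py` is not shown in this excerpt, so the following is only a guess at the general shape. The registry list, `register_read_check`, and the stand-in exception classes are hypothetical names; only `validate_read_metadata`, the context keys, and the b / d rule come from the diff and commit message.

```python
class GeoTIFFAmbiguousMetadataError(Exception):
    """Stand-in base class; the real hierarchy may differ."""


class RotatedTransformError(GeoTIFFAmbiguousMetadataError):
    """Stand-in for the rotated-transform error."""


# Hypothetical registry: a plain list of callables taking the context dict.
_READ_CHECKS = []


def register_read_check(check):
    _READ_CHECKS.append(check)
    return check


def validate_read_metadata(context):
    """Run every registered check; a no-op while the registry is empty."""
    for check in _READ_CHECKS:
        check(context)


def check_read_rotated_transform(context):
    """Reject transforms with non-zero b / d terms unless opted out."""
    if context.get("allow_rotated"):
        return
    transform = context.get("transform")
    if transform is None:
        return  # no georeferencing, nothing to check
    a, b, c, d, e, f = transform[:6]  # rasterio Affine ordering
    if b != 0 or d != 0:
        raise RotatedTransformError(
            f"rotated transform (b={b!r}, d={d!r}); pass allow_rotated=True"
        )


# With nothing registered the hook accepts any context, rotated or not.
validate_read_metadata({"transform": (1.0, 0.5, 0.0, 0.0, -1.0, 0.0)})

register_read_check(check_read_rotated_transform)
```

Keeping the flags inside the context dict means a backend only has to build the dict once; adding a new check does not change any backend signature beyond threading its opt-out kwarg.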

xrspatial/geotiff/_backends/dask.py

Lines changed: 15 additions & 2 deletions
@@ -17,7 +17,11 @@
 import numpy as np
 import xarray as xr
 
-from .._attrs import _populate_attrs_from_geo_info, _set_nodata_attrs
+from .._attrs import (
+    _populate_attrs_from_geo_info,
+    _set_nodata_attrs,
+    _validate_read_geo_info,
+)
 from .._coords import (
     coords_from_geo_info as _coords_from_geo_info,
     geo_to_coords as _geo_to_coords,
@@ -34,7 +38,9 @@ def read_geotiff_dask(source: str, *,
                       band: int | None = None,
                       name: str | None = None,
                       chunks: int | tuple = 512,
-                      max_pixels: int | None = None) -> xr.DataArray:
+                      max_pixels: int | None = None,
+                      allow_rotated: bool = False,
+                      allow_unparseable_crs: bool = False) -> xr.DataArray:
     """Read a GeoTIFF as a dask-backed DataArray for out-of-core processing.
 
     Each chunk is loaded lazily via windowed reads.
@@ -294,6 +300,13 @@ def read_geotiff_dask(source: str, *,
         import os
         name = os.path.splitext(os.path.basename(source))[0]
 
+    # Issue #1987 ambiguous-metadata checks.
+    _validate_read_geo_info(
+        geo_info, window=window,
+        allow_rotated=allow_rotated,
+        allow_unparseable_crs=allow_unparseable_crs,
+    )
+
     attrs = {}
     _populate_attrs_from_geo_info(attrs, geo_info, window=window)
     # ``masked_nodata`` reflects the declared dask graph dtype: a float

xrspatial/geotiff/_backends/gpu.py

Lines changed: 36 additions & 3 deletions
@@ -20,7 +20,11 @@
 import numpy as np
 import xarray as xr
 
-from .._attrs import _populate_attrs_from_geo_info, _set_nodata_attrs
+from .._attrs import (
+    _populate_attrs_from_geo_info,
+    _set_nodata_attrs,
+    _validate_read_geo_info,
+)
 from .._coords import (
     coords_from_geo_info as _coords_from_geo_info,
 )
@@ -80,6 +84,8 @@ def read_geotiff_gpu(source: str, *,
                      chunks: int | tuple | None = None,
                      max_pixels: int | None = None,
                      on_gpu_failure: str = _ON_GPU_FAILURE_SENTINEL,
+                     allow_rotated: bool = False,
+                     allow_unparseable_crs: bool = False,
                      gpu: str = _GPU_DEPRECATED_SENTINEL,
                      ) -> xr.DataArray:
     """Read a GeoTIFF with GPU-accelerated decompression via Numba CUDA.
@@ -233,6 +239,8 @@ def read_geotiff_gpu(source: str, *,
             source, dtype=dtype, chunks=chunks,
             overview_level=overview_level, window=window, band=band,
             name=name, max_pixels=max_pixels,
+            allow_rotated=allow_rotated,
+            allow_unparseable_crs=allow_unparseable_crs,
         )
 
     from .._reader import (
@@ -382,6 +390,11 @@ def read_geotiff_gpu(source: str, *,
     if name is None:
         import os
         name = os.path.splitext(os.path.basename(source))[0]
+    _validate_read_geo_info(
+        geo_info, window=window,
+        allow_rotated=allow_rotated,
+        allow_unparseable_crs=allow_unparseable_crs,
+    )
     attrs = {}
     _populate_attrs_from_geo_info(attrs, geo_info, window=window)
     # Apply nodata mask + record sentinel so the GPU read agrees
@@ -710,6 +723,12 @@ def _read_once():
         import os
         name = os.path.splitext(os.path.basename(source))[0]
 
+    _validate_read_geo_info(
+        geo_info, window=window,
+        allow_rotated=allow_rotated,
+        allow_unparseable_crs=allow_unparseable_crs,
+    )
+
     attrs = {}
     _populate_attrs_from_geo_info(attrs, geo_info, window=window)
     # ``attrs['nodata']`` + ``attrs['masked_nodata']`` reflect the
@@ -881,7 +900,9 @@ def _decode_window_gpu_direct(file_path, all_offsets, all_byte_counts,
 
 
 def _read_geotiff_gpu_chunked(source, *, dtype, chunks, overview_level,
-                              window, band, name, max_pixels):
+                              window, band, name, max_pixels,
+                              allow_rotated: bool = False,
+                              allow_unparseable_crs: bool = False):
     """Lazy Dask+CuPy backend for ``read_geotiff_gpu(chunks=...)``.
 
     Two paths produce the same shape of dask graph:
@@ -941,6 +962,8 @@ def _read_geotiff_gpu_chunked(source, *, dtype, chunks, overview_level,
                 src_path, ifd, geo_info, header,
                 dtype=dtype, chunks=chunks, window=window, band=band,
                 name=name, max_pixels=max_pixels,
+                allow_rotated=allow_rotated,
+                allow_unparseable_crs=allow_unparseable_crs,
             )
         except Exception:
             # GDS qualification failed; fall back to the CPU path. The
@@ -952,6 +975,8 @@ def _read_geotiff_gpu_chunked(source, *, dtype, chunks, overview_level,
         source, dtype=dtype, chunks=chunks,
         overview_level=overview_level, window=window, band=band,
         max_pixels=max_pixels, name=name,
+        allow_rotated=allow_rotated,
+        allow_unparseable_crs=allow_unparseable_crs,
     )
 
     cpu_dask_arr = cpu_da.data
@@ -973,7 +998,9 @@ def _upload(block):
 
 def _read_geotiff_gpu_chunked_gds(source, ifd, geo_info, header, *,
                                   dtype, chunks, window, band, name,
-                                  max_pixels):
+                                  max_pixels,
+                                  allow_rotated: bool = False,
+                                  allow_unparseable_crs: bool = False):
     """Build a Dask+CuPy graph that decodes each chunk disk->GPU.
 
     Caller must have verified that the source qualifies via
@@ -1159,6 +1186,12 @@ def _chunk_task(meta, r0, c0, r1, c1):
     else:
         dims = ['y', 'x']
 
+    _validate_read_geo_info(
+        geo_info, window=window,
+        allow_rotated=allow_rotated,
+        allow_unparseable_crs=allow_unparseable_crs,
+    )
+
     attrs = {}
     _populate_attrs_from_geo_info(attrs, geo_info, window=window)
     # ``masked_nodata`` reflects the declared dask graph dtype; mirrors

xrspatial/geotiff/_backends/vrt.py

Lines changed: 59 additions & 3 deletions
@@ -37,7 +37,10 @@ def read_vrt(source: str, *,
              chunks: int | tuple | None = None,
              gpu: bool = False,
              max_pixels: int | None = None,
-             missing_sources: str = 'raise') -> xr.DataArray:
+             missing_sources: str = 'raise',
+             allow_rotated: bool = False,
+             allow_unparseable_crs: bool = False,
+             band_nodata: str | None = None) -> xr.DataArray:
     """Read a GDAL Virtual Raster Table (.vrt) into an xarray.DataArray.
 
     The VRT's source GeoTIFFs are read via windowed reads and assembled
@@ -158,11 +161,43 @@ def read_vrt(source: str, *,
             dtype=dtype,
             max_pixels=max_pixels,
             missing_sources=missing_sources,
+            allow_rotated=allow_rotated,
+            allow_unparseable_crs=allow_unparseable_crs,
+            band_nodata=band_nodata,
         )
 
+    # Issue #1987 ambiguous-metadata checks for the eager VRT path. Parse
+    # the VRT XML up front and validate before ``_read_vrt_internal``
+    # touches any pixel data, so a rejected file does not first
+    # materialise the full mosaic into host memory. The parsed
+    # ``VRTDataset`` is threaded into the internal reader via ``parsed=``
+    # so we don't double-parse the XML.
+    import os as _os
+    from .._validation import (
+        validate_read_metadata,
+        _gdal_geotransform_to_affine_tuple,
+    )
+    from .._vrt import parse_vrt as _parse_vrt, _read_vrt_xml
+    _xml_str = _read_vrt_xml(source)
+    _vrt_dir = _os.path.dirname(_os.path.abspath(source))
+    _parsed_vrt = _parse_vrt(_xml_str, _vrt_dir)
+    validate_read_metadata({
+        'allow_rotated': allow_rotated,
+        'allow_unparseable_crs': allow_unparseable_crs,
+        'transform': _gdal_geotransform_to_affine_tuple(
+            _parsed_vrt.geo_transform
+        ),
+        'crs_wkt': _parsed_vrt.crs_wkt,
+        'band_nodata': band_nodata,
+        'band_nodata_values': (
+            [b.nodata for b in _parsed_vrt.bands]
+            if _parsed_vrt.bands else None
+        ),
+    })
+
     arr, vrt = _read_vrt_internal(
         source, window=window, band=band, max_pixels=max_pixels,
-        missing_sources=missing_sources,
+        missing_sources=missing_sources, parsed=_parsed_vrt,
     )
 
     if name is None:
@@ -339,7 +374,10 @@ def _vrt_chunk_read(source, r0, c0, r1, c1, *,
 
 
 def _read_vrt_chunked(source, *, window, band, name, chunks, gpu, dtype,
-                      max_pixels, missing_sources):
+                      max_pixels, missing_sources,
+                      allow_rotated: bool = False,
+                      allow_unparseable_crs: bool = False,
+                      band_nodata: str | None = None):
     """Lazy ``read_vrt`` dispatch when ``chunks=`` is set (issue #1814).
 
     Parses the VRT XML once to recover the extent, CRS, GeoTransform,
@@ -383,6 +421,24 @@ def _read_vrt_chunked(source, *, window, band, name, chunks, gpu, dtype,
     vrt_dir = _os.path.dirname(_os.path.abspath(source))
     vrt = parse_vrt(xml_str, vrt_dir)
 
+    # Issue #1987 ambiguous-metadata checks on the chunked VRT path. Run
+    # before the band-count validator below so a rejected file does not
+    # produce side effects.
+    from .._validation import (
+        validate_read_metadata,
+        _gdal_geotransform_to_affine_tuple,
+    )
+    validate_read_metadata({
+        'allow_rotated': allow_rotated,
+        'allow_unparseable_crs': allow_unparseable_crs,
+        'transform': _gdal_geotransform_to_affine_tuple(vrt.geo_transform),
+        'crs_wkt': vrt.crs_wkt,
+        'band_nodata': band_nodata,
+        'band_nodata_values': (
+            [b.nodata for b in vrt.bands] if vrt.bands else None
+        ),
+    })
+
     # Validate ``band`` against the parsed band count, matching the
     # internal reader's contract so the failure mode is the same whether
     # the user reads eagerly or chunked.
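Both VRT call sites above pass the GDAL `geo_transform` through `_gdal_geotransform_to_affine_tuple` before the rotated-transform check runs. The helper's body is not part of this excerpt; the sketch below shows the standard reorder from GDAL's six-element layout to rasterio's Affine coefficient order, which is presumably what the helper does. The function names and the `None` pass-through are assumptions.

```python
def gdal_geotransform_to_affine_tuple(gt):
    """Reorder GDAL (c, a, b, f, d, e) into Affine order (a, b, c, d, e, f).

    GDAL:   x = GT[0] + col * GT[1] + row * GT[2]
            y = GT[3] + col * GT[4] + row * GT[5]
    Affine: x = a * col + b * row + c
            y = d * col + e * row + f
    """
    if gt is None:
        return None  # assumption: no georeferencing passes through as None
    c, a, b, f, d, e = gt
    return (a, b, c, d, e, f)


def pixel_to_world(affine, col, row):
    """Apply the six Affine coefficients to a pixel coordinate."""
    a, b, c, d, e, f = affine
    return (a * col + b * row + c, d * col + e * row + f)


# North-up raster: origin (100, 200), 30-unit pixels, no rotation.
north_up = gdal_geotransform_to_affine_tuple(
    (100.0, 30.0, 0.0, 200.0, 0.0, -30.0)
)

# GT[2] and GT[4] land in the b and d slots that the rotated check inspects.
rotated = gdal_geotransform_to_affine_tuple(
    (100.0, 30.0, 0.0, 200.0, 0.5, -30.0)
)
```

In this layout a non-zero GDAL `GT[4]` becomes the `d` term and a non-zero `GT[2]` becomes the `b` term, which matches the two rotation branches the commit message says are covered by tests.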

xrspatial/geotiff/_crs.py

Lines changed: 3 additions & 2 deletions
@@ -15,6 +15,7 @@
 import numbers
 import warnings
 
+from ._errors import UnparseableCRSError
 from ._runtime import GeoTIFFFallbackWarning, _geotiff_strict_mode
 
 
@@ -165,15 +166,15 @@ def _validate_crs_fallback(
         return
     if allow_unparseable_crs:
         return
-    raise ValueError(
+    raise UnparseableCRSError(
         "crs is not an EPSG code, is not a WKT string "
         "(no PROJCS / GEOGCS / PROJCRS / GEOGCRS root), and could not "
         f"be parsed: got {wkt_fallback!r}. Writing it verbatim to "
         "GTCitationGeoKey would produce a file most GeoTIFF readers "
         "cannot interpret. Pass an EPSG int (recommended), a real "
         "WKT string, install pyproj so EPSG / PROJ tokens can be "
         "resolved, or pass allow_unparseable_crs=True to keep the "
-        "pre-#1929 citation-only behaviour."
+        "pre-#1929 citation-only behaviour. See issue #1987."
     )
 
 
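The commit message calls the retyped raise "no behaviour change" because `UnparseableCRSError` subclasses `ValueError`. The small sketch below illustrates why existing `except ValueError` handlers keep working. `_errors.py` is not shown here, so the class definitions are stand-ins: the changelog states the `ValueError` relationship, while the `GeoTIFFAmbiguousMetadataError` base is an assumption extrapolated from how `ConflictingCRSError` is described.

```python
class GeoTIFFAmbiguousMetadataError(Exception):
    """Stand-in base class for the #1987 error family."""


class UnparseableCRSError(GeoTIFFAmbiguousMetadataError, ValueError):
    """Stand-in: also a ValueError, so legacy handlers still match."""


def validate_crs_fallback(wkt_fallback, *, allow_unparseable_crs=False):
    """Simplified stand-in that treats every input as unparseable."""
    if allow_unparseable_crs:
        return
    raise UnparseableCRSError(f"could not be parsed: got {wkt_fallback!r}")


# A caller written before the retyping still catches the new error.
try:
    validate_crs_fallback("not-a-crs")
except ValueError as exc:
    caught = exc

# The opt-out returns without raising.
validate_crs_fallback("not-a-crs", allow_unparseable_crs=True)
```

New code can catch the narrower class, or the family base, without affecting callers that only know about `ValueError`.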
