Skip to content

Commit f0a09c0

Browse files
authored
Block write_geotiff_gpu(file_like, cog=True) for parity with to_geotiff (#1652) (#1653)
* Block write_geotiff_gpu(file_like, cog=True) for parity with to_geotiff (#1652) to_geotiff has long rejected cog=True + file-like destinations, but the explicit write_geotiff_gpu entry point silently accepted the combo and emitted a COG into the buffer. The two writers should agree on which inputs they refuse: to_geotiff(gpu=True, cog=True, path=BytesIO) raises, so write_geotiff_gpu(da, BytesIO, cog=True) should too. Mirror the existing to_geotiff guard on the GPU entry point. Non-cog file-like writes remain supported on this path (the gate is targeted at cog=True only). Add regression coverage in test_bytesio_source.py. Also clarify the path/compression docstring on write_geotiff_gpu: - path: document that file-like destinations are accepted (cog=True requires a string path). - compression: list the full codec set the function actually accepts and note the deliberate JPEG asymmetry with to_geotiff (#1651 downgraded to docs-only after PR #1647 confirmed advanced-API intent). Update .claude/sweep-api-consistency-state.csv with the 2026-05-11 re-audit row. * Address Copilot review feedback on #1653 - Mirror to_geotiff's cog=True + file-like rejection verbatim (same error string), so callers see identical messages from either entry point. Previously write_geotiff_gpu raised a different message that added "...on the GPU writer..." and dropped the BytesIO hint. - Add a TypeError gate for non-str, non-file-like path arguments, so passing e.g. an int falls through to a clear TypeError instead of an os.path / unicode error deep in the writer. Mirrors to_geotiff's existing TypeError verbatim. - Harden the BytesIO + write_geotiff_gpu tests with the repo's standard _gpu_available() helper (cupy.cuda.is_available() + ImportError guard) instead of pytest.importorskip('cupy'), so CI hosts where CuPy imports but CUDA is unavailable skip cleanly rather than hard-failing in cupy.asarray(). - Add two new regression tests: one pinning byte-for-byte parity of the cog/file-like error message between the two writers, and one pinning the new TypeError on invalid path types. Skip: the low-confidence type annotation suggestion (path: str|os.PathLike|SupportsWrite[bytes]). The other path-accepting functions in this module (to_geotiff, write_vrt, etc.) deliberately leave path untyped; adding a precise union here would diverge from the local convention for marginal benefit. All 17 tests in test_bytesio_source.py pass.
1 parent 181056e commit f0a09c0

3 files changed

Lines changed: 142 additions & 3 deletions

File tree

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,4 @@
11
module,last_inspected,issue,severity_max,categories_found,notes
22
geotiff,2026-05-11,1644,MEDIUM,3,"Filed write_geotiff_gpu compression docstring drift vs to_geotiff (MEDIUM Cat 3, #1644). Fix on deep-sweep-api-consistency-geotiff-2026-05-11-1778545740: sync the full 9-codec list into the docstring and note GPU vs CPU encode paths; regression test test_compression_docstring_1644.py pins the codec list and exercises each CPU-fallback codec end-to-end. Other potential drifts surveyed: write_vrt returns str while to_geotiff/write_geotiff_gpu return None (LOW, intentional backward-compat); write_vrt nodata typed float|None vs int-accepting siblings (LOW, PEP 484 int->float compat); kwarg-only ordering drift across read functions (LOW, no user impact). Prior issues 1631/1637/1615/1560/1541/1562 all CLOSED."
3+
geotiff,2026-05-11,1652,MEDIUM,5,"Filed MEDIUM file-like cog=True drift #1652 (write_geotiff_gpu accepted BytesIO+cog=True; to_geotiff blocked it). Fixed in PR (TBD): mirror to_geotiff's gate on the explicit GPU writer; add regression tests in test_bytesio_source.py. Also filed #1651 (JPEG acceptance drift) but downgraded to LOW after #1647 confirmed write_geotiff_gpu(jpeg) is deliberate advanced-API; PR (TBD) carries the docstring clarification. Prior 1631/1644 noted in earlier rows (1644 open, fix in PR #1649). LOW: streaming_buffer_bytes default drift to_geotiff=256MB vs write_geotiff_gpu=None (no functional impact, explicit forwarding); to_geotiff data: annotation misses cupy.ndarray (accepted via auto-dispatch). cuda-validated."
34
reproject,2026-05-10,1570,HIGH,2;5,"Filed cross-module attrs['vertical_crs'] type collision (string vs EPSG int) vs xrspatial.geotiff. Fixed in PR (TBD): reproject now writes EPSG int and preserves friendly token under vertical_datum. MEDIUM kwarg-order drift (transform_precision vs chunk_size) and missing type hints vs geotiff documented but not fixed (cosmetic, kwarg-only)."

xrspatial/geotiff/__init__.py

Lines changed: 24 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -2714,7 +2714,7 @@ def _read_once():
27142714

27152715

27162716
def write_geotiff_gpu(data: xr.DataArray | cupy.ndarray | np.ndarray,
2717-
path: str, *,
2717+
path, *,
27182718
crs: int | str | None = None,
27192719
nodata=None,
27202720
compression: str = 'zstd',
@@ -2747,8 +2747,12 @@ def write_geotiff_gpu(data: xr.DataArray | cupy.ndarray | np.ndarray,
27472747
2D or 3D raster. CuPy-backed inputs stay on device; NumPy/Dask
27482748
inputs are uploaded via ``cupy.asarray(np.asarray(data))``
27492749
before compression (matches ``to_geotiff`` parity).
2750-
path : str
2751-
Output file path.
2750+
path : str or binary file-like
2751+
Output file path or any object with a ``write`` method
2752+
(e.g. ``io.BytesIO``). ``cog=True`` requires a string path:
2753+
the auto-dispatch path through ``to_geotiff(gpu=True, cog=True)``
2754+
rejects file-like destinations, and the explicit GPU writer
2755+
mirrors that rule (issue #1652).
27522756
crs : int, str, or None
27532757
EPSG code or WKT string.
27542758
nodata : float, int, or None
@@ -2837,6 +2841,23 @@ def write_geotiff_gpu(data: xr.DataArray | cupy.ndarray | np.ndarray,
28372841
"max_z_error is not supported on the GPU writer "
28382842
"(nvCOMP has no LERC backend). Use to_geotiff(..., gpu=False) "
28392843
"or omit max_z_error.")
2844+
# Mirror to_geotiff's path-type + cog=True gating verbatim so callers
2845+
# see identical errors from the two entry points. The auto-dispatch
2846+
# path through ``to_geotiff(gpu=True, cog=True, path=BytesIO)`` raises
2847+
# before reaching here; the explicit GPU writer mirrors the same gate
2848+
# so callers cannot bypass it (issue #1652). Non-cog file-like writes
2849+
# remain supported on this entry point.
2850+
_path_is_file_like = (
2851+
not isinstance(path, str)) and hasattr(path, 'write')
2852+
if _path_is_file_like:
2853+
if cog:
2854+
raise ValueError(
2855+
"cog=True is not supported for file-like destinations. "
2856+
"Pass a string path or write to BytesIO without cog=True.")
2857+
elif not isinstance(path, str):
2858+
raise TypeError(
2859+
f"path must be a str or a binary file-like with a write() "
2860+
f"method, got {type(path).__name__}")
28402861
# streaming_buffer_bytes is intentionally a no-op on the GPU path;
28412862
# the kwarg exists for API parity with to_geotiff so callers can pass
28422863
# the same kwargs to both entry points without filtering.

xrspatial/geotiff/tests/test_bytesio_source.py

Lines changed: 117 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -6,6 +6,7 @@
66
"""
77
from __future__ import annotations
88

9+
import importlib.util
910
import io
1011
from concurrent.futures import ThreadPoolExecutor
1112

@@ -17,6 +18,26 @@
1718
from xrspatial.geotiff._reader import _BytesIOSource, read_to_array
1819

1920

21+
def _gpu_available() -> bool:
22+
"""True when cupy imports AND a CUDA runtime is initialised.
23+
24+
Mirrors the helper used in other geotiff GPU tests so the BytesIO
25+
GPU-writer tests skip cleanly on hosts where CuPy is installed but
26+
CUDA is unavailable (Copilot review on #1653).
27+
"""
28+
if importlib.util.find_spec("cupy") is None:
29+
return False
30+
try:
31+
import cupy
32+
return bool(cupy.cuda.is_available())
33+
except Exception:
34+
return False
35+
36+
37+
_HAS_GPU = _gpu_available()
38+
_gpu_only = pytest.mark.skipif(not _HAS_GPU, reason="cupy + CUDA required")
39+
40+
2041
def _make_da(height=32, width=40, dtype=np.float32):
2142
arr = np.arange(height * width, dtype=dtype).reshape(height, width)
2243
# Simple geotransform with negative pixel_height (north-up)
@@ -268,3 +289,99 @@ def seek(self, *a, **k):
268289

269290
assert _is_file_like(io.BytesIO(b'x')) is True
270291
assert _is_file_like(ReadSeekNoTell()) is False
292+
293+
294+
class TestWriteGeotiffGpuBytesIO:
295+
"""Regression coverage for ``write_geotiff_gpu`` file-like behaviour.
296+
297+
``to_geotiff(gpu=True, ...)`` always rejects BytesIO destinations paired
298+
with ``cog=True`` (the auto-dispatch path's existing guard). The explicit
299+
GPU writer used to silently accept that combo and produce a COG into the
300+
buffer, so the two entry points disagreed on what ``to_geotiff(gpu=True,
301+
cog=True, path=BytesIO)`` does. These tests pin the mirrored gate added
302+
by issue #1652 and confirm the non-cog file-like path still works.
303+
"""
304+
305+
@_gpu_only
306+
def test_cog_with_bytesio_rejected_1652(self):
307+
import cupy
308+
da = xr.DataArray(
309+
cupy.asarray(np.random.rand(64, 64).astype(np.float32)),
310+
dims=['y', 'x'],
311+
coords={'y': np.arange(64.0), 'x': np.arange(64.0)},
312+
attrs={'crs': 4326},
313+
)
314+
from xrspatial.geotiff import write_geotiff_gpu
315+
316+
buf = io.BytesIO()
317+
with pytest.raises(ValueError, match='cog=True'):
318+
write_geotiff_gpu(da, buf, cog=True)
319+
320+
@_gpu_only
321+
def test_cog_with_bytesio_error_matches_to_geotiff_1652(self):
322+
"""The error string must match ``to_geotiff``'s gate verbatim so
323+
downstream callers can rely on a single message (Copilot review
324+
on #1653)."""
325+
import cupy
326+
da = xr.DataArray(
327+
cupy.asarray(np.random.rand(64, 64).astype(np.float32)),
328+
dims=['y', 'x'],
329+
coords={'y': np.arange(64.0), 'x': np.arange(64.0)},
330+
attrs={'crs': 4326},
331+
)
332+
from xrspatial.geotiff import write_geotiff_gpu
333+
334+
# to_geotiff's canonical message; mirrored verbatim in
335+
# write_geotiff_gpu's gate.
336+
expected = (
337+
"cog=True is not supported for file-like destinations. "
338+
"Pass a string path or write to BytesIO without cog=True."
339+
)
340+
341+
buf = io.BytesIO()
342+
with pytest.raises(ValueError) as exc_info:
343+
write_geotiff_gpu(da, buf, cog=True)
344+
assert str(exc_info.value) == expected
345+
346+
# And the CPU writer raises the same string for parity.
347+
with pytest.raises(ValueError) as exc_info_cpu:
348+
to_geotiff(_make_da(), io.BytesIO(), cog=True)
349+
assert str(exc_info_cpu.value) == expected
350+
351+
@_gpu_only
352+
def test_invalid_path_type_raises_typeerror_1652(self):
353+
"""Mirror to_geotiff's TypeError for non-str, non-file-like paths
354+
so callers see identical behaviour from both entry points."""
355+
import cupy
356+
da = xr.DataArray(
357+
cupy.asarray(np.random.rand(64, 64).astype(np.float32)),
358+
dims=['y', 'x'],
359+
coords={'y': np.arange(64.0), 'x': np.arange(64.0)},
360+
attrs={'crs': 4326},
361+
)
362+
from xrspatial.geotiff import write_geotiff_gpu
363+
364+
with pytest.raises(TypeError, match="path must be a str"):
365+
write_geotiff_gpu(da, 42) # int is neither str nor file-like
366+
367+
@_gpu_only
368+
def test_non_cog_bytesio_still_works_1652(self):
369+
import cupy
370+
arr_cpu = np.random.rand(64, 64).astype(np.float32)
371+
da = xr.DataArray(
372+
cupy.asarray(arr_cpu),
373+
dims=['y', 'x'],
374+
coords={'y': np.arange(64.0), 'x': np.arange(64.0)},
375+
attrs={'crs': 4326},
376+
)
377+
from xrspatial.geotiff import write_geotiff_gpu
378+
379+
buf = io.BytesIO()
380+
# Non-cog file-like write is still supported on the explicit GPU
381+
# writer; only cog=True is gated.
382+
write_geotiff_gpu(da, buf)
383+
assert len(buf.getvalue()) > 0
384+
385+
# Verify it round-trips through open_geotiff
386+
rd = open_geotiff(io.BytesIO(buf.getvalue()))
387+
np.testing.assert_allclose(np.asarray(rd.values), arr_cpu)

0 commit comments

Comments
 (0)