Skip to content

Commit 50424ac

Browse files
committed
Address Copilot review feedback on #1649
Four wording fixes; no behavioural change: - compression='jpeg': the docstring previously implied parity with to_geotiff, but to_geotiff rejects jpeg at runtime (it omits the JPEGTables tag and produces files that don't round-trip through GDAL). Spell out that write_geotiff_gpu DOES accept jpeg, that the on-disk bytes are self-contained JFIF tiles, and that GDAL/rasterio interop is not guaranteed until the JPEGTables fix lands. - compression='jpeg2000' / 'j2k': replace "route to the CPU encoders" with the actual conditional behaviour (nvJPEG2K GPU encode first, CPU glymur fallback when libnvjpeg2k is missing) and call out that the two paths are not byte-stable, so byte-for-byte parity with to_geotiff isn't a contract here. - test module docstring: distinguish "not nvCOMP-accelerated" (lzw, packbits, lz4, lerc — truly CPU-only) from jpeg2000/j2k (GPU first with CPU fallback) and from jpeg (write_geotiff_gpu only, separate test module). - "exercised by test_features.py" referenced a file that doesn't cover JPEG. Repoint the test docstring to test_gpu_writer_compression_modes_2026_05_11.py which actually pins JPEG round-trips for the GPU writer. All 7 tests in test_compression_docstring_1644.py still pass.
1 parent aa6ef5d commit 50424ac

2 files changed

Lines changed: 66 additions & 25 deletions

File tree

xrspatial/geotiff/__init__.py

Lines changed: 29 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -2723,18 +2723,38 @@ def write_geotiff_gpu(data: xr.DataArray | cupy.ndarray | np.ndarray,
27232723
nodata : float, int, or None
27242724
NoData value.
27252725
compression : str
2726-
Codec name. Accepts the same set as ``to_geotiff``: ``'none'``,
2727-
``'deflate'``, ``'lzw'``, ``'jpeg'``, ``'packbits'``, ``'zstd'``,
2728-
``'lz4'``, ``'jpeg2000'`` (alias ``'j2k'``), or ``'lerc'``.
2726+
Codec name. Accepts the same set ``to_geotiff`` lists in its
2727+
own signature: ``'none'``, ``'deflate'``, ``'lzw'``, ``'jpeg'``,
2728+
``'packbits'``, ``'zstd'``, ``'lz4'``, ``'jpeg2000'`` (alias
2729+
``'j2k'``), or ``'lerc'``.
27292730
27302731
``'zstd'`` (default) and ``'deflate'`` compress on the GPU via
27312732
nvCOMP batch compression -- the fastest paths and the reason to
2732-
use this entry point. ``'jpeg'`` uses nvJPEG when available and
2733-
falls back to Pillow otherwise. ``'jpeg2000'`` / ``'j2k'`` and
2734-
``'lerc'`` route to the CPU encoders so the output matches the
2735-
CPU writer byte-for-byte, but lose the GPU compression speedup.
2736-
``'lzw'``, ``'packbits'``, and ``'lz4'`` likewise fall through
2737-
to the CPU encoder for parity with ``to_geotiff``.
2733+
use this entry point.
2734+
2735+
``'jpeg'`` uses nvJPEG when libnvjpeg is loadable and falls
2736+
back to Pillow otherwise. Unlike ``to_geotiff`` (which rejects
2737+
``compression='jpeg'`` at runtime because its CPU encoder omits
2738+
the required TIFF JPEGTables tag (347)), this GPU entry point
2739+
emits self-contained JFIF tiles. The two writers therefore
2740+
disagree about JPEG-in-TIFF interop: files produced here decode
2741+
fine through this library's own reader but may not round-trip
2742+
through GDAL/rasterio/libtiff readers that require the
2743+
JPEGTables tag. Treat ``write_geotiff_gpu(..., compression=
2744+
'jpeg')`` as "experimental, internal-reader only" until the
2745+
JPEGTables fix lands.
2746+
2747+
``'jpeg2000'`` / ``'j2k'`` attempt an nvJPEG2K GPU encode first
2748+
and fall back to the CPU encoder (``glymur``) when libnvjpeg2k
2749+
is unavailable. The GPU and CPU paths are NOT byte-for-byte
2750+
identical (different libraries, different default parameters);
2751+
if you need exact CPU-writer parity, use ``to_geotiff`` instead.
2752+
2753+
``'lerc'``, ``'lzw'``, ``'packbits'``, and ``'lz4'`` have no
2754+
nvCOMP/CUDA accelerator and fall through to the CPU encoder for
2755+
parity with ``to_geotiff`` (LERC is byte-stable across CPU/CPU
2756+
because there is only one encoder; the others are likewise
2757+
identical bytes).
27382758
compression_level : int or None
27392759
Compression effort level. Accepted for API compatibility but
27402760
currently ignored -- nvCOMP does not expose level control.

xrspatial/geotiff/tests/test_compression_docstring_1644.py

Lines changed: 37 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -5,12 +5,30 @@
55
``write_geotiff_gpu.__doc__`` listed only four codecs (``'zstd'``,
66
``'deflate'``, ``'jpeg'``, ``'none'``) under the ``compression``
77
parameter, while the implementation actually accepts every codec
8-
``to_geotiff`` does. Codecs unsupported by nvCOMP fall through to the
9-
CPU encoders (``lzw``, ``packbits``, ``lz4``, ``lerc``, ``jpeg2000`` /
10-
``j2k``) so the output matches the CPU writer byte-for-byte. This
11-
module pins the full codec list against future drift and confirms the
12-
underlying entry point accepts the codec names that the docstring now
13-
advertises.
8+
``to_geotiff`` does.
9+
10+
Routing for the additional codecs:
11+
12+
* ``'lzw'``, ``'packbits'``, ``'lz4'``, ``'lerc'`` -- not nvCOMP-
13+
accelerated and have no GPU library, so they fall through to the
14+
CPU encoder. Byte-for-byte identical to ``to_geotiff``.
15+
* ``'jpeg2000'`` / ``'j2k'`` -- attempts an nvJPEG2K *GPU* encode
16+
first via ``_nvjpeg2k_batch_encode`` and falls back to the CPU
17+
``glymur`` encoder only when libnvjpeg2k is unavailable. The two
18+
paths are NOT byte-stable against each other; this module pins the
19+
acceptance contract (the codec name is accepted and a file gets
20+
written), not output-byte parity with the CPU writer.
21+
* ``'jpeg'`` -- accepted here even though ``to_geotiff`` rejects it
22+
(the CPU writer omits the JPEGTables tag, so its output doesn't
23+
round-trip through GDAL). The GPU path emits self-contained JFIF
24+
tiles. Covered separately by
25+
``test_gpu_writer_compression_modes_2026_05_11.py``; this module
26+
excludes it from the parametrized fallback list because the test
27+
data needs to be uint8 with sensible pixel content.
28+
29+
This module pins the full codec list against future drift and confirms
30+
the underlying entry point accepts the codec names that the docstring
31+
now advertises.
1432
"""
1533
from __future__ import annotations
1634

@@ -42,11 +60,13 @@ def _gpu_available() -> bool:
4260
)
4361

4462

45-
# The full set ``to_geotiff`` accepts, mirrored to ``write_geotiff_gpu``
46-
# so both entry points stay in lockstep. Excludes ``jpeg`` because PR
47-
# #1633 already pins that name and the ``to_geotiff`` runtime rejects
48-
# it -- but it is still listed in the docstring as an accepted codec
49-
# name, matching ``to_geotiff``'s wording.
63+
# Codecs to exercise end-to-end through the GPU writer to confirm they
64+
# accept the docstring's advertised names. Excludes ``jpeg`` because
65+
# (a) ``to_geotiff`` rejects it at runtime and (b) the JPEG round-trip
66+
# is covered with appropriate uint8 RGB data in
67+
# ``test_gpu_writer_compression_modes_2026_05_11.py``; keeping it out of
68+
# this parametrize avoids exercising the JPEG path on dtype/shape
69+
# combinations that aren't representative.
5070
_GPU_FALLBACK_CODECS = (
5171
"lzw", "packbits", "lz4", "lerc", "jpeg2000", "j2k",
5272
)
@@ -83,11 +103,12 @@ def test_write_geotiff_gpu_accepts_cpu_fallback_codecs(tmp_path, codec):
83103
84104
Confirms the docstring's promise that the GPU writer accepts the
85105
same codec set as ``to_geotiff``. ``jpeg`` is exercised separately
86-
by ``test_features.py`` because the test data must be 8-bit
87-
integer. ``jpeg2000`` / ``j2k`` go through ``glymur`` which only
88-
accepts uint8/uint16 -- pick a uint16 source for those codecs so
89-
the encode path is the one users actually hit, not a dtype-rejected
90-
pre-check inside glymur.
106+
by ``test_gpu_writer_compression_modes_2026_05_11.py`` because the
107+
test data must be uint8 with sensible content. ``jpeg2000`` /
108+
``j2k`` will attempt nvJPEG2K if available and fall back to
109+
``glymur`` otherwise; either way the encoder needs uint8/uint16
110+
input, so pick a uint16 source for those codecs so the encode path
111+
is the one users actually hit, not a dtype-rejected pre-check.
91112
"""
92113
import cupy
93114

0 commit comments

Comments
 (0)