Add test coverage for overview_resampling min/max/median modes (#1639)

brendancol · web-flow · commit 977968f401ec · 2026-05-11T17:24:22.000-07:00
* Add test coverage for overview_resampling min/max/median modes

  The to_geotiff and write_geotiff_gpu overview_resampling parameter accepts seven modes (mean, nearest, mode, cubic, min, max, median) but the last three had no test coverage on either the CPU block reducer (_block_reduce_2d) or the GPU block reducer (_block_reduce_2d_gpu). A regression in any of those branches would ship undetected.

  This commit closes a Cat 4 HIGH parameter-coverage gap with 26 new tests:

  * CPU and GPU unit tests for the block reducer on finite input.
  * CPU and GPU unit tests for partial-NaN input verifying the nan-aware reductions skip NaN cells.
  * End-to-end COG writes for to_geotiff(cog=True) and write_geotiff_gpu(cog=True) for each of the three modes.
  * CPU/GPU parity check for to_geotiff(gpu=True) overview output.
  * CPU nodata-sentinel regression (issue 1613 path, here extended to the min/max/median branches that 1613 did not test).
  * Error path: ValueError on an unknown method name for both backends.

  All 26 tests pass on a GPU-enabled host.

  Pass 6 of the test-coverage sweep on the geotiff module. State CSV notes column updated.

* Address Copilot review on #1639: clarify prior coverage scope

  - Fix "six" -> "seven" reductions count (mean, nearest, min, max, median, mode, cubic).
  - Reword module docstring: CPU end-to-end paths for min/max/median were already covered by test_cog_overview_nodata_1613; the gap this file closes is GPU end-to-end + direct CPU/GPU block-reducer branches.
  - Mirror the clarification in the sweep-test-coverage state CSV row.
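
For reviewers who want to check the closed-form expectations by hand, here is a minimal NumPy sketch of a nan-aware 2x2 block reduction. It is not the xrspatial implementation: `block_reduce_2x2` and `_REDUCERS` are hypothetical names used only for illustration, and the sketch assumes even array dimensions. It uses the same reshape/transpose trick the new tests use to recompute expected values.

```python
import numpy as np

# Hypothetical stand-in for illustration only; not xrspatial code.
_REDUCERS = {"min": np.nanmin, "max": np.nanmax, "median": np.nanmedian}


def block_reduce_2x2(arr, method):
    """Reduce each non-overlapping 2x2 block of a 2-D array, ignoring NaN."""
    if method not in _REDUCERS:
        raise ValueError(
            f"unknown method {method!r}; expected one of {sorted(_REDUCERS)}")
    h, w = arr.shape
    if h % 2 or w % 2:
        raise ValueError("this sketch only handles even dimensions")
    # Axes after reshape: (block_row, row_in_block, block_col, col_in_block).
    # Move the two in-block axes to the end, then flatten them to 4 cells.
    blocks = arr.reshape(h // 2, 2, w // 2, 2).transpose(0, 2, 1, 3)
    flat = blocks.reshape(h // 2, w // 2, 4)
    return _REDUCERS[method](flat, axis=2)


# Same inputs as the test helpers: a 1..16 ramp, and the ramp with two NaNs.
ramp = np.arange(1, 17, dtype=np.float32).reshape(4, 4)
holed = ramp.copy()
holed[0, 0] = np.nan
holed[0, 3] = np.nan

print(block_reduce_2x2(ramp, "median"))
print(block_reduce_2x2(holed, "min"))
print(block_reduce_2x2(holed, "median"))
```

On the finite ramp the median of the top-left block {1, 2, 5, 6} is 3.5. Once the top-left cell is NaN, the remaining cells {2, 5, 6} give a minimum of 2 and a median of 5, which is what the partial-NaN tests expect from both backends.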
diff --git a/.claude/sweep-test-coverage-state.csv b/.claude/sweep-test-coverage-state.csv
@@ -1,3 +1,3 @@
 module,last_inspected,issue,severity_max,categories_found,notes
-geotiff,2026-05-11,,HIGH,2;3,"Pass 5 (2026-05-11): added test_degenerate_shapes_backends_2026_05_11.py covering Cat 3 HIGH geometric gaps (1x1 / 1xN / Nx1 reads on dask+numpy, GPU, dask+cupy backends; 1x1 / 1xN / Nx1 writes through write_geotiff_gpu) and Cat 2 MEDIUM NaN/Inf gaps (all-NaN read on GPU + dask+cupy, Inf / -Inf reads on all non-eager backends, NaN sentinel mask on dask read path including sentinel block split across chunk boundary). 23 tests, all passing on GPU host. Prior passes still hold: pass 4 (r4) closed read_geotiff_gpu/dask name= + max_pixels= kwargs (Cat 4), pass 3 (r3) closed read_vrt GPU/dask+GPU backend dispatch (Cat 1) and dtype/name kwargs (Cat 4)."
+geotiff,2026-05-11,,HIGH,2;3;4,"Pass 6 (2026-05-11): added test_overview_resampling_min_max_median_2026_05_11.py covering Cat 4 HIGH parameter-coverage gap on overview_resampling=min/max/median. CPU end-to-end paths were already covered by test_cog_overview_nodata_1613::test_cpu_cog_overview_aggregations_ignore_sentinel; the GPU end-to-end paths and the direct CPU+GPU block-reducer branches had no targeted tests, so a regression on those code paths would ship undetected. 26 tests, all passing on GPU host: block-reducer unit tests (finite + partial-NaN), end-to-end COG writes for both to_geotiff and write_geotiff_gpu, CPU/GPU parity for to_geotiff(gpu=True), CPU nodata-sentinel regression check, and ValueError error-path tests for unknown method names on both backends. Pass 5 (2026-05-11): added test_degenerate_shapes_backends_2026_05_11.py covering Cat 3 HIGH geometric gaps (1x1 / 1xN / Nx1 reads on dask+numpy, GPU, dask+cupy backends; 1x1 / 1xN / Nx1 writes through write_geotiff_gpu) and Cat 2 MEDIUM NaN/Inf gaps (all-NaN read on GPU + dask+cupy, Inf / -Inf reads on all non-eager backends, NaN sentinel mask on dask read path including sentinel block split across chunk boundary). 23 tests, all passing on GPU host. Prior passes still hold: pass 4 (r4) closed read_geotiff_gpu/dask name= + max_pixels= kwargs (Cat 4), pass 3 (r3) closed read_vrt GPU/dask+GPU backend dispatch (Cat 1) and dtype/name kwargs (Cat 4)."
 reproject,2026-05-10,,HIGH,1;4;5,"Added 39 tests: LiteCRS direct coverage, itrf_transform behaviour/roundtrip/array, itrf_frames, geoid_height numerical correctness + raster happy-path, vertical helpers (ellipsoidal<->orthometric/depth), reproject() lat/lon and latitude/longitude dim propagation. Note: _merge_arrays_cupy is imported but unused (no cupy merge dispatch in merge()); flagged as feature gap not test gap."
diff --git a/xrspatial/geotiff/tests/test_overview_resampling_min_max_median_2026_05_11.py b/xrspatial/geotiff/tests/test_overview_resampling_min_max_median_2026_05_11.py
@@ -0,0 +1,317 @@
+"""Parameter coverage for the ``overview_resampling`` modes
+``'min'``, ``'max'``, and ``'median'``.
+
+The CPU writer (``xrspatial.geotiff._writer._block_reduce_2d``) and the
+GPU writer (``xrspatial.geotiff._gpu_decode._block_reduce_2d_gpu``) both
+implement seven resampling reductions for COG overview generation:
+
+* ``mean`` -- covered by ``test_cog_overview_nodata_1613`` and
+  ``test_features``.
+* ``nearest`` -- covered by ``test_features`` and the same suite.
+* ``mode`` -- covered by ``test_mode_overview_perf``.
+* ``cubic`` -- covered by ``test_cog_cubic_overview_nodata_1623``.
+* ``min`` / ``max`` / ``median`` -- CPU end-to-end paths covered by
+  ``test_cog_overview_nodata_1613::test_cpu_cog_overview_aggregations_ignore_sentinel``,
+  but the GPU end-to-end paths and the direct CPU/GPU block-reducer
+  branches had no targeted tests prior to this file.
+
+Test coverage gap sweep 2026-05-11 (pass 6) closes a Cat 4 (parameter
+coverage) HIGH gap: the GPU end-to-end paths and the direct CPU+GPU
+block-reducer branches for ``overview_resampling='min'/'max'/'median'``
+had no targeted tests, so a regression on those code paths would ship
+undetected.
+
+The tests cover:
+
+* CPU writer (``to_geotiff(cog=True, overview_resampling='min'/'max'/'median')``)
+  with finite data, and with a nodata sentinel so the nan-aware
+  reductions (``nanmin`` / ``nanmax`` / ``nanmedian``) get exercised.
+* GPU writer (``write_geotiff_gpu`` and ``to_geotiff(gpu=True, ...)``)
+  for the three modes, verifying the cupy implementation matches the
+  CPU implementation within ``assert_allclose`` tolerance.
+* The ``cog=True`` overview-level read path round-trips the resampled
+  data so the full write/read pipeline is exercised.
+"""
+from __future__ import annotations
+
+import importlib.util
+
+import numpy as np
+import pytest
+import xarray as xr
+
+from xrspatial.geotiff import open_geotiff, to_geotiff
+from xrspatial.geotiff._writer import _block_reduce_2d
+
+
+def _gpu_available() -> bool:
+    """True if cupy is importable and reports CUDA as available."""
+    if importlib.util.find_spec("cupy") is None:
+        return False
+    try:
+        import cupy
+        return bool(cupy.cuda.is_available())
+    except Exception:
+        return False
+
+
+_HAS_GPU = _gpu_available()
+_gpu_only = pytest.mark.skipif(
+    not _HAS_GPU,
+    reason="cupy + CUDA required",
+)
+
+
+# ---------------------------------------------------------------------------
+# Input builders: 4x4 rasters with deterministic values so the 2x decimated
+# overview has a closed-form min / max / median for every 2x2 block.
+# ---------------------------------------------------------------------------
+
+def _arr_4x4_ramp() -> np.ndarray:
+    """4x4 float32 ramp.
+
+    Block layout (top-left 2x2, top-right 2x2, ...):
+
+        [ 1  2 | 3  4 ]
+        [ 5  6 | 7  8 ]
+        --------------
+        [ 9 10 |11 12 ]
+        [13 14 |15 16 ]
+
+    Per-block reductions:
+      * min:    [[1, 3], [9, 11]]
+      * max:    [[6, 8], [14, 16]]
+      * median: [[3.5, 5.5], [11.5, 13.5]]   (mean of the two middle values)
+    """
+    return np.arange(1, 17, dtype=np.float32).reshape(4, 4)
+
+
+def _arr_4x4_with_nan() -> np.ndarray:
+    """4x4 float32 ramp with one NaN per top-row 2x2 block.
+
+    Block layout:
+
+        [NaN  2 | 3 NaN]
+        [  5  6 | 7  8 ]
+        --------------
+        [  9 10 |11 12 ]
+        [ 13 14 |15 16 ]
+
+    Per-block reductions (NaN ignored):
+      * min:    [[2, 3], [9, 11]]
+      * max:    [[6, 8], [14, 16]]
+      * median: the four-cell median ignoring NaN. For top-left, finite
+        cells are {2, 5, 6} -> median 5. Top-right finite {3, 7, 8} -> 7.
+        Bottom rows are unchanged ramp medians.
+    """
+    arr = _arr_4x4_ramp()
+    arr[0, 0] = np.nan
+    arr[0, 3] = np.nan
+    return arr
+
+
+# Expected outputs for the finite ramp are hard-coded closed-form values;
+# the partial-NaN expectations are recomputed with numpy.nan* inside the tests.
+
+_RAMP_EXPECTED_MIN = np.array([[1.0, 3.0], [9.0, 11.0]], dtype=np.float32)
+_RAMP_EXPECTED_MAX = np.array([[6.0, 8.0], [14.0, 16.0]], dtype=np.float32)
+_RAMP_EXPECTED_MEDIAN = np.array([[3.5, 5.5], [11.5, 13.5]], dtype=np.float32)
+
+
+# ---------------------------------------------------------------------------
+# Cat 4 HIGH: CPU writer overview_resampling=min/max/median (block reducer).
+# ---------------------------------------------------------------------------
+
+@pytest.mark.parametrize("method, expected", [
+    ('min', _RAMP_EXPECTED_MIN),
+    ('max', _RAMP_EXPECTED_MAX),
+    ('median', _RAMP_EXPECTED_MEDIAN),
+])
+def test_block_reduce_2d_cpu(method, expected):
+    """``_block_reduce_2d`` returns the documented reduction per 2x2 block."""
+    arr = _arr_4x4_ramp()
+    out = _block_reduce_2d(arr, method)
+    np.testing.assert_allclose(out, expected)
+
+
+@pytest.mark.parametrize("method", ['min', 'max', 'median'])
+def test_block_reduce_2d_cpu_skips_nan(method):
+    """``_block_reduce_2d`` uses nan-aware reductions so partial-NaN
+    blocks aggregate over the finite cells only."""
+    arr = _arr_4x4_with_nan()
+    out = _block_reduce_2d(arr, method)
+    assert np.all(np.isfinite(out)), (
+        f"method={method!r} returned NaN for a partial-NaN block")
+
+    # Recompute expected via numpy nan-aware ops on the same 2x2 reshape.
+    blocks = arr.reshape(2, 2, 2, 2)
+    flat = blocks.transpose(0, 2, 1, 3).reshape(2, 2, 4)
+    if method == 'min':
+        expected = np.nanmin(flat, axis=2)
+    elif method == 'max':
+        expected = np.nanmax(flat, axis=2)
+    else:
+        expected = np.nanmedian(flat, axis=2)
+    np.testing.assert_allclose(out, expected.astype(np.float32))
+
+
+@pytest.mark.parametrize("method, expected", [
+    ('min', _RAMP_EXPECTED_MIN),
+    ('max', _RAMP_EXPECTED_MAX),
+    ('median', _RAMP_EXPECTED_MEDIAN),
+])
+def test_to_geotiff_cog_overview_resampling_cpu(tmp_path, method, expected):
+    """End-to-end: ``to_geotiff(cog=True, overview_resampling=method)``
+    writes a COG whose overview level 1 matches the closed-form 2x2
+    reduction."""
+    arr = _arr_4x4_ramp()
+    da = xr.DataArray(arr, dims=['y', 'x'])
+    p = str(tmp_path / f'cog_{method}.tif')
+    to_geotiff(da, p, cog=True, compression='deflate', tiled=True,
+               tile_size=2, overview_levels=[1],
+               overview_resampling=method)
+
+    ov = open_geotiff(p, overview_level=1)
+    np.testing.assert_allclose(np.asarray(ov.data), expected)
+
+
+@pytest.mark.parametrize("method", ['min', 'max', 'median'])
+def test_to_geotiff_cog_overview_resampling_cpu_nodata(tmp_path, method):
+    """CPU writer: nan-aware reductions skip the sentinel when ``nodata``
+    is set (the regression behind issue #1613), checked here on the
+    partial-NaN ramp for the min/max/median branches."""
+    arr = _arr_4x4_with_nan()
+    da = xr.DataArray(arr, dims=['y', 'x'])
+    p = str(tmp_path / f'cog_{method}_nodata.tif')
+    to_geotiff(da, p, nodata=-9999.0, cog=True, compression='deflate',
+               tiled=True, tile_size=2, overview_levels=[1],
+               overview_resampling=method)
+
+    ov = open_geotiff(p, overview_level=1)
+    out = np.asarray(ov.data)
+
+    # Recompute expected from the same nan-aware reduction on the source.
+    blocks = arr.reshape(2, 2, 2, 2)
+    flat = blocks.transpose(0, 2, 1, 3).reshape(2, 2, 4)
+    if method == 'min':
+        expected = np.nanmin(flat, axis=2)
+    elif method == 'max':
+        expected = np.nanmax(flat, axis=2)
+    else:
+        expected = np.nanmedian(flat, axis=2)
+    np.testing.assert_allclose(out, expected.astype(np.float32))
+
+
+# ---------------------------------------------------------------------------
+# Cat 4 HIGH: GPU writer overview_resampling=min/max/median.
+# ---------------------------------------------------------------------------
+
+@_gpu_only
+@pytest.mark.parametrize("method, expected", [
+    ('min', _RAMP_EXPECTED_MIN),
+    ('max', _RAMP_EXPECTED_MAX),
+    ('median', _RAMP_EXPECTED_MEDIAN),
+])
+def test_block_reduce_2d_gpu(method, expected):
+    """``_block_reduce_2d_gpu`` returns the same reduction as the CPU
+    block reducer for finite input."""
+    import cupy
+
+    from xrspatial.geotiff._gpu_decode import _block_reduce_2d_gpu
+
+    arr_cpu = _arr_4x4_ramp()
+    arr_gpu = cupy.asarray(arr_cpu)
+    out = _block_reduce_2d_gpu(arr_gpu, method)
+    np.testing.assert_allclose(cupy.asnumpy(out), expected)
+
+
+@_gpu_only
+@pytest.mark.parametrize("method", ['min', 'max', 'median'])
+def test_block_reduce_2d_gpu_matches_cpu_with_nan(method):
+    """GPU nan-aware reductions match CPU nan-aware reductions for a
+    partial-NaN block."""
+    import cupy
+
+    from xrspatial.geotiff._gpu_decode import _block_reduce_2d_gpu
+
+    arr_cpu = _arr_4x4_with_nan()
+    cpu_out = _block_reduce_2d(arr_cpu, method)
+    gpu_out = _block_reduce_2d_gpu(cupy.asarray(arr_cpu), method)
+    np.testing.assert_allclose(cupy.asnumpy(gpu_out), cpu_out)
+
+
+@_gpu_only
+@pytest.mark.parametrize("method, expected", [
+    ('min', _RAMP_EXPECTED_MIN),
+    ('max', _RAMP_EXPECTED_MAX),
+    ('median', _RAMP_EXPECTED_MEDIAN),
+])
+def test_write_geotiff_gpu_cog_overview_resampling(tmp_path, method, expected):
+    """End-to-end: ``write_geotiff_gpu(cog=True, overview_resampling=method)``
+    writes a COG whose overview level 1 matches the closed-form 2x2
+    reduction. Exercises the GPU make-overview path including the dispatch
+    on ``method``."""
+    import cupy
+
+    from xrspatial.geotiff import write_geotiff_gpu
+
+    arr = _arr_4x4_ramp()
+    arr_gpu = cupy.asarray(arr)
+    da = xr.DataArray(arr_gpu, dims=['y', 'x'])
+    p = str(tmp_path / f'cog_{method}_gpu.tif')
+    write_geotiff_gpu(da, p, cog=True, compression='deflate', tiled=True,
+                      tile_size=2, overview_levels=[1],
+                      overview_resampling=method)
+
+    ov = open_geotiff(p, overview_level=1)
+    np.testing.assert_allclose(np.asarray(ov.data), expected)
+
+
+@_gpu_only
+@pytest.mark.parametrize("method", ['min', 'max', 'median'])
+def test_to_geotiff_gpu_cog_overview_matches_cpu(tmp_path, method):
+    """``to_geotiff(gpu=True, ..., overview_resampling=method)`` produces
+    overview bytes that round-trip to the same values as the CPU writer."""
+    import cupy
+
+    arr = _arr_4x4_ramp()
+    da_cpu = xr.DataArray(arr, dims=['y', 'x'])
+    p_cpu = str(tmp_path / f'cog_{method}_cpu.tif')
+    to_geotiff(da_cpu, p_cpu, cog=True, compression='deflate', tiled=True,
+               tile_size=2, overview_levels=[1],
+               overview_resampling=method)
+
+    da_gpu = xr.DataArray(cupy.asarray(arr), dims=['y', 'x'])
+    p_gpu = str(tmp_path / f'cog_{method}_gpu_via_to_geotiff.tif')
+    to_geotiff(da_gpu, p_gpu, gpu=True, cog=True, compression='deflate',
+               tiled=True, tile_size=2, overview_levels=[1],
+               overview_resampling=method)
+
+    ov_cpu = np.asarray(open_geotiff(p_cpu, overview_level=1).data)
+    ov_gpu = np.asarray(open_geotiff(p_gpu, overview_level=1).data)
+    np.testing.assert_allclose(ov_gpu, ov_cpu)
+
+
+# ---------------------------------------------------------------------------
+# Error path: unknown method names raise ValueError on both backends.
+# ---------------------------------------------------------------------------
+
+def test_block_reduce_2d_cpu_unknown_method_raises():
+    """The CPU block reducer raises ``ValueError`` on an unknown method
+    name. Exercises the else-branch that lists the valid methods."""
+    arr = _arr_4x4_ramp()
+    with pytest.raises(ValueError, match="Unknown overview resampling"):
+        _block_reduce_2d(arr, 'bogus')
+
+
+@_gpu_only
+def test_block_reduce_2d_gpu_unknown_method_raises():
+    """The GPU block reducer raises ``ValueError`` on an unknown method
+    name, mirroring the CPU block reducer's error path."""
+    import cupy
+
+    from xrspatial.geotiff._gpu_decode import _block_reduce_2d_gpu
+
+    arr_gpu = cupy.asarray(_arr_4x4_ramp())
+    with pytest.raises(ValueError, match="Unknown GPU overview resampling"):
+        _block_reduce_2d_gpu(arr_gpu, 'bogus')