Skip to content

Commit 8adb749

Browse files
authored
Hide read_to_array from xrspatial.geotiff public namespace (#1708) (#1711)
* Hide read_to_array from xrspatial.geotiff public namespace (#1708) The api-consistency sweep on 2026-05-12 flagged an orphan-API leak: xrspatial/geotiff/__init__.py imports ``read_to_array`` from ``._reader`` so the eager ``open_geotiff`` body can call it, but as a side-effect that bound the name in the public ``xrspatial.geotiff`` namespace. ``__all__`` does not list it, the module-level "Public API" docstring does not document it, and the Sphinx pages do not advertise it -- but ``from xrspatial.geotiff import read_to_array`` worked anyway. The library's own ``test_band_validation_1673.py`` imported it that way six times, proving the leak was being used as if it were public. External users who pattern-matched against the test would also be relying on a name the maintainer never committed to support. A future cleanup that removed the top-level import would have broken them with no deprecation path. This change binds the helper under a leading-underscore name in ``__init__.py``: from ._reader import UnsafeURLError from ._reader import read_to_array as _read_to_array and updates each internal call site (six call sites: ``open_geotiff``, the dask delayed window reader, and four GPU-fallback paths in ``read_geotiff_gpu``). The single test file that imported the leaked name is updated to import from ``xrspatial.geotiff._reader`` directly, which is the canonical private path. ``test_namespace_no_leak_1708.py`` pins the namespace contract: ``read_to_array`` must not be an attribute of ``xrspatial.geotiff``, ``__all__`` must not list it, and the canonical ``xrspatial.geotiff._reader.read_to_array`` import must still work. This guards against a future maintainer re-adding a bare ``from ._reader import read_to_array`` line without realising it would re-leak the name. No runtime behaviour change in the supported public API. The deprecated ``plot_geotiff`` continues to be exposed at the top level for backward compatibility because it has a documented deprecation path; this commit takes no position on ``plot_geotiff``. Closes #1708. * Patch _read_to_array alias in dask astype trace test The #1708 cleanup hid ``read_to_array`` from ``xrspatial.geotiff`` and bound the internal alias as ``_read_to_array``. The existing #1624 trace test still set ``xrspatial.geotiff.read_to_array`` via monkeypatch, which now raises ``AttributeError`` at the setattr call. Patch the alias the dask worker actually reads instead.
1 parent 258d16c commit 8adb749

4 files changed

Lines changed: 95 additions & 17 deletions

File tree

xrspatial/geotiff/__init__.py

Lines changed: 16 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -43,7 +43,14 @@
4343
from typing import BinaryIO
4444

4545
from ._geotags import GeoTransform, RASTER_PIXEL_IS_AREA, RASTER_PIXEL_IS_POINT
46-
from ._reader import UnsafeURLError, read_to_array
46+
from ._reader import UnsafeURLError
47+
# ``read_to_array`` is internal: it is used by ``open_geotiff`` and the
48+
# GPU fallback below but is not in ``__all__`` or the module-level
49+
# Public API docstring. Bind it under a leading-underscore name so it
50+
# does not leak into ``xrspatial.geotiff``'s public namespace. Tests
51+
# and internal callers that genuinely need it can import directly from
52+
# ``xrspatial.geotiff._reader``. See issue #1708.
53+
from ._reader import read_to_array as _read_to_array
4754
from ._writer import write
4855

4956
# All names below are part of the supported public API. ``plot_geotiff``
@@ -807,7 +814,7 @@ def open_geotiff(source, *, dtype=None,
807814
# in :func:`read_geotiff_dask`. That keeps the two backends in sync
808815
# on the contract without forcing a second metadata parse here. See
809816
# issue #1634.
810-
arr, geo_info = read_to_array(
817+
arr, geo_info = _read_to_array(
811818
source, window=window,
812819
overview_level=overview_level, band=band,
813820
**kwargs,
@@ -2171,9 +2178,9 @@ def _read(http_meta):
21712178
_r2a_kwargs = {}
21722179
if max_pixels is not None:
21732180
_r2a_kwargs['max_pixels'] = max_pixels
2174-
arr, _ = read_to_array(source, window=(r0, c0, r1, c1),
2175-
overview_level=overview_level,
2176-
band=band, **_r2a_kwargs)
2181+
arr, _ = _read_to_array(source, window=(r0, c0, r1, c1),
2182+
overview_level=overview_level,
2183+
band=band, **_r2a_kwargs)
21772184
if nodata is not None:
21782185
# ``arr`` was just decoded by ``_fetch_decode_cog_http_tiles``
21792186
# or ``read_to_array``; both return freshly-allocated buffers
@@ -2677,7 +2684,7 @@ def read_geotiff_gpu(source: str, *,
26772684
# its geo_info and apply our own transform update below so the
26782685
# result is correct regardless of merge order.
26792686
src.close()
2680-
arr_cpu, _ = read_to_array(
2687+
arr_cpu, _ = _read_to_array(
26812688
source, overview_level=overview_level)
26822689
arr_gpu = cupy.asarray(arr_cpu)
26832690
if orientation != 1:
@@ -2843,7 +2850,7 @@ def _read_once():
28432850
# Drop read_to_array's geo_info: orientation transform handling
28442851
# below operates on our pre-extracted geo_info so the 2/3/4 case
28452852
# is covered regardless of #1539's merge state.
2846-
arr_cpu, _ = read_to_array(
2853+
arr_cpu, _ = _read_to_array(
28472854
source, overview_level=overview_level)
28482855
arr_gpu = cupy.asarray(arr_cpu)
28492856
arr_was_cpu_decoded = True
@@ -2856,7 +2863,7 @@ def _read_once():
28562863
f"({height}, {width}, {samples})"
28572864
)
28582865
elif has_sparse_tile:
2859-
arr_cpu, _ = read_to_array(
2866+
arr_cpu, _ = _read_to_array(
28602867
source, overview_level=overview_level)
28612868
arr_gpu = cupy.asarray(arr_cpu)
28622869
arr_was_cpu_decoded = True
@@ -2913,7 +2920,7 @@ def _read_once():
29132920
RuntimeWarning,
29142921
stacklevel=2,
29152922
)
2916-
arr_cpu, _ = read_to_array(
2923+
arr_cpu, _ = _read_to_array(
29172924
source, overview_level=overview_level)
29182925
arr_gpu = cupy.asarray(arr_cpu)
29192926
arr_was_cpu_decoded = True

xrspatial/geotiff/tests/test_band_validation_1673.py

Lines changed: 9 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -46,7 +46,7 @@ def multiband_tiff_path(tmp_path):
4646

4747
def test_read_to_array_negative_band_rejected(multiband_tiff_path):
4848
"""``band=-1`` no longer silently selects the last channel."""
49-
from xrspatial.geotiff import read_to_array
49+
from xrspatial.geotiff._reader import read_to_array
5050

5151
path, _ = multiband_tiff_path
5252
with pytest.raises(IndexError, match="band=-1 out of range"):
@@ -55,7 +55,7 @@ def test_read_to_array_negative_band_rejected(multiband_tiff_path):
5555

5656
def test_read_to_array_band_equal_to_samples_rejected(multiband_tiff_path):
5757
"""``band=samples_per_pixel`` (off-by-one) raises a typed error."""
58-
from xrspatial.geotiff import read_to_array
58+
from xrspatial.geotiff._reader import read_to_array
5959

6060
path, _ = multiband_tiff_path
6161
# File has 3 bands; valid indices are 0, 1, 2.
@@ -65,7 +65,7 @@ def test_read_to_array_band_equal_to_samples_rejected(multiband_tiff_path):
6565

6666
def test_read_to_array_band_far_above_samples_rejected(multiband_tiff_path):
6767
"""A wildly out-of-range band index gives the same typed error."""
68-
from xrspatial.geotiff import read_to_array
68+
from xrspatial.geotiff._reader import read_to_array
6969

7070
path, _ = multiband_tiff_path
7171
with pytest.raises(IndexError, match="band=103 out of range"):
@@ -74,7 +74,7 @@ def test_read_to_array_band_far_above_samples_rejected(multiband_tiff_path):
7474

7575
def test_read_to_array_valid_band_still_works(multiband_tiff_path):
7676
"""Valid band indices keep working after the validation guard."""
77-
from xrspatial.geotiff import read_to_array
77+
from xrspatial.geotiff._reader import read_to_array
7878

7979
path, arr = multiband_tiff_path
8080
out, _ = read_to_array(path, band=1)
@@ -83,7 +83,7 @@ def test_read_to_array_valid_band_still_works(multiband_tiff_path):
8383

8484
def test_read_to_array_band_none_still_returns_all_bands(multiband_tiff_path):
8585
"""``band=None`` still returns the full multi-band array."""
86-
from xrspatial.geotiff import read_to_array
86+
from xrspatial.geotiff._reader import read_to_array
8787

8888
path, arr = multiband_tiff_path
8989
out, _ = read_to_array(path)
@@ -92,7 +92,8 @@ def test_read_to_array_band_none_still_returns_all_bands(multiband_tiff_path):
9292

9393
def test_backend_parity_negative_band(multiband_tiff_path):
9494
"""Local eager and dask paths raise the same error for ``band=-1``."""
95-
from xrspatial.geotiff import read_geotiff_dask, read_to_array
95+
from xrspatial.geotiff import read_geotiff_dask
96+
from xrspatial.geotiff._reader import read_to_array
9697

9798
path, _ = multiband_tiff_path
9899

@@ -110,7 +111,8 @@ def test_backend_parity_negative_band(multiband_tiff_path):
110111

111112
def test_backend_parity_band_equal_to_samples(multiband_tiff_path):
112113
"""Local eager and dask paths agree on the off-by-one rejection."""
113-
from xrspatial.geotiff import read_geotiff_dask, read_to_array
114+
from xrspatial.geotiff import read_geotiff_dask
115+
from xrspatial.geotiff._reader import read_to_array
114116

115117
path, _ = multiband_tiff_path
116118

xrspatial/geotiff/tests/test_dask_no_op_astype_1624.py

Lines changed: 6 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -104,7 +104,12 @@ def wrapped_r2a(*args, **kwargs):
104104
captured.append(tracked)
105105
return tracked, meta
106106

107-
monkeypatch.setattr(gt, 'read_to_array', wrapped_r2a)
107+
# ``read_geotiff_dask``'s per-chunk worker calls the alias
108+
# ``_read_to_array`` bound in ``xrspatial.geotiff``. Patch that
109+
# binding; patching ``_reader.read_to_array`` would not affect the
110+
# already-imported alias. See issue #1708 for why ``read_to_array``
111+
# is internal.
112+
monkeypatch.setattr(gt, '_read_to_array', wrapped_r2a)
108113

109114
dk = read_geotiff_dask(path, chunks=4)
110115
dk.compute()
Lines changed: 64 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,64 @@
1+
"""Regression test for #1708: ``read_to_array`` no longer leaks into
2+
``xrspatial.geotiff``'s public namespace.
3+
4+
Before the fix, ``xrspatial/geotiff/__init__.py`` did
5+
6+
from ._reader import UnsafeURLError, read_to_array
7+
8+
so ``from xrspatial.geotiff import read_to_array`` worked even though
9+
``read_to_array`` was not in ``__all__`` and not documented as public.
10+
The api-consistency sweep on 2026-05-12 flagged this as an orphan API
11+
(Cat 5): the name is reachable from the public namespace and tests
12+
relied on it, but the maintainer had not committed to it as a
13+
supported entry point. A future cleanup that removed the top-level
14+
import would have broken users with no deprecation path.
15+
16+
The fix binds the helper under a leading-underscore name in
17+
``__init__.py`` (``_read_to_array``) and updates the single internal
18+
test that depended on the top-level import to go through
19+
``xrspatial.geotiff._reader`` directly.
20+
21+
This module pins the public-namespace contract so a future maintainer
22+
who re-adds a bare ``from ._reader import read_to_array`` to
23+
``__init__.py`` fails CI before merging.
24+
"""
25+
from __future__ import annotations
26+
27+
import pytest
28+
29+
import xrspatial.geotiff as g
30+
31+
32+
def test_read_to_array_not_in_module_namespace():
33+
"""``read_to_array`` is internal and must not be a public attribute
34+
of ``xrspatial.geotiff``. Tests and internal callers go through
35+
``xrspatial.geotiff._reader`` directly."""
36+
assert not hasattr(g, 'read_to_array'), (
37+
"read_to_array leaked into xrspatial.geotiff's public namespace. "
38+
"It should be bound as _read_to_array (underscore-prefix) so it "
39+
"does not appear in `from xrspatial.geotiff import *` or tab "
40+
"completion. See issue #1708."
41+
)
42+
43+
44+
def test_read_to_array_not_in_all():
45+
"""``__all__`` does not advertise ``read_to_array``. (Pre-existing
46+
state, pinned so a future change can't quietly promote it without
47+
also adding it to the module-level Public API docstring.)"""
48+
assert 'read_to_array' not in g.__all__
49+
50+
51+
def test_read_to_array_still_importable_from_reader_submodule():
52+
"""The function itself is still reachable via the canonical private
53+
path so tests and internal code keep working."""
54+
from xrspatial.geotiff._reader import read_to_array
55+
56+
assert callable(read_to_array)
57+
58+
59+
def test_import_from_top_level_now_fails():
60+
"""Direct top-level import is the regression we are guarding
61+
against. Before #1708 this succeeded; afterwards it must raise
62+
ImportError."""
63+
with pytest.raises(ImportError):
64+
from xrspatial.geotiff import read_to_array # noqa: F401

0 commit comments

Comments
 (0)