Skip to content

Commit 29ebb71

Browse files
committed
geotiff: tracemalloc peak-memory check for #1736 regression test
Addresses two Copilot review comments on PR #1746: * Update the module docstring's stale reference to ``_compress_tiles`` (which doesn't exist) to point at the actual function name, ``xrspatial.geotiff._writer._write_tiled``. * Add real teeth behind the "no dead alloc" test-module name. The three round-trip tests only covered correctness; nothing in the file would have failed if the ``bytearray(n_tiles * tile_bytes)`` allocation were quietly restored. Two new tests snapshot ``tracemalloc`` peak across a direct ``_write_tiled`` call (uncompressed branch, single- and 3-band 1024x1024 uint8 inputs) and assert peak stays under 1.5x raster bytes. The current implementation lands at ~1.06-1.12x; the deleted bytearray pushed peak to ~2.07x. The 1.5x cap reliably catches reintroduction without flagging unrelated implementation drift. The tracemalloc approach was preferred over renaming the tests/file to "round-trip-only" because it preserves both the original intent of the test name and gives future maintainers an automatic alarm if the regression is reintroduced.
1 parent 3ce1681 commit 29ebb71

1 file changed

Lines changed: 80 additions & 11 deletions

File tree

xrspatial/geotiff/tests/test_writer_uncompressed_tiled_no_dead_alloc_1736.py

Lines changed: 80 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -1,26 +1,44 @@
11
"""Regression test for issue #1736.
22
3-
The uncompressed tiled-write branch of ``_compress_tiles`` previously
4-
allocated a contiguous ``bytearray`` plus a memoryview ``(n_tiles *
5-
tw * th * bytes_per_sample * samples)`` bytes long at the top of the
6-
loop and never read either back. Tile bytes were still built via
7-
``tile_arr.tobytes()`` and appended to a list. The dead buffer roughly
8-
doubled peak memory for an uncompressed write.
9-
10-
The fix is a pure deletion. This test pins the round-trip so a future
11-
refactor that re-introduces a real contiguous buffer keeps the same
12-
external behaviour: writing an uncompressed tiled GeoTIFF must still
13-
read back identically with no holes between tiles.
3+
The uncompressed tiled branch of ``xrspatial.geotiff._writer._write_tiled``
4+
previously allocated a contiguous ``bytearray`` plus a memoryview
5+
``(n_tiles * tw * th * bytes_per_sample * samples)`` bytes long at the
6+
top of the loop and never read either back. Tile bytes were still
7+
built via ``tile_arr.tobytes()`` and appended to a list. The dead
8+
buffer roughly doubled peak memory for an uncompressed write.
9+
10+
The fix is a pure deletion. The tests below cover both behaviours
11+
worth pinning:
12+
13+
* round-trip fidelity (writing an uncompressed tiled GeoTIFF must
14+
still read back identically with no holes between tiles); and
15+
* peak-memory shape, by snapshotting ``tracemalloc`` peak across a
16+
direct ``_write_tiled`` call. The current implementation lands at
17+
roughly ``1.06x`` the raw raster size; the dead bytearray pushed it
18+
to ``~2.07x``. The threshold below (``1.5x``) catches any
19+
reintroduction of that allocation with comfortable headroom for
20+
unrelated implementation changes.
1421
"""
1522
from __future__ import annotations
1623

1724
import os
25+
import tracemalloc
1826
import uuid
1927

2028
import numpy as np
2129
import xarray as xr
2230

2331
from xrspatial.geotiff import to_geotiff, open_geotiff
32+
from xrspatial.geotiff._compression import COMPRESSION_NONE
33+
from xrspatial.geotiff._writer import _write_tiled
34+
35+
36+
# Peak ``tracemalloc`` size, in multiples of the input raster size, that
37+
# the uncompressed branch of ``_write_tiled`` must stay under. The dead
38+
# bytearray drove peak to ~2.07x; the current implementation sits at
39+
# ~1.06-1.12x across the cases below. 1.5x leaves room for unrelated
40+
# refactors while still firmly catching the regression.
41+
_PEAK_RATIO_LIMIT = 1.5
2442

2543

2644
def test_uncompressed_tiled_round_trip_exact(tmp_path):
@@ -64,3 +82,54 @@ def test_uncompressed_tiled_round_trip_multiband(tmp_path):
6482

6583
out = open_geotiff(p)
6684
np.testing.assert_array_equal(out.data, data)
85+
86+
87+
def _peak_ratio_for_write_tiled(data: np.ndarray, tile_size: int) -> float:
88+
"""Return ``tracemalloc`` peak / ``data.nbytes`` for one
89+
``_write_tiled`` call against the uncompressed branch.
90+
91+
Allocations made before this call are excluded from peak by the
92+
``reset_peak`` step, so the ratio reflects what ``_write_tiled``
93+
itself adds.
94+
"""
95+
tracemalloc.start()
96+
try:
97+
tracemalloc.reset_peak()
98+
_write_tiled(data, COMPRESSION_NONE, 1, tile_size=tile_size)
99+
_current, peak = tracemalloc.get_traced_memory()
100+
finally:
101+
tracemalloc.stop()
102+
return peak / data.nbytes
103+
104+
105+
def test_uncompressed_tiled_peak_memory_single_band():
106+
"""Peak memory for the uncompressed branch should stay below
107+
``_PEAK_RATIO_LIMIT * raster_bytes``. Reintroducing the dead
108+
``bytearray(n_tiles * tile_bytes)`` would push the ratio to ~2x
109+
and fail this check."""
110+
h, w = 1024, 1024 # 1 MB raw, exact tile divisor -> no edge padding
111+
data = np.random.RandomState(20260512).randint(
112+
0, 255, size=(h, w), dtype=np.uint8,
113+
)
114+
ratio = _peak_ratio_for_write_tiled(data, tile_size=256)
115+
assert ratio < _PEAK_RATIO_LIMIT, (
116+
f"_write_tiled peak memory {ratio:.2f}x raster exceeds the "
117+
f"{_PEAK_RATIO_LIMIT}x cap; the dead bytearray from #1736 may "
118+
f"have been reintroduced."
119+
)
120+
121+
122+
def test_uncompressed_tiled_peak_memory_multiband():
123+
"""3-band variant of the peak-memory check. ``samples == 3``
124+
triples the would-be dead buffer, so this case is even more
125+
sensitive to a regression."""
126+
h, w = 1024, 1024
127+
data = np.random.RandomState(20260513).randint(
128+
0, 255, size=(h, w, 3), dtype=np.uint8,
129+
)
130+
ratio = _peak_ratio_for_write_tiled(data, tile_size=256)
131+
assert ratio < _PEAK_RATIO_LIMIT, (
132+
f"_write_tiled peak memory {ratio:.2f}x raster exceeds the "
133+
f"{_PEAK_RATIO_LIMIT}x cap; the dead bytearray from #1736 may "
134+
f"have been reintroduced."
135+
)

0 commit comments

Comments
 (0)