Skip to content

Inherit georef from level-0 IFD when reading overview IFDs (#1640)#1641

Merged
brendancol merged 2 commits into
mainfrom
deep-sweep-metadata-geotiff-2026-05-11-aee6c04a
May 12, 2026
Merged

Inherit georef from level-0 IFD when reading overview IFDs (#1640)#1641
brendancol merged 2 commits into
mainfrom
deep-sweep-metadata-geotiff-2026-05-11-aee6c04a

Conversation

@brendancol
Copy link
Copy Markdown
Contributor

Summary

Closes #1640.

open_geotiff(path, overview_level=N) with N >= 1 was silently
dropping CRS, the transform attr, and georeferenced y/x coords.
GDAL-style COG writers (including this package's to_geotiff) put
GeoKeys / ModelPixelScale / ModelTiepoint only on the level-0 IFD, so
the reader's per-IFD extract_geo_info call returned an empty
GeoTransform and no CRS for every overview read. Downstream spatial
ops then derived pixel sizes from the integer fallback coords and got
silently-wrong answers.

The fix lives in a new helper,
extract_geo_info_with_overview_inheritance. When the selected IFD
is a reduced-resolution overview (NewSubfileType bit 0) and its own
georef is missing, the helper re-runs extract_geo_info on the first
full-resolution IFD and rescales the pixel size by
width_full / width_overview (same for height) before copying the
CRS-side metadata across. Overviews that already carry their own
geokeys (some writers replicate them) are returned unchanged.

The helper is wired into every read entry point so all four backends
benefit: read_to_array (eager numpy and dask chunk workers),
_read_cog_http (HTTP COGs), _read_geo_info (dask metadata probe),
and read_geotiff_gpu (cupy and dask+cupy).

Test plan

  • New regression tests in tests/test_overview_geo_inheritance_1640.py:
    • CRS / transform / coord-extent across overview levels 0..3 on each
      of the four backends (numpy, dask+numpy, cupy, dask+cupy)
    • Overview IFD that carries its own geokeys keeps them (no clobber)
    • File with no full-res sibling falls back without raising
    • Level-0 reads are byte-for-byte unchanged
  • Full geotiff suite passes (1363 pass, two pre-existing matplotlib
    deepcopy failures in test_features.py unrelated to this change)
  • Broader xrspatial/tests/ suite passes (3528 passed, 15 skipped)

GDAL-style COG writers, including this package's own to_geotiff, put
the GeoKeyDirectory, ModelPixelScale, and ModelTiepoint only on the
level-0 IFD. Before this fix, open_geotiff(path, overview_level=N)
with N >= 1 returned a DataArray whose CRS, transform attr, and y/x
coords were silently replaced with defaults (unit transform, integer
pixel indices, no crs). Downstream spatial ops then computed pixel
sizes from the wrong coords and produced silently-wrong answers.

Add extract_geo_info_with_overview_inheritance in _geotags.py. When
the selected IFD is a reduced-resolution overview (NewSubfileType bit
0) and its own extract_geo_info reports has_georef=False, the helper
re-runs extract_geo_info on the first full-resolution IFD and rescales
the pixel size by width_full / width_overview (and the same for
height) before copying CRS-side metadata across. Overviews that
already carry their own valid georef (some writers replicate it) are
left untouched.

Wire the helper into every read entry point:
- read_to_array (eager numpy and dask chunk workers)
- _read_cog_http (HTTP COGs)
- _read_geo_info (dask metadata probe)
- read_geotiff_gpu (cupy and dask+cupy paths)

Add xrspatial/geotiff/tests/test_overview_geo_inheritance_1640.py
covering all four backends (numpy, dask+numpy, cupy, dask+cupy) plus
two edge cases: an overview that carries its own geokeys keeps them,
and a file with no full-res sibling falls back without raising.
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 12, 2026
@brendancol brendancol merged commit 03d7380 into main May 12, 2026
11 of 12 checks passed
brendancol added a commit that referenced this pull request May 12, 2026
PR #1641 (issue #1640) inherits level-0 georef on overview reads but
keeps the level-0 origin_x / origin_y unchanged. That is correct for
PixelIsArea -- origin is the upper-left corner of pixel (0, 0), which
is also the upper-left corner of the overview's pixel (0, 0). It is
wrong for PixelIsPoint (GeoKey 1025 = 2): the origin is the center of
pixel (0, 0), and an overview pixel covering the first scale_x columns
of level 0 has its center at the centroid of those level-0 pixels
(origin + (scale - 1) * 0.5 * pixel_size_lvl0).

Before this fix, open_geotiff on a 1024x1024 PixelIsPoint COG with
10 m pixels and origin (0, 0) returned x[:3] = [0, 20, 40] for
overview_level=1 instead of [5, 25, 45]. Downstream sel / interp /
reproject silently snaps to the wrong pixel for any DEM-style
PixelIsPoint COG (USGS, OpenTopography, Copernicus DEM).

Fix in extract_geo_info_with_overview_inheritance: choose the
effective raster_type first (overview's own when it explicitly
declared non-default, otherwise inherit from level 0), then apply
origin_shift = (scale - 1) * 0.5 * pixel_size_lvl0 along each axis
when that effective raster_type is PixelIsPoint. The PixelIsArea path
is byte-equivalent to before.

Add 13 regression tests in test_overview_pixel_is_point_1642.py:
centroid identity across all four backends, transform-tuple values
across all four backends, uniform grid step, unit-level helper tests
for both raster_types via stubbed extract_geo_info, an own-geokeys-
not-clobbered case on PixelIsPoint, and a PixelIsArea regression
check so the #1640 contract still holds. All 1397 existing non-network
geotiff tests still pass.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

geotiff: open_geotiff(overview_level>=1) drops CRS, transform, and georeferenced coords

1 participant