Skip to content

Fix _write_vrt_tiled metadata drop vs to_geotiff (#1606)#1609

Merged
brendancol merged 2 commits into
mainfrom
deep-sweep-metadata-geotiff-2026-05-11-aa5f821c
May 11, 2026
Merged

Fix _write_vrt_tiled metadata drop vs to_geotiff (#1606)#1609
brendancol merged 2 commits into
mainfrom
deep-sweep-metadata-geotiff-2026-05-11-aa5f821c

Conversation

@brendancol
Copy link
Copy Markdown
Contributor

Summary

Closes #1606.

to_geotiff(da, "out.vrt") -- which dispatches to _write_vrt_tiled -- silently dropped a chunk of metadata that the matching to_geotiff(da, "out.tif") call preserved:

  • attrs['nodatavals'] / attrs['_FillValue'] -- the VRT path read attrs['nodata'] directly instead of routing through _resolve_nodata_attr. A rioxarray-style DataArray therefore lost its sentinel through .vrt but kept it through .tif.
  • attrs['gdal_metadata'] / attrs['gdal_metadata_xml']
  • attrs['extra_tags'] and the friendly tag attrs folded in by _merge_friendly_extra_tags
  • attrs['x_resolution'] / attrs['y_resolution'] / attrs['resolution_unit']
  • attrs['raster_type']

Fix

Pull the same rich-tag set to_geotiff already extracts (resolving nodata via the alias-aware _resolve_nodata_attr helper added in #1582) and thread it through _write_single_tile so every per-tile file carries the same metadata that the single-file .tif writer emits. _write_single_tile now accepts the extra kwargs and forwards them to the underlying write() call.

Test plan

  • New regression module test_vrt_tiled_metadata_1606.py covers nodatavals, _FillValue, gdal_metadata, x_resolution / y_resolution / resolution_unit, and raster_type propagation, plus a TIF-vs-VRT-tile parity test and a dask-backed variant
  • All 18 tests under test_vrt_write.py + test_vrt_tiled_metadata_1606.py pass
  • Full xrspatial/geotiff/tests/ suite: 1180 passed (3 pre-existing matplotlib palette failures unrelated to this change)
  • Verified across numpy / cupy / dask / dask+cupy backends manually

Found by /sweep-metadata.

`to_geotiff(da, "out.vrt")` (which dispatches to `_write_vrt_tiled`)
silently dropped a chunk of metadata that the matching
`to_geotiff(da, "out.tif")` call preserved:

* `attrs['nodatavals']` / `attrs['_FillValue']` -- the VRT path read
  `attrs['nodata']` directly instead of going through
  `_resolve_nodata_attr`, so a rioxarray-style DataArray round-tripped
  without its sentinel.
* `attrs['gdal_metadata']` / `attrs['gdal_metadata_xml']`
* `attrs['extra_tags']` and the friendly tag attrs folded in by
  `_merge_friendly_extra_tags`
* `attrs['x_resolution']` / `attrs['y_resolution']` /
  `attrs['resolution_unit']`
* `attrs['raster_type']`

Pull the same rich-tag set `to_geotiff` already extracts (resolving
nodata via the alias-aware helper) and thread it through
`_write_single_tile` so every per-tile file under the VRT carries
the same metadata it would have received from a single-file `.tif`
write. Verified across numpy / cupy / dask / dask+cupy backends.

Found by /sweep-metadata.
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 11, 2026
@brendancol brendancol requested a review from Copilot May 11, 2026 18:48
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes backend inconsistencies in xrspatial.geotiff.to_geotiff when writing VRT tiled outputs by ensuring _write_vrt_tiled preserves the same metadata set that the single-file GeoTIFF path already propagates (notably alias-resolved nodata and rich GDAL/TIFF tags), closing #1606.

Changes:

  • Update _write_vrt_tiled to resolve nodata via _resolve_nodata_attr and propagate rich metadata (GDAL metadata, extra tags, resolution tags, raster type) into each per-tile GeoTIFF.
  • Extend _write_single_tile to accept/forward the additional metadata kwargs to the underlying write() call.
  • Add a regression test module covering VRT tile metadata parity and a dask-backed VRT tiled path.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
xrspatial/geotiff/__init__.py Propagates alias-resolved nodata and rich metadata through the VRT tiled writer by threading new kwargs into per-tile writes.
xrspatial/geotiff/tests/test_vrt_tiled_metadata_1606.py Adds regression coverage for metadata propagation/parity for VRT-tiled outputs, including a dask-backed variant.
.claude/sweep-metadata-state.csv Records the #1606 sweep finding/state entry.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread xrspatial/geotiff/tests/test_vrt_tiled_metadata_1606.py Outdated
Comment thread xrspatial/geotiff/__init__.py Outdated
Comment thread xrspatial/geotiff/__init__.py Outdated
…ct_rich_tags

Three follow-ups from the Copilot review on PR #1609:

1. **Docstring typo** in `test_vrt_tiled_metadata_1606.py`: the module
   header listed `` `['resolution_unit']` `` instead of the proper
   `` `attrs['resolution_unit']` `` form. Fixed.

2. **Test coverage** for the previously-uncovered rich-tag paths:

   - `attrs['gdal_metadata_xml']` (pre-built XML string, bypasses the
     dict->XML builder) round-trips through `to_geotiff(..., '*.vrt')`
     and the per-tile reader.
   - `attrs['extra_tags']` entry (Software tag 305 ASCII) round-trips.
   - `attrs['image_description']` (friendly-tag attr folded into
     `extra_tags` as tag 270 by `_merge_friendly_extra_tags`)
     round-trips.

3. **Refactor**: the rich-tag extraction logic (`raster_type` mapping,
   `gdal_metadata_xml` build, `extra_tags` merge, `resolution_unit`
   string->id mapping with the `_unit_ids` dict) was duplicated across
   `to_geotiff`, `_write_vrt_tiled`, and `write_geotiff_gpu`. Lifted
   into a private `_extract_rich_tags(attrs) -> dict` helper that
   returns kwargs ready to splat into `write(...)`. All three callsites
   now use it. The string-to-id map for `resolution_unit` is also
   promoted to a module-level `_RESOLUTION_UNIT_IDS` constant so the
   helper does not rebuild it on every call.

Tests: full `xrspatial/geotiff/tests/` suite (minus three pre-existing
matplotlib palette failures unrelated to this PR) is 1179 passed.
@brendancol brendancol merged commit 1ce5959 into main May 11, 2026
10 of 11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

geotiff: _write_vrt_tiled drops nodatavals/gdal_metadata/extra_tags/resolution/raster_type

2 participants