Skip to content

geotiff: document attrs tier classification in _attrs.py (#1984)#2000

Merged
brendancol merged 2 commits into
mainfrom
1984-pr1-attrs-tier-docstring
May 17, 2026
Merged

geotiff: document attrs tier classification in _attrs.py (#1984)#2000
brendancol merged 2 commits into
mainfrom
1984-pr1-attrs-tier-docstring

Conversation

@brendancol
Copy link
Copy Markdown
Contributor

Summary

PR 1 of 7 from the implementation plan on issue #1984.

Documents the three-tier attrs contract in the module docstring of xrspatial/geotiff/_attrs.py. Docstring-only; no behaviour change.

The three tiers are:

  • Canonical: keys xrspatial owns and round-trips without loss (crs, crs_wkt, transform, nodata, raster_type, extra_tags, gdal_metadata, gdal_metadata_xml, x_resolution, y_resolution, resolution_unit, _xrspatial_geotiff_contract).
  • Compatibility alias: read for ecosystem interop, never emitted by writers when the canonical key is present (nodatavals from rioxarray, _FillValue from CF).
  • Best-effort pass-through: preserved when the writer can reconstruct from canonical state, otherwise dropped (GeoKey-derived fields, image_description, extra_samples, colormap variants).

The contract version key _xrspatial_geotiff_contract is referenced in the docstring; population of that key lands in a later PR in the series.

Note on nodata

Issue #1988 (declared-vs-masked nodata split) is a prerequisite for tightening the nodata canonical semantics. The docstring records the intended semantics post-#1988: nodata is the declared file sentinel as stored in the GDAL_NODATA tag.

Test plan

  • pytest xrspatial/geotiff/tests/test_attrs_parity_1548.py -x -q (5 passed)

Extends the module docstring with the three-tier attrs contract from
issue #1984:

- Canonical keys (crs, crs_wkt, transform, nodata, raster_type,
  extra_tags, gdal_metadata, gdal_metadata_xml, x_resolution,
  y_resolution, resolution_unit, _xrspatial_geotiff_contract) are owned
  by xrspatial and survive round-trip.
- Compatibility aliases (nodatavals, _FillValue) are read for ecosystem
  interop but writers never emit them when the canonical key is set.
- Best-effort pass-through keys (GeoKey-derived fields, image_description,
  extra_samples, colormap variants) are preserved when the writer can
  reconstruct them from canonical state.

Docstring-only change. The contract version key is mentioned for
reference; it is populated by a later PR in the series.
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 17, 2026
@brendancol
Copy link
Copy Markdown
Contributor Author

PR Review: geotiff: document attrs tier classification in _attrs.py

Blockers (must fix before merge)

  • xrspatial/geotiff/_attrs.py:27 — the transform definition is wrong. The docstring says tuple of (origin_x, origin_y, pixel_width, pixel_height) (a 4-tuple), but the value emitted by _transform_tuple_from_pixel_geometry is a rasterio-style 6-tuple (pixel_width, 0.0, origin_x, 0.0, pixel_height, origin_y). See xrspatial/geotiff/_coords.py:91-115. PR geotiff: add attrs contract documentation page (#1984) #2001 documents the same key with a different (also wrong) layout, so both need to be fixed together. Correct text: tuple of (pixel_width, 0.0, origin_x, 0.0, pixel_height, origin_y).

Suggestions (should fix, not blocking)

  • xrspatial/geotiff/_attrs.py:25crs_wkt is described as "Always emitted." The reader emits it only when geo_info.crs_wkt is not None (see _attrs.py:107). The user-guide page in PR geotiff: add attrs contract documentation page (#1984) #2001 says "Always present on read when any CRS information is available", which matches the code. Match that wording.
  • xrspatial/geotiff/_attrs.py:35gdal_metadata is the parsed dict and gdal_metadata_xml is the raw XML string. The writer prefers the XML form when both are present. A one-line note about that precedence would help.

Nits (optional improvements)

  • Worth documenting that extra_tags is omitted when no out-of-band tags are present, similar to how transform is omitted on no-georef files.

What looks good

Checklist

- transform: corrected layout from (origin_x, origin_y, pixel_width,
  pixel_height) to the rasterio-style 6-tuple
  (pixel_width, 0.0, origin_x, 0.0, pixel_height, origin_y) that
  _transform_tuple_from_pixel_geometry actually emits, and noted that
  it is omitted on files with no GeoTIFF transform tags.
- crs_wkt: replaced "Always emitted" (the reader only emits it when
  geo_info.crs_wkt is not None) with the more accurate "Present on
  read whenever any CRS information is available". Also dropped the
  unconfirmed "PROJ" qualifier on the WKT dialect.
- gdal_metadata_xml: noted that writers prefer this over the parsed
  dict when both attrs are present.
- extra_tags: noted that the attr is omitted when no out-of-band tags
  are present, matching the conditional emission in
  _populate_attrs_from_geo_info.
@brendancol brendancol merged commit cf53494 into main May 17, 2026
4 of 5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant