Skip to content

geotiff: add attrs pass-through locking test (#1984)#2004

Merged
brendancol merged 2 commits into
mainfrom
1984-pr6-attrs-passthrough-locking-test
May 17, 2026
Merged

geotiff: add attrs pass-through locking test (#1984)#2004
brendancol merged 2 commits into
mainfrom
1984-pr6-attrs-passthrough-locking-test

Conversation

@brendancol
Copy link
Copy Markdown
Contributor

@brendancol brendancol commented May 17, 2026

PR 6 of 7 on issue #1984 (attrs-contract roadmap).

Adds xrspatial/geotiff/tests/test_attrs_contract_passthrough_1984.py to pin the current round-trip behaviour of every key in the pass-through tier of the attrs contract. Tests only; no production code changes.

Background

The attrs contract has three tiers:

  1. Canonical: the writer consumes the attr, so round-trip is guaranteed.
  2. Pass-through: the writer does not consume the attr. The reader rebuilds the value from the GeoKey directory (or another TIFF tag) on read. Round-trip works only when the writer happens to emit a tag the reader can rebuild the attr from.
  3. Ignored: the writer never touches the attr; it is dropped silently.

This PR locks the current state of every pass-through key so the next PR has a list of promotion candidates and a regression net around whatever it touches.

Current state of every pass-through key

Key Source GeoKey / TIFF tag Round-trip
image_description tag 270 (ImageDescription) reconstructible
extra_samples tag 338 (ExtraSamples) reconstructible
colormap tag 320 (ColorMap, raw uint16) reconstructible
crs_name GTCitationGeoKey / ProjCitationGeoKey dropped
geog_citation GeogCitationGeoKey dropped
datum_code GeogGeodeticDatumGeoKey dropped
angular_units GeogAngularUnitsGeoKey dropped
linear_units ProjLinearUnitsGeoKey dropped
semi_major_axis GeogSemiMajorAxisGeoKey dropped
inv_flattening GeogInvFlatteningGeoKey dropped
projection_code ProjectionGeoKey dropped
vertical_crs VerticalCSTypeGeoKey dropped
vertical_citation VerticalCitationGeoKey dropped
vertical_units VerticalUnitsGeoKey dropped
colormap_rgba derived (reader gates on Photometric == 3) dropped
cmap derived (reader gates on Photometric == 3) dropped

The dropped rows reflect today's writer: build_geo_tags in xrspatial/geotiff/_geotags.py writes only the EPSG code into GeographicType / ProjectedCSType (plus GTCitationGeoKey on the user-defined WKT path). It does not emit the secondary GeoKeys the reader pulls those attrs from, so an EPSG round-trip cannot recover them. Per row, the call for review is whether this is intended behaviour or a gap to close in the next PR.

Additional invariants pinned

  • test_passthrough_dropped_when_no_crs: a file written without any CRS surfaces no pass-through attrs on read.
  • test_passthrough_does_not_promote_to_canonical: setting a pass-through key on a DataArray without crs / crs_wkt does not cause the writer to synthesise a CRS.

Test plan

pytest xrspatial/geotiff/tests/test_attrs_contract_passthrough_1984.py -x -q
pytest xrspatial/geotiff/tests -x -q

The new file's 18 cases pass locally. The broader geotiff suite keeps its pre-existing pass rate; two failures (test_predictor2_big_endian_gpu_1517 and test_size_param_validation_gpu_vrt_1776) reproduce on clean main without this PR's diff.

Pin the current best-effort round-trip behaviour of every pass-through
attr key so a later PR can decide which to promote to canonical.

Reconstructible on round-trip (writer emits a TIFF tag the reader
rebuilds the attr from):
  - image_description  (tag 270, ImageDescription)
  - extra_samples      (tag 338, ExtraSamples)
  - colormap           (tag 320, raw uint16 ColorMap triples)

Dropped on round-trip (writer never emits the GeoKey, reader has
nothing to rebuild from -- build_geo_tags only writes the EPSG code
into GeographicType / ProjectedCSType):
  - crs_name
  - geog_citation
  - datum_code
  - angular_units
  - linear_units
  - semi_major_axis
  - inv_flattening
  - projection_code
  - vertical_crs
  - vertical_citation
  - vertical_units
  - colormap_rgba   (reader gates on Photometric == 3)
  - cmap            (reader gates on Photometric == 3)

Also pin two cross-cutting invariants:
  - pass-through keys are absent in attrs when the source file has no
    CRS at all
  - setting a pass-through key on a DataArray without crs / crs_wkt
    does not cause the writer to synthesise a CRS

PR 6 of 7 on the attrs-contract roadmap. Tests only -- no production
code touched.
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 17, 2026
@brendancol
Copy link
Copy Markdown
Contributor Author

PR Review: geotiff: add attrs pass-through locking test

Blockers (must fix before merge)

None.

Suggestions (should fix, not blocking)

  • xrspatial/geotiff/tests/test_attrs_contract_passthrough_1984.py:142 — the colormap case writes a 256-entry colormap with [0, 128, 255]-style values. The TIFF ColorMap spec stores RGB triples as scaled uint16 (0-65535). If to_geotiff rescales internally, pin or document the convention; otherwise a future change to that conversion rule could break this test for non-substantive reasons.
  • xrspatial/geotiff/tests/test_attrs_contract_passthrough_1984.py:233-237warnings.simplefilter('ignore') hides every warning from the writer. If the writer ever raises a warning that signals data loss (e.g. dropping a key), this test would miss it. Narrow the filter to the specific UserWarning from the CRS WKT path, or assert the expected warning explicitly.

Nits (optional improvements)

  • xrspatial/geotiff/tests/test_attrs_contract_passthrough_1984.py:62-79, 118-145_PASSTHROUGH_CASES and _ALL_PASSTHROUGH_KEYS carry the same set in two forms. A small assert {c[0] for c in _PASSTHROUGH_CASES} == set(_ALL_PASSTHROUGH_KEYS) at module load (or as its own test) would keep the two from drifting.

What looks good

  • Parametrisation captures all 16 pass-through keys with one of two outcomes, which makes the promotion-to-canonical decisions in the next PR mechanical.
  • test_passthrough_dropped_when_no_crs and test_passthrough_does_not_promote_to_canonical cover the two cross-cutting invariants the per-key matrix does not.
  • Failure messages quote the writer reconstruction logic, so a regression points at what changed.

Checklist

  • All 16 pass-through keys parametrised
  • Cross-cutting invariants covered
  • No-CRS case covered
  • No production code touched

- Added ``test_passthrough_cases_cover_all_keys`` so the parametrised
  case table and ``_ALL_PASSTHROUGH_KEYS`` cannot drift silently.
- Documented the TIFF ColorMap uint16 convention next to the
  ``colormap`` parametrise entry so a future fixture-vs-writer rescale
  fails with a clear note pointing at the comment.

Left the broad ``warnings.simplefilter('ignore')`` in
``test_passthrough_does_not_promote_to_canonical`` as-is for now: the
no-CRS path does not emit a CRS-WKT warning today, and narrowing to a
specific category risks silently re-introducing the warning sensitivity
the original comment was guarding against. Worth a follow-up once the
writer's warning categories settle.
@brendancol brendancol merged commit 772db99 into main May 17, 2026
4 of 5 checks passed
brendancol added a commit that referenced this pull request May 18, 2026
* geotiff: add attrs canonical-tier locking test (#1984)

PR 4 of 7 from the attrs-contract plan in #1984. Adds
test_attrs_contract_canonical_1984.py: a single fixture that exercises
every canonical attr (crs, crs_wkt, transform, nodata, raster_type,
extra_tags, gdal_metadata, gdal_metadata_xml, x_resolution / y_resolution /
resolution_unit, _xrspatial_geotiff_contract), round-trips it through
to_geotiff -> open_geotiff, and asserts presence + value per key.

Sibling locking tests already cover the other tiers: aliases (#2002),
pass-through (#2004), and per-backend stamping of the contract version
(#2003). The canonical assertion list is what #1985 (parity matrix) and
#1986 (round-trip invariants) will import.

No production-code changes.

* geotiff: address review on attrs canonical-tier locking test

Addresses self-review on PR #2009:

* Add per-backend canonical-key presence parametrisation
  (eager-numpy, dask-numpy, gpu, dask-gpu). The 7-PR plan in #1984
  called for "one fixture per backend" coverage; previously only the
  eager read path was exercised.
* Comment ``_CANONICAL_KEYS`` to explain the deliberate omission of
  ``raster_type`` (the implicit 'area' default is encoded as absence,
  so the constant cannot express it; the two dedicated raster_type
  tests handle both branches).
* Drop ``AREA_OR_POINT: 'Area'`` from the shared ``_GDAL_META``
  fixture so the point-branch test does not inherit an inconsistent
  GDAL_METADATA entry.
* Relax the crs_wkt substring check to a regex covering 'WGS 84',
  'WGS_1984', and 'WGS-84' so the assertion survives PROJ-version
  variation across CI platforms.
* Note in a comment that ``gdal_metadata['TIFFTAG_SOFTWARE']`` and the
  raw Software TIFF tag (305) in ``extra_tags`` are independent
  channels the writer does not synchronise.
brendancol added a commit that referenced this pull request May 18, 2026
…) (#2013)

The pass-through tier of the attrs contract (#1984, locked in #2004)
flagged ``linear_units`` and ``projection_code`` as dropped on a
write + read round-trip: ``build_geo_tags`` only emits the primary
``GEOKEY_PROJECTED_CS_TYPE`` and never the secondary projected
GeoKeys (``GEOKEY_PROJECTION``, ``GEOKEY_PROJ_LINEAR_UNITS``), so
the reader has nothing to rebuild the attrs from.

Wrap the two emission sites in ``_populate_attrs_from_geo_info`` so
each one fires a ``DeprecationWarning`` referencing #1984. The
attrs are still emitted during the deprecation cycle; removal is a
follow-up.

Docstring: move both keys from the pass-through tier to a new
"Deprecated" section. A sibling PR is adding the same section for
geographic GeoKey attrs and may collide on this docstring block.

Tests: ``test_attrs_pr7_deprecate_projected_1984.py`` covers the
two warnings plus a guard that the attrs still populate during the
deprecation cycle.
brendancol added a commit that referenced this pull request May 18, 2026
)

The six geographic-CRS GeoKey-derived attrs (crs_name, geog_citation,
datum_code, angular_units, semi_major_axis, inv_flattening) were
documented in the attrs contract as best-effort pass-through, but the
PR6 locking test (#2004) showed they never round-trip: build_geo_tags
only emits the primary GEOKEY_GEOGRAPHIC_TYPE, so the secondary
GeoKeys these attrs come from are never written back.

PR 7 of the issue #1984 plan deprecates the read-side emission. Each
emission now fires a DeprecationWarning with a stable wording. The
values still land in attrs for one release cycle so external callers
can migrate; removal is scheduled for a later release.

Changes:

- _attrs.py: add _DEPRECATED_GEOGRAPHIC_GEOKEY_ATTRS tuple,
  _deprecated_geographic_geokey_warning text helper, and
  _emit_deprecated_geographic_geokey emission helper. Route the six
  attrs through the helper in _populate_attrs_from_geo_info.
- _attrs.py docstring: move the six attrs out of the best-effort
  pass-through tier and into a new "Deprecated" tier section.
- New test_attrs_pr7_deprecate_geographic_1984.py with parametrised
  warning assertions, still-emits-during-deprecation assertions, a
  no-warning-when-field-absent guard, and a warning-text shape check.
- CHANGELOG.md: document the deprecation under Unreleased.
brendancol added a commit that referenced this pull request May 18, 2026
) (#2011)

* geotiff: deprecate geographic-CRS GeoKey attrs in DataArray.attrs (#1984)

The six geographic-CRS GeoKey-derived attrs (crs_name, geog_citation,
datum_code, angular_units, semi_major_axis, inv_flattening) were
documented in the attrs contract as best-effort pass-through, but the
PR6 locking test (#2004) showed they never round-trip: build_geo_tags
only emits the primary GEOKEY_GEOGRAPHIC_TYPE, so the secondary
GeoKeys these attrs come from are never written back.

PR 7 of the issue #1984 plan deprecates the read-side emission. Each
emission now fires a DeprecationWarning with a stable wording. The
values still land in attrs for one release cycle so external callers
can migrate; removal is scheduled for a later release.

Changes:

- _attrs.py: add _DEPRECATED_GEOGRAPHIC_GEOKEY_ATTRS tuple,
  _deprecated_geographic_geokey_warning text helper, and
  _emit_deprecated_geographic_geokey emission helper. Route the six
  attrs through the helper in _populate_attrs_from_geo_info.
- _attrs.py docstring: move the six attrs out of the best-effort
  pass-through tier and into a new "Deprecated" tier section.
- New test_attrs_pr7_deprecate_geographic_1984.py with parametrised
  warning assertions, still-emits-during-deprecation assertions, a
  no-warning-when-field-absent guard, and a warning-text shape check.
- CHANGELOG.md: document the deprecation under Unreleased.

* geotiff: tighten PR7 geographic deprecation test parametrisation (#1984)

Address review S2 + N1 on PR #2011.

S2: drop the redundant ``field`` column from ``_DEPRECATED_CASES``.
Every entry had ``attr == field`` because every deprecated attr is
stored on ``GeoInfo`` under the same name it lands in ``attrs`` under.
Collapse to ``(attr, value)`` and update the two parametrised tests
to match.

N1: add ``test_deprecated_cases_has_no_duplicates`` so a duplicate
row in ``_DEPRECATED_CASES`` can no longer be silently absorbed by
the set-based coverage check.

Test count goes 15 -> 16.

* geotiff: walk past internal frames in deprecation warning stacklevel (#1984)

Address review S3 on PR #2011.

The geographic-GeoKey deprecation helper passed ``stacklevel=2`` to
``warnings.warn``, which pointed at ``_populate_attrs_from_geo_info``
rather than the user's ``open_geotiff(...)`` call. The exact distance
to the user frame also varies by backend: the numpy path adds three
internal frames, the dask path adds four. A fixed integer can only be
right for one of those paths.

Replace the fixed level with a frame walk
(``_stacklevel_to_external_caller``) that skips every
``xrspatial.geotiff*`` frame -- minus ``xrspatial.geotiff.tests`` so
the unit tests can still pose as external callers -- and returns the
matching ``stacklevel``. The new
``test_warning_stacklevel_points_at_caller_file`` pins the
attribution.

Today the warning category is ``DeprecationWarning``, which Python
silences by default for library code, so the stacklevel mostly
affects test output. Get it right now so a later switch to
``FutureWarning`` does not surface the warning as if it came from
``_attrs.py``.

Scope note: the inline ``warnings.warn`` calls for the projected and
vertical tiers (still pass ``stacklevel=2``) are left alone in this
commit; the next commit folds them into the generic helper and picks
up the new walk transparently.

Test count goes 16 -> 17.

* geotiff: fold deprecation emission into a generic GeoKey helper (#1984)

Address review S1 + N2 on PR #2011.

N2: promote ``_emit_deprecated_geographic_geokey`` to a generic
``_emit_deprecated_geokey_attr(attrs, name, value, *, reason)`` shared
by the geographic, projected, and vertical deprecation tiers
(introduced by sibling PRs #2010 and #2013 as inline
``warnings.warn`` calls during the rebase onto current main). The
per-category reason clauses live in two module-level constants
(``_GEOKEY_DEPRECATION_REASON_HORIZONTAL_CRS`` for geographic /
projected; ``_GEOKEY_DEPRECATION_REASON_VERTICAL_CRS`` for vertical),
which keeps the wording in lockstep and lets the existing
``test_warning_message_format`` assertion continue to pin the
canonical text verbatim. ``_emit_deprecated_geographic_geokey`` is
kept as a thin shim so the geographic call sites stay readable; the
projected and vertical sites in ``_populate_attrs_from_geo_info``
now call the generic helper directly and inherit the
external-caller stacklevel walk from commit B for free.

S1: the rebased docstring "Deprecated" section already grew a
migration recipe in the conflict resolution; this commit additionally
mirrors the recipe into ``docs/source/user_guide/attrs_contract.rst``
so the user-facing docs carry the same pyproj one-liners
(``crs.ellipsoid.inverse_flattening``, ``crs.datum.to_epsg()`` etc.)
that the docstring lists.

No new tests. The existing 17 deprecation tests plus the 42 contract
tests all still pass; the warning text and category are unchanged.

Sibling PRs #2010 and #2013 are already merged on main, so no
sibling rebase is needed -- this PR is the consolidation point.
brendancol added a commit that referenced this pull request May 18, 2026
…1984) (#2014)

* geotiff: deprecate matplotlib colormap variants in DataArray.attrs (#1984)

Issue #1984 PR 7. The reader emits ``attrs['cmap']`` (when matplotlib
is importable) and ``attrs['colormap_rgba']`` whenever the source file
declares Photometric=3 (palette). The writer never selects
Photometric=3 from attrs alone, so PR 6 (#2004) locked the round-trip
drop for both keys as part of the attrs-contract pass-through tier.

This PR deprecates the read-side emission of both keys for one
release cycle. Callers who want a matplotlib ListedColormap should
build one from the canonical ``attrs['colormap']`` (raw uint16 RGB
triples from TIFF tag 320) instead. ``attrs['colormap']`` is the
canonical-tier replacement and is intentionally kept: it round-trips
through ``_merge_friendly_extra_tags`` already.

During the deprecation window both attrs still land on the returned
DataArray; only a DeprecationWarning is added. Removal is a follow-up
PR. The new attrs-contract docstring split moves both keys into a
"Deprecated" section so the contract stays explicit about which keys
have a future.

The new test file
``xrspatial/geotiff/tests/test_attrs_pr7_deprecate_colormap_variants_1984.py``
pins three behaviours: the warning fires on a Photometric=3 fixture
for ``colormap_rgba`` (always) and for ``cmap`` (when matplotlib is
installed); the attrs are still emitted; the plain
``attrs['colormap']`` key is unaffected.

Existing tests that read palette fixtures (the TestPalette block in
``test_features.py`` and ``test_colormap_round_trip`` in
``test_metadata_round_trip_1484.py``) ignore the new DeprecationWarning
locally to keep their reports clean.

* geotiff: factor shared _emit_deprecated_geokey_attr helper

PR review on #2014 flagged that the three colormap deprecation sites
copy-paste the same nine-line ``warnings.warn(...)`` block with only
the attr name and migration hint varying. Sibling PRs #2010 / #2013
have the same shape; PR #2011 already factors a slice-specific
helper. Introduce a generic helper in ``_attrs.py`` so all four PR 7
slices can share it.

``_emit_deprecated_geokey_attr(attrs, name, value, *, reason,
migration=None)`` builds the canonical warning text shape:

    xrspatial.geotiff: attrs['<name>'] is deprecated; <reason>.
    [<migration>.] It will be removed in a future release. See
    issue #1984.

then sets ``attrs[name] = value`` (read-side emission still alive for
the deprecation window). ``stacklevel=2`` lands the warning at the
caller of ``_populate_attrs_from_geo_info``; ``DeprecationWarning`` is
silenced for library code by default in Python's filters, so this is
mainly visible to test runners and developers who opt in.

The colormap-variants deprecation now routes ``cmap`` and the two
``colormap_rgba`` branches (matplotlib present + ImportError fallback)
through the helper, replacing 27 lines of copy-paste with three
calls. The emitted warning text is byte-identical to the previous
inline strings, so the existing tests (``pytest.warns`` plus the
TestPalette / test_metadata_round_trip filter regexes) keep passing
unchanged.

Sibling PRs #2010 and #2013 can adopt the helper on rebase; #2011's
existing geographic-specific helper can be inlined onto this one in
a follow-up.

Test plan:
- pytest xrspatial/geotiff/tests/test_attrs_pr7_deprecate_colormap_variants_1984.py -- 4 passed.
- pytest xrspatial/geotiff/tests/test_attrs_contract_*_1984.py xrspatial/geotiff/tests/test_metadata_round_trip_1484.py -- 57 passed.
- pytest xrspatial/geotiff/tests/test_features.py::TestPalette -- 7 passed.

* geotiff: address self-review on shared deprecation helper

Self-review on PR #2014 flagged one blocker and two suggestions on
the shared-helper refactor. This commit addresses them.

Blocker -- ``stacklevel`` regression (``_attrs.py:179``):
    Wrapping the inline ``warnings.warn(..., stacklevel=2)`` call in
    a helper shifted the warning location by one frame; the warning
    was surfacing at ``_attrs.py`` (the helper call site) instead of
    at the backend that called ``_populate_attrs_from_geo_info``.
    Bumped ``stacklevel=2`` to ``stacklevel=3`` so the warning is
    attributed to the backend frame, matching the pre-refactor
    behaviour (verified empirically: warning now surfaces at
    ``xrspatial/geotiff/__init__.py:518`` instead of ``_attrs.py:350``).

Suggestion -- helper name (``_attrs.py:152``):
    Renamed ``_emit_deprecated_geokey_attr`` to ``_emit_deprecated_attr``
    because the colormap variants (cmap, colormap_rgba) come from
    TIFF tag 320 (ColorMap), not from a GeoKey. The new name covers
    every PR 7 slice; docstring notes why the ``_geokey_`` qualifier
    was dropped.

Suggestion -- user-guide doc:
    Added a new "Deprecated keys" section to
    ``docs/source/user_guide/attrs_contract.rst`` so users reading
    the contract page see the new tier alongside the existing
    canonical / compatibility-alias / pass-through tiers. Mirrors
    the module docstring.

Suggestion -- direct unit test for the helper:
    Added ``test_emit_deprecated_attr_with_migration_text`` and
    ``test_emit_deprecated_attr_without_migration_text``. The first
    pins the exact warning text shape; the second pins the
    ``migration=None`` branch which the existing colormap tests do
    not exercise.

Nit -- migration hints:
    Lifted ``_colormap_reason`` / ``_colormap_migration`` to module-
    level constants and split them into ``_DEPRECATED_COLORMAP_REASON``,
    ``_DEPRECATED_CMAP_MIGRATION``, and
    ``_DEPRECATED_COLORMAP_RGBA_MIGRATION``. The new
    ``colormap_rgba``-specific migration ("Reshape attrs['colormap']
    to (n_colors, 3) and append an alpha channel") is more direct
    than the previous shared ``ListedColormap`` hint, which only
    matched ``cmap`` cleanly.

Test plan:
- pytest test_attrs_pr7_deprecate_colormap_variants_1984.py (6 passed)
- pytest test_attrs_contract_*_1984.py test_metadata_round_trip_1484.py
  test_features.py::TestPalette -- 70 passed total
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant