geotiff: validate VRT writer source compatibility (#1733)#1741
Conversation
There was a problem hiding this comment.
Pull request overview
This PR tightens xrspatial.geotiff._vrt.write_vrt by validating that all GeoTIFF sources in a VRT mosaic are mutually compatible, preventing silent generation of mis-typed or mis-positioned mosaics (issue #1733).
Changes:
- Add per-source compatibility validation in
write_vrtfor pixel size (rtol), dtype (sample_format + bps), band count, and CRS. - Update
write_vrtdocstring to document the stricter contract and nodata precedence. - Add regression tests covering several mismatch cases and tolerance acceptance.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
xrspatial/geotiff/_vrt.py |
Adds multi-source validation logic and updates the write_vrt docstring to match the new contract. |
xrspatial/geotiff/tests/test_vrt_writer_source_compat_1733.py |
New regression tests for source-compatibility validation behaviors. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| m_crs = m.get('crs_wkt') | ||
| if first_crs and m_crs and m_crs != first_crs: | ||
| raise ValueError( | ||
| f"VRT source {m['path']!r} has CRS WKT that does not " | ||
| f"match the first source. All sources in a VRT must " | ||
| f"share the same CRS." | ||
| ) |
There was a problem hiding this comment.
Addressed in ff56e14: changed the guard from first_crs and m_crs and m_crs != first_crs to (first_crs or m_crs) and m_crs != first_crs. Now any asymmetric setting (one source has a CRS, the other doesn't, or vice versa) is also treated as a mismatch and raises ValueError. Added regression tests for both directions of the asymmetric case plus the two-different-non-empty-CRS case in test_vrt_writer_source_compat_1733.py.
| # Compare magnitudes; pixel_height is negative for the common | ||
| # north-up case so a direct equality test would mis-flag a | ||
| # source with the opposite sign. |
There was a problem hiding this comment.
Addressed in ff56e14: rewrote the comment to describe what _pixel_size_mismatch actually does (relative-tolerance comparison on the magnitude) and noted that opposite signs are correctly treated as a mismatch because the magnitude ratio falls outside the tolerance. No reference to the misleading direct-equality framing remains.
| def test_mismatched_pixel_size_raises(tmp_path): | ||
| d = _unique_dir(tmp_path, "px") | ||
| a = os.path.join(d, "a.tif") | ||
| b = os.path.join(d, "b.tif") | ||
| _write_tif(a, h=4, w=4, dtype=np.float32, px=1.0, py=-1.0) |
There was a problem hiding this comment.
Addressed in ff56e14: added five new test cases in test_vrt_writer_source_compat_1733.py covering the CRS code path. test_mismatched_crs_raises checks two different non-empty CRS values, test_asymmetric_crs_raises_first_set_second_missing and test_asymmetric_crs_raises_first_missing_second_set cover both directions of the asymmetric case, and test_matching_crs_succeeds plus test_both_missing_crs_succeeds defend against an overly aggressive check. All 11 tests in the file pass.
write_vrt previously read metadata from every source but used only the first source's pixel size, sample format + bps (dtype), band count, and CRS to write the VRT. A mismatched source would silently produce a VRT that misplaced pixels or mis-typed data downstream. Enforce the docstring contract: every source must agree with the first on pixel size (within 1e-6 relative tolerance), sample format + bps, band count, and CRS WKT (when both sides set it). Mismatches raise ValueError naming the offending source and field. Document nodata precedence: caller arg wins over the first source's per-band nodata. Adds regression coverage in test_vrt_writer_source_compat_1733.py.
Addresses Copilot review on #1741: - CRS guard was only firing when both ``first_crs`` and ``m_crs`` were truthy. A later source with a missing or empty CRS would silently pass and the VRT would still be tagged with the first source's CRS, which can misplace data. Now any asymmetric setting (one side set, the other not) is treated as a mismatch. - Rewrite the ``_pixel_size_mismatch`` comment so it describes what the function actually does (relative-tolerance comparison, with opposite signs correctly flagged) instead of implying a broken equality check. - Add tests covering two different non-empty CRS values, both asymmetric directions, matching CRS, and both-missing CRS.
Summary
write_vrtpreviously used only the first source for pixel size, dtype, band count, and CRS, ignoring the rest ofsources_meta. Mismatched inputs silently produced a misplaced or mis-typed mosaic.ValueErrornaming the source and the field that disagreed.Closes #1733.
Test plan
test_vrt_writer_source_compat_1733.pycovers pixel-size, dtype, and band-count mismatches; same-dir success; float-tolerance acceptance; single-source case.