Skip to content

[GH-2043] Geopandas.GeoSeries: Implement is_valid_reason, make_valid#2044

Merged
jiayuasu merged 8 commits into
apache:masterfrom
petern48:is_valid_reason_series
Jul 6, 2025
Merged

[GH-2043] Geopandas.GeoSeries: Implement is_valid_reason, make_valid#2044
jiayuasu merged 8 commits into
apache:masterfrom
petern48:is_valid_reason_series

Conversation

@petern48
Copy link
Copy Markdown
Member

@petern48 petern48 commented Jul 2, 2025

Did you read the Contributor Guide?

Is this PR related to a ticket?

What changes were proposed in this PR?

How was this patch tested?

Added tests

Did this PR include necessary documentation updates?

  • Yes, I have updated the documentation.

@petern48 petern48 marked this pull request as ready for review July 3, 2025 22:35
@petern48 petern48 requested a review from jiayuasu as a code owner July 3, 2025 22:35
@petern48 petern48 requested a review from zhangfengcdt July 3, 2025 22:35
Copy link
Copy Markdown
Member

@zhangfengcdt zhangfengcdt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jiayuasu jiayuasu requested a review from Copilot July 6, 2025 17:01
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds full implementations for GeoSeries.is_valid_reason and GeoSeries.make_valid and covers them with new tests.

  • Implemented is_valid_reason to call the underlying Spark function ST_IsValidReason.
  • Implemented make_valid supporting the structure method via ST_MakeValid.
  • Added tests for both methods in two test modules, with version guards.

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
python/sedona/geopandas/geoseries.py Added implementations for is_valid_reason and make_valid.
python/tests/geopandas/test_match_geopandas_series.py Added tests for is_valid_reason and make_valid with version guards.
python/tests/geopandas/test_geoseries.py Added direct tests of is_valid_reason and make_valid.
Comments suppressed due to low confidence (2)

python/sedona/geopandas/geoseries.py:707

  • [nitpick] The example docblock for is_valid_reason shows 5 input geometries but only indexes 0–3 before None. Update the printed snippet to include the fifth index (4) for clarity.
        >>> s = GeoSeries(

python/tests/geopandas/test_match_geopandas_series.py:370

  • The alias ps is not imported in this test, so ps.Series will raise a NameError. Add import pyspark.pandas as ps or use the existing alias for pandas-on-Spark.
        assert isinstance(sgpd_result, ps.Series)

def test_is_valid_reason(self):
pass
# is_valid_reason was added in geopandas 1.0.0
if gpd.__version__ < "1.0.0":
Copy link

Copilot AI Jul 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comparing versions as plain strings can fail for multi-digit versions (e.g. "1.10" < "1.2"). Consider using packaging.version.parse or distutils.version.LooseVersion for reliable version checks.

Copilot uses AI. Check for mistakes.
import shapely

# 'structure' method requires shapely >= 2.1.0
if shapely.__version__ < "2.1.0":
Copy link

Copilot AI Jul 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using plain string comparison for shapely.__version__ can be incorrect; switch to a version parser (e.g., packaging.version.parse) to ensure accurate comparisons.

Suggested change
if shapely.__version__ < "2.1.0":
if parse(shapely.__version__) < parse("2.1.0"):

Copilot uses AI. Check for mistakes.
Comment on lines +495 to +500
sgpd_result = GeoSeries(geom).make_valid(method="structure")
gpd_result = gpd.GeoSeries(geom).make_valid(method="structure")
self.check_sgpd_equals_gpd(sgpd_result, gpd_result)

for _, geom in self.geoms:
sgpd_result = GeoSeries(geom).make_valid(
Copy link

Copilot AI Jul 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Passing a single geometry object directly to GeoSeries treats it as a sequence of coords. Wrap geom in a list (e.g. GeoSeries([geom])) to construct a series with one element.

Suggested change
sgpd_result = GeoSeries(geom).make_valid(method="structure")
gpd_result = gpd.GeoSeries(geom).make_valid(method="structure")
self.check_sgpd_equals_gpd(sgpd_result, gpd_result)
for _, geom in self.geoms:
sgpd_result = GeoSeries(geom).make_valid(
sgpd_result = GeoSeries([geom]).make_valid(method="structure")
gpd_result = gpd.GeoSeries(geom).make_valid(method="structure")
self.check_sgpd_equals_gpd(sgpd_result, gpd_result)
for _, geom in self.geoms:
sgpd_result = GeoSeries([geom]).make_valid(

Copilot uses AI. Check for mistakes.
@jiayuasu jiayuasu merged commit 3cf4f43 into apache:master Jul 6, 2025
26 checks passed
@petern48 petern48 deleted the is_valid_reason_series branch July 7, 2025 16:39
Kontinuation pushed a commit to Kontinuation/sedona that referenced this pull request Jan 21, 2026
…valid (apache#2044)

* Implement is_valid_reason

* Implement make_valid

* Skip match make_valid tests for shapely < 2.1.0

* Skip match is_valid_reason for gpd version < 1.0.0

* Copy over modified check_sgpd_equals_gpd

* Use sedona instead of geopandas in doc strings
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Geopandas.GeoSeries: Implement is_valid_reason, make_valid

4 participants