Skip to content

[SEDONA-730] Add to_wkt() to sedona.geopandas API#1914

Closed
prantogg wants to merge 12 commits into
apache:masterfrom
prantogg:add-geopandas-to_wkt
Closed

[SEDONA-730] Add to_wkt() to sedona.geopandas API#1914
prantogg wants to merge 12 commits into
apache:masterfrom
prantogg:add-geopandas-to_wkt

Conversation

@prantogg
Copy link
Copy Markdown
Contributor

Did you read the Contributor Guide?

Is this PR related to a ticket?

  • Yes, and the PR name follows the format [SEDONA-XXX] my subject.

What changes were proposed in this PR?

  • Adds to_wkt() for GeoDataFrame and GeoSeries

How was this patch tested?

  • Passes new and existing tests

Did this PR include necessary documentation updates?

@prantogg prantogg requested a review from jiayuasu as a code owner April 11, 2025 07:31
@prantogg prantogg requested review from jiayuasu and zhangfengcdt and removed request for jiayuasu April 11, 2025 07:31
select_expressions = []

# Read the keyword arguments
rounding_precision = kwargs.get("rounding_precision", None)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the default should be 6 as per shapely.to_wkt() to replicate GeoPandas behavior. What do you think?

And we should also add support for the output_dimension parameter.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO we should not stray sedona.geopandas' default behavior from sedona's default behavior. Open to suggestions cc. @jiayuasu @zhangfengcdt?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer to keep it as close as possible to the geopandas' behavior. The main goal of this project is to allow people to use their existing geopandas code with minimum changes in outcomes.

first_col = self.get_first_geometry_column()

# Read the keyword arguments
rounding_precision = kwargs.get("rounding_precision", None)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here

Copy link
Copy Markdown
Contributor

@furqaankhan furqaankhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left a few comments. Amazing start!

@jiayuasu
Copy link
Copy Markdown
Member

@prantogg sorry. Feng has refactored the design of these functions: #1916. Now the implementation of to_wkt() should be much simpler than before. Can you rebase this PR and reimplement the function?

@furqaankhan
Copy link
Copy Markdown
Contributor

@jiayuasu I think the new design will only help the functions that don't have kwargs to tune behavior, such as centroid, length, and area. But functions like to_wkt and to_wkb (which I am implementing) aren't relatively simple functions as kwargs are altering the behavior. Let's take to_wkb as an example, it has hex, byte_order, output_dimension, include_srid args to give a different output. We should keep functions like these as is and leverage the new design wherever we can.

@zhangfengcdt
Copy link
Copy Markdown
Member

@jiayuasu I think the new design will only help the functions that don't have kwargs to tune behavior, such as centroid, length, and area. But functions like to_wkt and to_wkb (which I am implementing) aren't relatively simple functions as kwargs are altering the behavior. Let's take to_wkb as an example, it has hex, byte_order, output_dimension, include_srid args to give a different output. We should keep functions like these as is and leverage the new design wherever we can.

I think the _process_geometry_columns should actually be able to handle the kwargs bjut it needs to map to the SQL ST function one to one and with the same order. I guess we should do some exercises to map the geopandas functions parameters to the sedona st functions.

@jiayuasu
Copy link
Copy Markdown
Member

@prantogg can you redo your PR according to the latest change on the master branch? Now you should be able to simplify your code

@prantogg
Copy link
Copy Markdown
Contributor Author

@prantogg can you redo your PR according to the latest change on the master branch? Now you should be able to simplify your code

Fixing this now

@furqaankhan
Copy link
Copy Markdown
Contributor

@jiayuasu I think this function can't be implemented using _process_geometry_column, as it might have to call a couple of ST functions before converting it to WKT. For example, there's a kwarg in shapely.to_wkt called output_dimension, that would basically mean there would be 2 ST functions called internally, ST_AsText(ST_Force3D(<geom-column>))

@prantogg prantogg closed this Apr 15, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants