Skip to content

[GH-2830] Adds Geography dual-dispatch to ST_Centroid#2852

Merged
jiayuasu merged 8 commits into
apache:masterfrom
zhangfengcdt:feature/geography.support.st_centroid
Apr 28, 2026
Merged

[GH-2830] Adds Geography dual-dispatch to ST_Centroid#2852
jiayuasu merged 8 commits into
apache:masterfrom
zhangfengcdt:feature/geography.support.st_centroid

Conversation

@zhangfengcdt
Copy link
Copy Markdown
Member

@zhangfengcdt zhangfengcdt commented Apr 22, 2026

Did you read the Contributor Guide?

Is this PR related to a ticket?

  • Yes, and the PR name follows the format [GH-XXX] my subject. Closes #<issue_number>

What changes were proposed in this PR?

  • Adds Geography dual-dispatch to ST_Centroid — returns a Geography point for Geography inputs (planar centroid, computed in projected lon/lat space, not on the sphere).
  • Use S2-spherical dispatcher S2 returns weighted (non-unit) centroids; final step normalizes via S2PAoint.normalize. Returns null if the centroid is undefined (empty input or vectors that cancel).

How was this patch tested?

  • mvn clean test -pl common -am -Dtest=FunctionTest
  • New Spark SQL case in GeographyFunctionTest
  • mvn spotless:apply run before commit; no format violations remain.

Did this PR include necessary documentation updates?

  • Yes, I have updated the documentation.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds Geography support for ST_Centroid via dual-dispatch so Geography inputs return a Geography point (planar centroid computed in projected lon/lat space), aligning Spark SQL behavior with the new common geography function set introduced in the Geography workstream.

Changes:

  • Add org.apache.sedona.common.geography.Functions.centroid(Geography) and corresponding unit tests.
  • Update Spark SQL ST_Centroid expression to dual-dispatch between Geometry and Geography implementations.
  • Add Spark SQL integration coverage + new Geography SQL docs entry for ST_Centroid.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
spark/common/src/main/scala/org/apache/spark/sql/sedona_sql/expressions/Functions.scala Adds Geography overload to ST_Centroid via InferredExpression dual-dispatch.
common/src/main/java/org/apache/sedona/common/geography/Functions.java Introduces Geography centroid implementation (JTS getCentroid() + wrap to WKBGeography).
common/src/test/java/org/apache/sedona/common/Geography/FunctionTest.java Adds unit tests for Geography centroid behavior and null handling.
spark/common/src/test/scala/org/apache/sedona/sql/geography/GeographyFunctionTest.scala Adds Spark SQL integration test for Geography centroid (currently uses an incompatible formatter).
docs/api/sql/geography/Geography-Functions/ST_Centroid.md Adds Geography SQL doc page for ST_Centroid (example currently uses an incompatible formatter).
docs/api/sql/geography/Geography-Functions.md Adds ST_Centroid to the Geography functions index table.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread docs/api/sql/geography/Geography-Functions/ST_Centroid.md Outdated
@zhangfengcdt zhangfengcdt marked this pull request as ready for review April 22, 2026 18:44
@zhangfengcdt zhangfengcdt requested a review from jiayuasu as a code owner April 22, 2026 18:44
@jiayuasu
Copy link
Copy Markdown
Member

@zhangfengcdt Are you sure this should go back to JTS and use planar geometry to get the centroid? Does S2 have something spherical?

S2 returns weighted (non-unit) centroids; final step normalizes via S2PAoint.normalize. Returns
null if the centroid is undefined (empty input or vectors that cancel).
@zhangfengcdt
Copy link
Copy Markdown
Member Author

@zhangfengcdt Are you sure this should go back to JTS and use planar geometry to get the centroid? Does S2 have something spherical?

Yes, you're right! I switched it to S2, need a bit more work on that though - S2 returns a non-unit weighted vector, so the result is normalized via S2Point.normalize before being wrapped as SinglePointGeography.

@jiayuasu jiayuasu merged commit df3f740 into apache:master Apr 28, 2026
43 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants