[GH-3073] Add geography support to SedonaFlink measurement and output functions #3074

Merged
jiayuasu merged 3 commits into apache:master from jiayuasu:flink-geography-functions
Jun 21, 2026

@jiayuasu

Member

Did you read the Contributor Guide?

Is this PR related to a ticket?

What changes were proposed in this PR?

Part of the geography parity effort for SedonaFlink (#3054), building on the type serializer (#3058) and constructors (#3061).

This adds Geography support to the SedonaFlink measurement and output functions, so geography columns can be measured and formatted, not just constructed. Each of the following gains a Geography eval overload in flink/.../expressions/Functions.java, delegating to org.apache.sedona.common.geography.Functions:

  • ST_Area (geodesic, m²)
  • ST_Length (geodesic, m)
  • ST_Distance (geodesic, m)
  • ST_Buffer (geog, radius[, useSpheroid | parameters] — 3 overloads)
  • ST_Centroid
  • ST_Envelope (geog, splitAtAntiMeridian)
  • ST_NPoints
  • ST_NumGeometries
  • ST_GeometryType
  • ST_AsText
  • ST_AsEWKT

Each overload uses @DataTypeHint(value = "RAW", rawSerializer = GeographyTypeSerializer.class, bridgedTo = Geography.class). Flink resolves geometry vs geography by the RAW bridgedTo type — the same mechanism ST_AsText already uses to support Box2D/Box3D — so a single registered function name serves both types and no new Catalog entries are required.

This mirrors Spark, where these functions accept either Geometry or Geography. Geography predicates (ST_Contains, ST_Intersects, ST_Within, ST_Equals, ST_DWithin) are tracked as a follow-up.
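
The shared-name idea described above (one function class, one eval overload per input type) can be illustrated with a minimal, Flink-free sketch. Everything in it is hypothetical: `SketchGeometry`, `SketchGeography`, and `StAsTextSketch` are stand-ins for the real classes, and the reflective `call` helper only approximates resolution by bridged class. It is not how Flink's planner is implemented, and the `@DataTypeHint` annotation is omitted so the code compiles with the JDK alone.

```java
// Hypothetical stand-ins for the real Geometry and Geography classes.
final class SketchGeometry {
  final String wkt;

  SketchGeometry(String wkt) {
    this.wkt = wkt;
  }
}

final class SketchGeography {
  final String wkt;

  SketchGeography(String wkt) {
    this.wkt = wkt;
  }
}

// One function name with two eval overloads. In the PR, each Geography overload
// also carries a @DataTypeHint with bridgedTo = Geography.class; that annotation
// is left out here (assumption: plain Java overloads are enough to show the idea).
final class StAsTextSketch {
  public String eval(SketchGeometry g) {
    return "geometry:" + g.wkt;
  }

  public String eval(SketchGeography g) {
    return "geography:" + g.wkt;
  }

  // Selects the eval overload whose parameter type equals the argument's class.
  // This is a rough analogue of choosing by bridged class, not Flink's own logic.
  static String call(Object arg) {
    try {
      java.lang.reflect.Method m = StAsTextSketch.class.getMethod("eval", arg.getClass());
      return (String) m.invoke(new StAsTextSketch(), arg);
    } catch (ReflectiveOperationException e) {
      throw new IllegalArgumentException("no eval overload for " + arg.getClass().getName(), e);
    }
  }
}
```

Calling `StAsTextSketch.call(new SketchGeography("POINT (0 0)"))` reaches the geography overload, while the same call with a `SketchGeometry` reaches the geometry one, so callers only ever see a single function name.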

How was this patch tested?

Added GeographyFunctionTest (12 tests) exercising each function end-to-end through the Flink Table API, asserting against the common.geography.Functions reference values, plus testGeometryStillWorks confirming the geometry overload still resolves on the shared function. No regressions: FunctionTest (205 geometry tests), GeographyConstructorTest, GeographyTypeSerializerTest, and ModuleTest all pass.

Did this PR include necessary documentation updates?

Copilot AI left a comment

Contributor

Pull request overview

Adds Geography parity for SedonaFlink’s measurement and output scalar functions by introducing Geography-typed eval overloads in the existing Flink UDF wrappers, delegating to org.apache.sedona.common.geography.Functions. This enables geodesic measurement/formatting on geography columns while keeping the same function names shared with geometry.

Changes:

  • Added Geography overloads for ST_Area, ST_Length, ST_Distance, ST_Buffer (3 overloads), ST_Centroid, ST_Envelope (with splitAtAntiMeridian), ST_NPoints, ST_NumGeometries, ST_GeometryType, ST_AsText, and ST_AsEWKT.
  • Added an end-to-end Flink Table API test suite (GeographyFunctionTest) covering the new Geography function paths and verifying geometry overload resolution still works.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| flink/src/main/java/org/apache/sedona/flink/expressions/Functions.java | Adds Geography eval overloads using RAW + GeographyTypeSerializer, delegating to common.geography.Functions. |
| flink/src/test/java/org/apache/sedona/flink/GeographyFunctionTest.java | Adds Table API integration tests for Geography measurement/output functions and shared-name overload resolution. |

Comment on lines +180 to +188
  @Test
  public void testBuffer() throws Exception {
    String wkt = "POINT (0 0)";
    Object out = eval(wkt, call(Functions.ST_Buffer.class.getSimpleName(), $("geog"), lit(1000.0)));
    Geography expected =
        org.apache.sedona.common.geography.Functions.buffer(
            Constructors.geogFromWKT(wkt, 4326), 1000.0);
    assertEquals(expected.toEWKT(), ((Geography) out).toEWKT());
  }

Member Author

Added testBufferWithParameters (the (Geography, radius, String) overload, asserted end-to-end against the common reference) and testBufferUseSpheroidThrows (the (Geography, radius, boolean) overload, asserting the clear IllegalArgumentException — which also confirms Flink resolves the boolean argument to this overload rather than the String one). Commit 779e766.

Comment on lines +169 to +178
  @Test
  public void testEnvelope() throws Exception {
    String wkt = "LINESTRING (0 0, 2 3)";
    Object out =
        eval(wkt, call(Functions.ST_Envelope.class.getSimpleName(), $("geog"), lit(false)));
    Geography expected =
        org.apache.sedona.common.geography.Functions.getEnvelope(
            Constructors.geogFromWKT(wkt, 4326), false);
    assertEquals(expected.toEWKT(), ((Geography) out).toEWKT());
  }

Member Author

Added testEnvelopeSplitAtAntiMeridian using an antimeridian-crossing input (LINESTRING (170 10, -170 20)) with splitAtAntiMeridian=true, asserting the result matches the common reference and is a MULTIPOLYGON (the split path). Commit 779e766.
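
To make the split path concrete, here is a rough sketch of why an input such as LINESTRING (170 10, -170 20) ends up as two pieces. It is not Sedona's implementation: `AntimeridianEnvelopeSketch` and `splitEnvelope` are hypothetical names, boxes are plain `{minLon, minLat, maxLon, maxLat}` arrays rather than polygons, and latitude bounds are taken from the two vertices only, which ignores any great-circle curvature a real geodesic envelope may have to account for.

```java
final class AntimeridianEnvelopeSketch {
  // Returns one box if the two-vertex line stays on one side of the antimeridian,
  // or two boxes (one ending at +180, one starting at -180) if it crosses.
  // Each box is {minLon, minLat, maxLon, maxLat} in degrees.
  static double[][] splitEnvelope(double lon1, double lat1, double lon2, double lat2) {
    double minLat = Math.min(lat1, lat2);
    double maxLat = Math.max(lat1, lat2);
    double west = Math.min(lon1, lon2);
    double east = Math.max(lon1, lon2);

    // Assumption: the edge follows the shorter way around the globe, so a
    // longitude span above 180 degrees means it wraps past +/-180.
    boolean crosses = (east - west) > 180.0;
    if (!crosses) {
      return new double[][] {{west, minLat, east, maxLat}};
    }
    return new double[][] {
      {east, minLat, 180.0, maxLat}, // piece touching +180
      {-180.0, minLat, west, maxLat} // piece touching -180
    };
  }
}
```

For the antimeridian-crossing input, `splitEnvelope(170, 10, -170, 20)` yields two boxes, analogous to the two members of the MULTIPOLYGON the test asserts on, while `splitEnvelope(0, 0, 2, 3)` yields a single box.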

jiayuasu merged commit efeff95 into apache:master on Jun 21, 2026
19 checks passed
jiayuasu added this to the sedona-1.9.1 milestone on Jun 21, 2026

Development

Successfully merging this pull request may close these issues.

SedonaFlink: add geography support to measurement and output functions

2 participants