[SEDONA-738] Add moran i autocorrelation. #1975
I need to add docs
and update the description of the MR
Thanks for this, Pawel. I think we need an ST function for this. Currently it is asymmetric with the other stats functions in this regard. I also believe that we should deprecate the function-call interfaces for the geostats functions and tell users to use the ST functions instead. If we go in this direction, I would advise against adding Python and public Scala functions for Moran's I (other than ST functions for the DataFrame interfaces, of course).
@james-willis Is your SQL function framework able to support Pawel's Moran's I case as well? Maybe you can point him to the right place for this?
Sure. We should be able to wrap the function in this PR with an implementation of a PhysicalFunction. Here are the existing ones. Here is the PR that added SQL support to geostats.
Pull Request Overview
This PR introduces Moran’s I spatial autocorrelation to the Scala, Java, and Python APIs, including implementations, wrappers, tests, and documentation.
- Added Scala implementation and Java result container for Moran’s I
- Exposed Moran’s I in Python with wrapper and updated weighting functions
- Added unit tests (Scala and Python) and extended SQL API documentation
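For readers unfamiliar with the statistic, here is a minimal pure-Python sketch of global Moran's I. It is not the Scala implementation from this PR: the function names `morans_i` and `chain_weights` and the binary "adjacent neighbours on a line" weighting are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

# Hypothetical helper names; this is an illustration, not Sedona's API.
Weights = Dict[Tuple[int, int], float]


def morans_i(values: List[float], weights: Weights) -> float:
    """Global Moran's I for values indexed 0..n-1 and sparse weights w[(i, j)]."""
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    total_weight = sum(weights.values())
    # Weighted sum of cross-products of deviations between neighbours.
    cross = sum(w * deviations[i] * deviations[j] for (i, j), w in weights.items())
    squared = sum(d * d for d in deviations)
    return (n / total_weight) * (cross / squared)


def chain_weights(n: int) -> Weights:
    """Binary symmetric weights: each point neighbours the next one on a line."""
    weights: Weights = {}
    for i in range(n - 1):
        weights[(i, i + 1)] = 1.0
        weights[(i + 1, i)] = 1.0
    return weights


# Similar values sit next to each other: positive autocorrelation (about 0.5).
trend = morans_i([1.0, 2.0, 3.0, 4.0, 5.0], chain_weights(5))

# Neighbours always differ: negative autocorrelation (about -1.0).
alternating = morans_i([1.0, -1.0, 1.0, -1.0], chain_weights(4))
```

The two sample inputs mirror the positive and negative cases that the PR's tests are described as covering, although the actual fixtures may differ.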
Reviewed Changes
Copilot reviewed 12 out of 13 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| spark/common/src/main/scala/org/apache/sedona/stats/autocorelation/Moran.scala | Core Scala implementation of Moran’s I |
| spark/common/src/main/java/org/apache/sedona/stats/autocorrelation/MoranResult.java | Java container class for result values |
| spark/common/src/main/scala/org/apache/sedona/stats/Weighting.scala | Added addDistanceBandColumnPython bridge method |
| spark/common/src/test/scala/org/apache/sedona/stats/autocorellation/MoranTest.scala | Scala tests for positive, negative, and zero cases |
| spark/common/src/test/scala/org/apache/sedona/stats/autocorellation/AutoCorrelationFixtures.scala | Test fixtures for auto-correlation scenarios |
| python/sedona/spark/stats/autocorrelation/moran.py | Python wrapper for Moran’s I |
| python/tests/stats/test_moran.py | Python integration tests for Moran’s I |
| python/sedona/spark/stats/weighting.py | Updated Python weighting functions to call wrapper |
| python/sedona/spark/stats/hotspot_detection/getis_ord.py | Fixed DataFrame wrapping in g_local call |
| python/sedona/spark/register/java_libs.py | Registered Moran in JVM libs enum |
| python/sedona/spark/stats/autocorrelation/__init__.py | Initialized Python autocorrelation package |
| docs/api/stats/sql.md | Documented MoranI algorithm and usage |
Comments suppressed due to low confidence (3)
spark/common/src/test/scala/org/apache/sedona/stats/autocorellation/MoranTest.scala:1
- The package name 'autocorellation' is misspelled and does not match the implementation package 'autocorrelation'. Please rename the directory and package declaration to 'autocorrelation' so the tests compile correctly.
/*
docs/api/stats/sql.md:142
- [nitpick] Minor grammatical issue: remove 'the' before '1' to read 'close to 1', and consider rephrasing to improve clarity.
location and non-spatial attribute. When the value is close to the 1 it
docs/api/stats/sql.md:146
- Grammar correction: change 'has' to 'have' since 'values' is plural.
correlation. Negative correlation means that close values has dissimilar values.
…ation/AutoCorrelationFixtures.scala Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@Imbruced I will merge this for now, but the doc needs significant improvement. I believe we need comprehensive end-to-end examples for each stats function.
Filed a ticket for the ST function: #2115
* SEDONA-738 Add moran i autocorrelation.
* SEDONA-738 Fix unit tests.
* SEDONA-738 Fix unit tests.
* SEDONA-738 Fix unit tests.
* SEDONA-738 Fix unit tests.
* SEDONA-738 Fix unit tests.
* SEDONA-738 Fix scala 2.13 issue
* Update spark/common/src/test/scala/org/apache/sedona/stats/autocorellation/AutoCorrelationFixtures.scala
  Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Fix typos
* Update doc
---------
Co-authored-by: Jia Yu <jiayu@apache.org>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Did you read the Contributor Guide?
Is this PR related to a ticket?
SEDONA-738
What changes were proposed in this PR?
Moran's I index
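For context, the textbook definition of global Moran's I is given here. Whether the implementation in this PR normalizes the weights in exactly this way is an assumption, not something the description states. Here $x_i$ is the attribute value at location $i$, $\bar{x}$ is its mean, and $w_{ij}$ is the spatial weight between locations $i$ and $j$:

```latex
I = \frac{n}{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}}
    \cdot
    \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}\,(x_i - \bar{x})(x_j - \bar{x})}
         {\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
\mathbb{E}[I] = -\frac{1}{n-1} \ \text{under no spatial autocorrelation.}
```

Values well above the expected value suggest that nearby locations carry similar attribute values, while values well below it suggest that nearby locations carry dissimilar ones.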
How was this patch tested?
Unit tests
Did this PR include necessary documentation updates?
Yes