Skip to content

[#10983] feat(lance): Upgrade lance to 4.0.1 and lance-namespace to 0.7.5#11060

Open
yuqi1129 wants to merge 12 commits into
apache:mainfrom
yuqi1129:issue_10983
Open

[#10983] feat(lance): Upgrade lance to 4.0.1 and lance-namespace to 0.7.5#11060
yuqi1129 wants to merge 12 commits into
apache:mainfrom
yuqi1129:issue_10983

Conversation

@yuqi1129
Copy link
Copy Markdown
Contributor

What changes were proposed in this pull request?

Upgrades the Lance dependencies:

  • lance-core: 2.0.14.0.1
  • lance-namespace-core: 0.4.50.7.5

API-breaking changes addressed:

  1. TransactionSourcedTransaction in LancePartitionStatisticStorage: Dataset.newTransactionBuilder() now returns SourcedTransaction.Builder instead of Transaction.Builder.
  2. CreateEmptyTableRequest/ResponseDeclareTableRequest/Response: createEmptyTable() was removed from the LanceNamespace interface; all usages migrated to declareTable().
  3. TableApi.createTable() signature: Two new optional parameters (properties, storageOptions) added; pass null for both since Gravitino reads from HTTP headers.
  4. AlterColumnsEntry.rename is now JsonNullable<String>: Added a JAX-RS @Provider (JsonNullableMapperProvider) that registers JsonNullableModule with the ObjectMapper used by the lance-rest-server.

Testability fix:

  • Introduced a package-private DatasetHolder wrapper class in LancePartitionStatisticStorage so the Caffeine cache holds DatasetHolder instead of Dataset directly. lance 4.0.1's Dataset has many JNI native methods that prevent Mockito's inline mock-maker from instrumenting it.

Why are the changes needed?

lance 4.0.1 and lance-namespace 0.7.5 are the latest stable releases. Keeping dependencies up to date ensures we receive bug fixes, performance improvements, and security patches.

Fix: #10983

Does this PR introduce any user-facing change?

No. The REST API contract is unchanged. The internal migration from createEmptyTable to declareTable is transparent to users.

How was this patch tested?

All unit tests in :core:test, :lance:lance-common:test, and :lance:lance-rest-server:test pass (including the previously failing testCloseReleasesCachedDatasetBeforeAllocator and testAlterColumns).

- Bump lance-core from 2.0.1 to 4.0.1 in gradle/libs.versions.toml
- Bump lance-namespace-core from 0.4.5 to 0.7.5 in gradle/libs.versions.toml

API changes addressed:
- Transaction -> SourcedTransaction in LancePartitionStatisticStorage
  (Transaction.commit() removed; newTransactionBuilder() now returns
   SourcedTransaction.Builder)
- CreateEmptyTableRequest/Response -> DeclareTableRequest/Response
  (createEmptyTable() removed from LanceNamespace interface)
- TableApi.createTable() gains two new String params (properties,
  storageOptions); pass null for both since Gravitino reads from headers
- AlterColumnsEntry.rename is now JsonNullable<String>; register
  JsonNullableModule via new JsonNullableMapperProvider for correct
  Jackson serialization in the lance-rest-server

Testability fix:
- Introduce DatasetHolder wrapper in LancePartitionStatisticStorage so
  that the cached Dataset is mockable in tests; lance 4.0.1 Dataset
  has many JNI native methods that prevent Mockito's inline mock maker
  from instrumenting the class directly

Closes apache#10983

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 12, 2026 15:11
@yuqi1129 yuqi1129 marked this pull request as draft May 12, 2026 15:12
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR upgrades the Lance/Lance-namespace dependencies and updates Gravitino’s Lance integration to accommodate upstream API changes (declare table APIs, transaction builder type change, OpenAPI JsonNullable handling), including test adjustments to keep the suite stable with Lance 4.x’s JNI-heavy classes.

Changes:

  • Bumped lance-core to 4.0.1 and lance-namespace-core to 0.7.5 and migrated call sites (e.g., declareTable, updated TableApi.createTable signature).
  • Added a JAX-RS @Provider to register JsonNullableModule for correct (de)serialization of JsonNullable<T> fields in lance-namespace models.
  • Refactored LancePartitionStatisticStorage dataset caching to wrap Dataset in a mockable holder and updated related tests.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
lance/lance-rest-server/src/test/java/org/apache/gravitino/lance/service/rest/TestLanceNamespaceOperations.java Updates REST tests for declare/create-empty changes and registers JsonNullableModule in the test ObjectMapper.
lance/lance-rest-server/src/test/java/org/apache/gravitino/lance/integration/test/LanceRESTServiceIT.java Migrates integration tests from createEmptyTable to declareTable and updates TableApi.createTable call signature.
lance/lance-rest-server/src/main/java/org/apache/gravitino/lance/service/rest/LanceTableOperations.java Updates create-empty endpoint to return the new response type (DeclareTableResponse).
lance/lance-rest-server/src/main/java/org/apache/gravitino/lance/service/rest/JsonNullableMapperProvider.java Adds a Jersey ContextResolver<ObjectMapper> registering JsonNullableModule.
lance/lance-common/src/main/java/org/apache/gravitino/lance/common/ops/LanceTableOperations.java Updates the deprecated createEmptyTable return type to align with the new model classes.
lance/lance-common/src/main/java/org/apache/gravitino/lance/common/ops/gravitino/GravitinoLanceTableOperations.java Adapts the Gravitino-backed implementation to return DeclareTableResponse for the deprecated path.
gradle/libs.versions.toml Bumps Lance dependency versions.
core/src/test/java/org/apache/gravitino/stats/storage/TestLancePartitionStatisticStorage.java Adjusts tests to mock a dataset holder rather than Dataset directly.
core/src/main/java/org/apache/gravitino/stats/storage/LancePartitionStatisticStorage.java Switches to caching a DatasetHolder, updates transaction builder usage, and adds safe closing behavior.

- Transaction -> SourcedTransaction (variable type)
- writeParams(Map) -> transactionProperties(Map) on SourcedTransaction.Builder
- dataset.commitTransaction(trans) -> trans.commit()

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 13, 2026

Code Coverage Report

Overall Project 66.03% 🟢
Files changed 66.56% 🟢

Module Coverage
aliyun 1.72% 🔴
api 47.13% 🟢
authorization-common 85.96% 🟢
aws 1.08% 🔴
azure 2.47% 🔴
catalog-common 10.2% 🔴
catalog-fileset 80.02% 🟢
catalog-glue 83.41% 🟢
catalog-hive 81.83% 🟢
catalog-jdbc-clickhouse 79.18% 🟢
catalog-jdbc-common 43.93% 🟢
catalog-jdbc-doris 80.28% 🟢
catalog-jdbc-hologres 54.03% 🟢
catalog-jdbc-mysql 79.23% 🟢
catalog-jdbc-oceanbase 78.38% 🟢
catalog-jdbc-postgresql 82.05% 🟢
catalog-jdbc-starrocks 78.27% 🟢
catalog-kafka 77.01% 🟢
catalog-lakehouse-generic 45.14% 🟢
catalog-lakehouse-hudi 79.1% 🟢
catalog-lakehouse-iceberg 86.98% 🟢
catalog-lakehouse-paimon 76.85% 🟢
catalog-model 77.72% 🟢
cli 44.51% 🟢
client-java 77.96% 🟢
common 50.0% 🟢
core 82.19% +0.2% 🟢
filesystem-hadoop3 76.97% 🟢
flink 0.0% 🔴
flink-common 43.17% 🟢
flink-runtime 0.0% 🔴
gcp 14.12% 🔴
hadoop-common 10.39% 🔴
hive-metastore-common 46.83% 🟢
iceberg-common 55.46% 🟢
iceberg-rest-server 69.61% 🟢
idp-basic 94.68% 🟢
integration-test-common 0.0% 🔴
jobs 66.17% 🟢
lance-common 19.95% -5.82% 🔴
lance-rest-server 61.66% +22.99% 🟢
lineage 53.02% 🟢
optimizer 82.95% 🟢
optimizer-api 21.95% 🔴
server 85.8% 🟢
server-common 71.21% 🟢
spark 32.79% 🔴
spark-common 39.09% 🔴
trino-connector 35.14% 🔴
Files
Module File Coverage
core LancePartitionStatisticStorage.java 97.74% 🟢
lance-common LanceTableOperations.java 0.0% 🔴
GravitinoLanceTableOperations.java 0.0% 🔴
lance-rest-server LanceTableOperations.java 96.09% 🟢
JsonNullableMapperProvider.java 0.0% 🔴

@yuqi1129 yuqi1129 self-assigned this May 13, 2026
@yuqi1129 yuqi1129 marked this pull request as ready for review May 13, 2026 12:05
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 16 out of 16 changed files in this pull request and generated 5 comments.

Comment thread clients/client-python/build.gradle.kts
Comment thread clients/client-python/tests/integration/test_lance_ray.py Outdated
Comment thread clients/client-python/tests/integration/test_lance_ray.py Outdated
@yuqi1129
Copy link
Copy Markdown
Contributor Author

@FANNG1
Can you help to review this one?

@FANNG1
Copy link
Copy Markdown
Contributor

FANNG1 commented May 15, 2026

Thanks for adding the matrix runners and documenting how to re-verify the Lance compatibility ranges. I think the CI strategy here needs to be called out explicitly before we merge this.

My concern is that the full compatibility matrix should probably not become part of the default PR validation path. These tests are expensive, depend on external package ecosystems, and can become noisy when PyPI / Maven resolution changes upstream. That makes them a poor fit for the main per-PR signal.

What seems more sustainable is a two-layer approach:

  1. Keep one pinned, known-good Lance combination in the normal CI path as the regression guard.
  2. Keep the full multi-version matrix as a separate automation path: scheduled workflow, release-time validation, and manual dispatch when we need to re-verify or extend the compatibility claims.

That would still keep the matrix exercised continuously enough to avoid script rot, but without making every PR pay the cost or absorb upstream ecosystem flakiness. If we go this direction, I also think the matrix runner should avoid floating dependencies by default (for example, not resolving lance-namespace as unpinned latest).

Can we align on that execution model in this PR? Right now the tasks and docs are being added, but the intended automation boundary is still unclear.

@yuqi1129
Copy link
Copy Markdown
Contributor Author

@FANNG1

Thanks for your reply.

By default, we will only run one version of lance-spark and lance-ray to verify whether it works fine. As you point out, there are several versions of lance-spark and lance-ray, and it aims to provide a full test in future upgrades and in a possible CI environment. Is there acceptable for us?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Improvement] Upgrade LanceDB dependency in Gravitino

3 participants