[#11232][#11233] improvement(core): Global EntityChangeLogPoller with CatalogManager cache invalidation and automatic retention cleanup #11254

Open
yuqi1129 wants to merge 11 commits into apache:main from yuqi1129:improvement/global-entity-change-log-poller

@yuqi1129
Contributor

What changes were proposed in this pull request?

  1. Extract entity change log polling into a shared EntityChangeLogPoller in core with a listener-based dispatch pattern, replacing the per-consumer polling in JcasbinChangePoller.
  2. Add CatalogChangeLogListener that invalidates CatalogManager's local catalog cache when catalog ALTER/RENAME/DROP changes are consumed, providing eventual consistency across HA nodes.
  3. Add automatic retention cleanup for entity_change_log — expired rows are pruned periodically (default: 1-hour interval, 1-day retention), with configurable settings.
  4. Write change log on all catalog updates (not just renames) so that property-only changes also trigger cross-node cache invalidation.
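
The listener-based dispatch in item 1 can be pictured with a minimal sketch. Everything here is hypothetical and written only for illustration: `ChangeRecord`, `ChangeListener`, `ChangeSource`, and `PollerSketch` are not the PR's actual types, the row shape is a guess, and the choice to keep advancing the cursor when a listener throws is an assumption rather than the behaviour the PR defines.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical row shape; the real entity_change_log columns may differ. */
final class ChangeRecord {
  final long id;
  final String entityType;
  final String fullName;
  final String operation;

  ChangeRecord(long id, String entityType, String fullName, String operation) {
    this.id = id;
    this.entityType = entityType;
    this.fullName = fullName;
    this.operation = operation;
  }
}

/** Hypothetical listener contract: receives every polled batch. */
interface ChangeListener {
  void onChanges(List<ChangeRecord> batch);
}

/** Hypothetical stand-in for reading entity_change_log rows with id > cursor. */
interface ChangeSource {
  List<ChangeRecord> fetchAfter(long cursor, int limit);
}

class PollerSketch {
  private final ChangeSource source;
  private final List<ChangeListener> listeners = new CopyOnWriteArrayList<>();
  private long cursor = 0L;

  PollerSketch(ChangeSource source) {
    this.source = source;
  }

  void register(ChangeListener listener) {
    listeners.add(listener);
  }

  void unregister(ChangeListener listener) {
    listeners.remove(listener);
  }

  long cursor() {
    return cursor;
  }

  /** One poll cycle: read once, fan out to every listener, then advance the cursor. */
  void pollOnce() {
    List<ChangeRecord> batch = source.fetchAfter(cursor, 100);
    if (batch.isEmpty()) {
      return;
    }
    for (ChangeListener listener : listeners) {
      try {
        listener.onChanges(batch);
      } catch (RuntimeException e) {
        // Assumption: one failing listener neither blocks the others nor stalls the cursor.
      }
    }
    cursor = batch.get(batch.size() - 1).id;
  }
}
```

The point of the shape is that the table is read once per cycle regardless of how many consumers are registered, which is what replacing per-consumer polling buys.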

Why are the changes needed?

In a multi-node deployment, catalog changes on one server do not invalidate the CatalogManager cache on peer servers until local TTL eviction. This can leave peers using stale catalog entities or class loaders. Additionally, entity_change_log grows unboundedly without automatic cleanup.

Fix: #11232
Fix: #11233

Does this PR introduce any user-facing change?

Yes — three new configuration keys:

  • gravitino.entityChangeLog.pollIntervalSecs (default 3)
  • gravitino.entityChangeLog.retentionSecs (default 86400)
  • gravitino.entityChangeLog.cleanupIntervalSecs (default 3600)
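
As a rough illustration of how the retention and cleanup-interval settings might interact, the sketch below prunes rows older than `now - retentionSecs`, but only when a cleanup is due. `RetentionSketch` and `maybePrune` are hypothetical names, the in-memory list of creation timestamps stands in for the table, and running the first cleanup immediately is an assumption, not something the PR description states.

```java
import java.util.List;

class RetentionSketch {
  private final long retentionSecs;
  private final long cleanupIntervalSecs;
  // Assumption: the first call is always allowed to clean up.
  private long nextCleanupAtSecs = Long.MIN_VALUE;

  RetentionSketch(long retentionSecs, long cleanupIntervalSecs) {
    this.retentionSecs = retentionSecs;
    this.cleanupIntervalSecs = cleanupIntervalSecs;
  }

  /**
   * Removes creation timestamps older than (nowSecs - retentionSecs) when a cleanup is due.
   * Returns the number of rows removed, or 0 when the interval gate is still closed.
   */
  int maybePrune(List<Long> createdAtSecs, long nowSecs) {
    if (nowSecs < nextCleanupAtSecs) {
      return 0; // gated: the last cleanup was less than cleanupIntervalSecs ago
    }
    nextCleanupAtSecs = nowSecs + cleanupIntervalSecs;
    long cutoff = nowSecs - retentionSecs;
    int before = createdAtSecs.size();
    createdAtSecs.removeIf(ts -> ts < cutoff);
    return before - createdAtSecs.size();
  }
}
```

With the defaults listed above (86400 and 3600), a row survives for at least one day and is removed by the first cleanup pass that runs after it expires, so it can linger for up to roughly one extra hour.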

How was this patch tested?

  • TestEntityChangeLogPoller — unit tests for dispatch, cursor advancement, listener failure semantics, retention pruning, and cleanup interval gating.
  • TestCatalogManager.testCatalogChangeLogListenerInvalidatesCatalogCache — verifies cache invalidation on entity change.
  • TestEntityChangeLogService — extended to cover ALTER (non-rename) change log writing.
  • TestJcasbinChangePoller — updated for listener-based entity change delivery.

…gPoller with CatalogManager cache invalidation and automatic retention cleanup

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 27, 2026 07:24
Contributor

Copilot AI left a comment

Pull request overview

This PR centralizes entity_change_log consumption into a single process-wide poller in core, adds listener-driven cache invalidation for CatalogManager in HA deployments, and introduces periodic retention cleanup so entity_change_log doesn’t grow unbounded.

Changes:

  • Introduces a global EntityChangeLogPoller with listener registration and periodic retention pruning.
  • Adds CatalogChangeLogListener and wires it (and JcasbinChangePoller) into GravitinoEnv / JcasbinAuthorizer for cross-node cache invalidation.
  • Extends catalog update flow to write change-log entries on all catalog updates (not only renames) and adds new configuration keys for poll/retention/cleanup intervals.

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| server-common/src/test/java/org/apache/gravitino/server/authorization/jcasbin/TestJcasbinChangePoller.java | Updates tests to drive entity-change handling via listener callback rather than DB polling. |
| server-common/src/main/java/org/apache/gravitino/server/authorization/jcasbin/JcasbinChangePoller.java | Converts entity-change handling into an EntityChangeLogListener callback; keeps owner polling local. |
| server-common/src/main/java/org/apache/gravitino/server/authorization/jcasbin/JcasbinAuthorizer.java | Registers/unregisters the authorizer’s poller as a global change-log listener when available. |
| server-common/src/main/java/org/apache/gravitino/server/authorization/jcasbin/JcasbinAuthorizationLookups.java | Refactors owner lookup loader path and alters negative-caching behavior for missing owners. |
| core/src/test/java/org/apache/gravitino/storage/relational/TestEntityChangeLogPoller.java | Adds unit tests for global poller dispatch, cursor advancement, and retention pruning gating. |
| core/src/test/java/org/apache/gravitino/storage/relational/service/TestEntityChangeLogService.java | Extends service tests to verify ALTER (non-rename) emits an entity change log record. |
| core/src/test/java/org/apache/gravitino/catalog/TestCatalogManager.java | Adds a unit test asserting catalog cache invalidation on consumed catalog change records. |
| core/src/main/java/org/apache/gravitino/storage/relational/service/CatalogMetaService.java | Writes catalog change-log entries on any successful catalog update (not only rename). |
| core/src/main/java/org/apache/gravitino/storage/relational/EntityChangeLogPoller.java | New global poller implementation with listener dispatch and periodic retention cleanup. |
| core/src/main/java/org/apache/gravitino/storage/relational/EntityChangeLogListener.java | New listener interface for batches consumed from entity_change_log. |
| core/src/main/java/org/apache/gravitino/GravitinoEnv.java | Creates/starts/stops the global poller for relational stores and registers catalog listener. |
| core/src/main/java/org/apache/gravitino/Configs.java | Adds new configuration keys and defaults for poll interval, retention, and cleanup interval. |
| core/src/main/java/org/apache/gravitino/catalog/CatalogChangeLogListener.java | New listener that invalidates CatalogManager’s local catalog cache on catalog changes. |

yuqi1129 and others added 2 commits May 27, 2026 15:49
…alogChangeLogListener

When CatalogManager alters/drops a catalog locally, it already updates the
cache correctly. The EntityChangeLogPoller would later pick up the same
change and redundantly invalidate the fresh cache entry, closing the
IsolatedClassLoader and causing NoClassDefFoundError on the next operation.

Track local mutation counts per catalog identifier so the listener can
distinguish local changes (skip) from remote HA changes (invalidate).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
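
The local-versus-remote distinction described in the commit message above could be tracked along these lines. This is a sketch under stated assumptions: `LocalMutationTracker`, `recordLocalMutation`, and `consumeIfLocal` are hypothetical names, and the catalog identifier is modelled as a plain string rather than whatever identifier type the PR uses.

```java
import java.util.concurrent.ConcurrentHashMap;

class LocalMutationTracker {
  // Pending local mutations per catalog identifier (hypothetical string key).
  private final ConcurrentHashMap<String, Integer> pending = new ConcurrentHashMap<>();

  /** Called when this node alters or drops a catalog itself. */
  void recordLocalMutation(String catalogIdent) {
    pending.merge(catalogIdent, 1, Integer::sum);
  }

  /**
   * Called by the listener for each consumed change record. Returns true when the record
   * matches a pending local mutation (skip invalidation), false when it came from a peer.
   */
  boolean consumeIfLocal(String catalogIdent) {
    boolean[] local = {false};
    pending.computeIfPresent(catalogIdent, (key, count) -> {
      local[0] = true;
      // Returning null removes the entry once the last pending mutation is consumed.
      return count > 1 ? count - 1 : null;
    });
    return local[0];
  }
}
```

A count rather than a boolean flag matters when the same catalog is altered several times locally before the poller catches up: each change-log record then consumes exactly one pending mutation.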
@github-actions
github-actions Bot commented May 27, 2026

Code Coverage Report

Overall Project 66.65% +0.2% 🟢
Files changed 74.53% 🟢

| Module | Coverage |
| --- | --- |
| aliyun | 1.72% 🔴 |
| api | 46.82% 🟢 |
| authorization-common | 85.96% 🟢 |
| aws | 3.66% 🔴 |
| azure | 2.47% 🔴 |
| catalog-common | 10.04% 🔴 |
| catalog-fileset | 80.33% 🟢 |
| catalog-glue | 66.08% 🟢 |
| catalog-hive | 79.55% 🟢 |
| catalog-jdbc-clickhouse | 80.02% 🟢 |
| catalog-jdbc-common | 45.31% 🟢 |
| catalog-jdbc-doris | 80.28% 🟢 |
| catalog-jdbc-hologres | 54.03% 🟢 |
| catalog-jdbc-mysql | 79.23% 🟢 |
| catalog-jdbc-oceanbase | 78.38% 🟢 |
| catalog-jdbc-postgresql | 82.29% 🟢 |
| catalog-jdbc-starrocks | 78.51% 🟢 |
| catalog-kafka | 77.01% 🟢 |
| catalog-lakehouse-generic | 44.89% 🟢 |
| catalog-lakehouse-hudi | 79.1% 🟢 |
| catalog-lakehouse-iceberg | 85.65% 🟢 |
| catalog-lakehouse-paimon | 79.29% 🟢 |
| catalog-model | 77.72% 🟢 |
| cli | 44.51% 🟢 |
| client-java | 77.91% 🟢 |
| common | 49.99% 🟢 |
| core | 82.33% -0.71% 🟢 |
| filesystem-hadoop3 | 76.97% 🟢 |
| flink | 0.0% 🔴 |
| flink-common | 41.2% 🟢 |
| flink-runtime | 0.0% 🔴 |
| gcp | 14.12% 🔴 |
| hadoop-common | 10.39% 🔴 |
| hive-metastore-common | 53.26% 🟢 |
| iceberg-common | 54.98% +10.71% 🟢 |
| iceberg-rest-server | 70.98% 🟢 |
| idp-basic | 85.99% 🟢 |
| integration-test-common | 0.0% 🔴 |
| jobs | 66.17% 🟢 |
| lance-common | 20.83% 🔴 |
| lance-rest-server | 60.27% 🟢 |
| lineage | 53.02% 🟢 |
| optimizer | 82.87% 🟢 |
| optimizer-api | 21.95% 🔴 |
| server | 85.87% 🟢 |
| server-common | 73.1% +1.71% 🟢 |
| spark | 32.79% 🔴 |
| spark-common | 39.75% 🔴 |
| trino-connector | 39.44% 🔴 |
Files

| Module | File | Coverage |
| --- | --- | --- |
| core | CatalogMetaService.java | 98.88% 🟢 |
| core | Configs.java | 98.01% 🟢 |
| core | CatalogChangeLogListener.java | 81.82% 🟢 |
| core | CatalogManager.java | 67.59% 🟢 |
| core | EntityChangeLogPoller.java | 65.62% 🟢 |
| core | GravitinoEnv.java | 12.7% 🔴 |
| core | EntityChangeLogListener.java | 0.0% 🔴 |
| iceberg-common | IcebergConfig.java | 100.0% 🟢 |
| server-common | JcasbinAuthorizer.java | 83.59% 🟢 |
| server-common | JcasbinAuthorizationLookups.java | 80.0% 🟢 |
| server-common | JcasbinChangePoller.java | 52.14% 🔴 |

yuqi1129 and others added 7 commits May 27, 2026 21:40
Resolve conflicts in JcasbinChangePoller: keep EntityChangeLogListener approach
from this PR (entity changes dispatched by global EntityChangeLogPoller) while
adopting the keyset (updated_at, id) owner cursor from main/apache#11088 which
correctly tracks soft-deletes alongside inserts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
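
For readers unfamiliar with the keyset `(updated_at, id)` cursor mentioned in the commit message above, the comparison can be sketched as follows. `OwnerRow` and `KeysetCursorSketch` are hypothetical, the in-memory list stands in for the owner table, and the sketch assumes that a soft-delete bumps `updated_at` on the affected row, which is what would let such a cursor observe deletes as well as inserts.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical owner-relation row; only the two cursor columns are modelled. */
final class OwnerRow {
  final long updatedAt;
  final long id;

  OwnerRow(long updatedAt, long id) {
    this.updatedAt = updatedAt;
    this.id = id;
  }
}

class KeysetCursorSketch {
  /** True when the row sorts strictly after the cursor in (updated_at, id) order. */
  static boolean isAfter(OwnerRow row, long cursorUpdatedAt, long cursorId) {
    return row.updatedAt > cursorUpdatedAt
        || (row.updatedAt == cursorUpdatedAt && row.id > cursorId);
  }

  /** Returns up to limit rows after the cursor, ordered by (updated_at, id). */
  static List<OwnerRow> fetchAfter(
      List<OwnerRow> table, long cursorUpdatedAt, long cursorId, int limit) {
    List<OwnerRow> out = new ArrayList<>();
    for (OwnerRow row : table) {
      if (isAfter(row, cursorUpdatedAt, cursorId)) {
        out.add(row);
      }
    }
    out.sort(Comparator.comparingLong((OwnerRow r) -> r.updatedAt).thenComparingLong(r -> r.id));
    return out.size() > limit ? new ArrayList<>(out.subList(0, limit)) : out;
  }
}
```

The `id` tiebreaker keeps rows that share the same `updated_at` value from being skipped or delivered twice when a batch boundary falls between them.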
…n TestJcasbinAuthorizationLookups

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ITINO_HOME

The gravitino.sh.template was updated to include ${GRAVITINO_HOME} in the
PID lookup. The multi-instance test's patch_pid_grep function was asserting the
old pattern still existed and failing. Skip the patch when the script already
contains the home-scoped filter.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
loadOwner now returns Optional.empty() instead of throwing
NoSuchOwnerException, so Caffeine can store the negative result.
Subsequent requests with a fresh AuthorizationRequestContext find the
absent owner in the shared cache and skip the DB query entirely.

Also update the same-context test: putCount is now 1 (absent result IS
stored in the shared cache on first load) and rename the test to reflect
the new behaviour.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
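
The negative-caching change described in the commit message above can be illustrated without Caffeine. In this sketch a `ConcurrentHashMap` stands in for the Caffeine cache (an explicit substitution, not the PR's code), and `OwnerCacheSketch` and its loader are hypothetical. The two behave alike in the respect that matters here: a loader that throws or returns null leaves no entry behind, while a loader that returns `Optional.empty()` produces a storable value.

```java
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

class OwnerCacheSketch {
  // Stand-in for the shared owner cache; keys and values are simplified to strings.
  private final ConcurrentHashMap<String, Optional<String>> cache = new ConcurrentHashMap<>();
  private final Function<String, Optional<String>> loader;

  OwnerCacheSketch(Function<String, Optional<String>> loader) {
    this.loader = loader;
  }

  /** Loads the owner at most once per key, including the "no owner" answer. */
  Optional<String> owner(String metadataObjectId) {
    return cache.computeIfAbsent(metadataObjectId, loader);
  }
}
```

Because the absent result is itself cached, it needs the same invalidation path as a present one; otherwise a newly assigned owner would stay invisible until the entry expires.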
@yuqi1129 yuqi1129 force-pushed the improvement/global-entity-change-log-poller branch from ce88587 to e333894 Compare May 29, 2026 12:39
@yuqi1129 yuqi1129 force-pushed the improvement/global-entity-change-log-poller branch from e333894 to 9c66d11 Compare May 29, 2026 12:52