feat: implement fast FQN prefix-based bulk deletion to fix race conditions and slow cascade #27108
mohityadav766 wants to merge 30 commits into main from
Conversation
Pull request overview
This PR introduces a faster, prefix-based hard deletion flow aimed at preventing orphaned relationships when ingestion and cascading deletes run concurrently (particularly for large service hierarchies). It does so by adding FQN-hash metadata to relationships, providing a new API endpoint for prefix deletion, and implementing a server-side bulk deletion orchestrator.
Changes:
- Added `/prefix/{id}` hard-delete endpoint for `DatabaseService` that triggers an async prefix-based deletion job.
- Added `fromFQNHash`/`toFQNHash` columns + indexes to `entity_relationship` and a v1.14.1 migration utility to backfill existing rows.
- Introduced `PrefixDeletionService` plus supporting DAO/repository changes for prefix ID discovery and batch deletes.
Reviewed changes
Copilot reviewed 18 out of 18 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| openmetadata-service/src/main/java/org/openmetadata/service/resources/services/database/DatabaseServiceResource.java | Adds new REST endpoint to start prefix-based hard delete for DatabaseService. |
| openmetadata-service/src/main/java/org/openmetadata/service/resources/EntityResource.java | Adds async orchestration method deletePrefixHardById calling PrefixDeletionService. |
| openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/PrefixDeletionService.java | New service coordinating bulk prefix-based deletion across tables. |
| openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/EntityDAO.java | Adds DAO helpers for finding IDs by FQN-hash prefix and batch deletes. |
| openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/EntityRepository.java | Extends relationship insertion to persist relationship FQN hashes; adds hook helpers used by prefix deletion. |
| openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/CollectionDAO.java | Extends entity_relationship inserts to include new hash columns; adds delete-by-prefix helpers and thread-entity batch lookup. |
| openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/EntityTimeSeriesDAO.java | Adds delete-by-FQN-hash-prefix helper for time-series tables. |
| openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/FeedRepository.java | Adds batch delete of threads by about-entity IDs (used by prefix deletion). |
| openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/TableRepository.java | Updates CONTAINS relationship insertion to populate relationship FQN hashes. |
| openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/DatabaseSchemaRepository.java | Updates CONTAINS relationship insertion to populate relationship FQN hashes. |
| openmetadata-service/src/main/java/org/openmetadata/service/initialization/LockManagerInitializer.java | Initializes PrefixDeletionService alongside the existing hierarchical lock manager. |
| openmetadata-service/src/main/java/org/openmetadata/service/migration/utils/v1141/MigrationUtil.java | Adds backfill logic for populating new relationship hash columns for existing rows. |
| openmetadata-service/src/main/java/org/openmetadata/service/migration/mysql/v1141/Migration.java | Runs v1141 backfill for MySQL migrations. |
| openmetadata-service/src/main/java/org/openmetadata/service/migration/postgres/v1141/Migration.java | Runs v1141 backfill for Postgres migrations. |
| bootstrap/sql/schema/mysql.sql | Adds fromFQNHash/toFQNHash + indexes to base MySQL schema. |
| bootstrap/sql/schema/postgres.sql | Adds fromfqnhash/tofqnhash + indexes to base Postgres schema. |
| bootstrap/sql/migrations/native/1.14.1/mysql/schemaChanges.sql | Adds columns + indexes to entity_relationship for MySQL migrations. |
| bootstrap/sql/migrations/native/1.14.1/postgres/schemaChanges.sql | Adds columns + indexes to entity_relationship for Postgres migrations. |
```java
ExecutorService executorService = AsyncService.getInstance().getExecutorService();
executorService.submit(
    RequestLatencyContext.wrapWithContext(
        () -> {
          try {
            PrefixDeletionService.getInstance().deletePrefixHard(entity, userName);
```
PrefixDeletionService.getInstance() can return null if the lock manager initializer didn’t run (or CollectionDAO was unavailable), which will NPE here. Please either make getInstance() lazily initialize a no-lock instance, or guard here and return a clear 503/500 explaining prefix deletion isn’t available until initialization completes.
Suggested change:

```diff
-ExecutorService executorService = AsyncService.getInstance().getExecutorService();
-executorService.submit(
-    RequestLatencyContext.wrapWithContext(
-        () -> {
-          try {
-            PrefixDeletionService.getInstance().deletePrefixHard(entity, userName);
+PrefixDeletionService prefixDeletionService = PrefixDeletionService.getInstance();
+if (prefixDeletionService == null) {
+  return Response.status(Response.Status.SERVICE_UNAVAILABLE)
+      .type(MediaType.TEXT_PLAIN)
+      .entity("Prefix deletion is not available until service initialization completes.")
+      .build();
+}
+ExecutorService executorService = AsyncService.getInstance().getExecutorService();
+executorService.submit(
+    RequestLatencyContext.wrapWithContext(
+        () -> {
+          try {
+            prefixDeletionService.deletePrefixHard(entity, userName);
```
```java
private void deleteDependencyTables(String rootFqn, String fqnHashPrefix, List<String> allIds) {
  CollectionDAO dao = Entity.getCollectionDAO();
  dao.relationshipDAO().deleteAllByFqnHashPrefix(fqnHashPrefix);
  dao.fieldRelationshipDAO().deleteAllByPrefix(rootFqn);
  dao.entityExtensionDAO().deleteAllBatch(allIds);
  dao.tagUsageDAO().deleteTagLabelsByTargetPrefix(rootFqn);
  dao.usageDAO().deleteBatch(allIds);
  List<UUID> allUuids = allIds.stream().map(UUID::fromString).toList();
  Entity.getFeedRepository().deleteByAboutBatch(allUuids);
```
This code uses deleteAllByFqnHashPrefix(...) as the sole cleanup for entity_relationship. Any relationship rows with NULL fromFQNHash/toFQNHash (e.g., inserted via existing addRelationship(...) overloads that don’t pass FQNs) will not be deleted, leaving orphaned relationships. Consider adding a fallback delete by descendant IDs (batch delete) or ensuring all relationship inserts always populate the hash columns.
```java
private void deleteEntityTables(
    Map<String, List<UUID>> descendantsByType, EntityInterface rootEntity) {
  for (Map.Entry<String, List<UUID>> entry : descendantsByType.entrySet()) {
    String entityType = entry.getKey();
    List<String> ids = entry.getValue().stream().map(UUID::toString).toList();
    try {
      Entity.getEntityRepository(entityType).getDao().deleteBatch(ids);
    } catch (Exception e) {
      LOG.warn(
          "Failed to delete {} entities of type {}: {}", ids.size(), entityType, e.getMessage());
    }
  }
```
deleteBatch(ids) is implemented as a single WHERE id IN (<ids>) statement. For large services this can exceed DB parameter limits / max packet size and become the new bottleneck or fail outright. Please chunk batch deletes (e.g., 500/1000 IDs per statement) similarly to RelationshipDAO.batchDeleteRelationships(...), for both entity tables and dependent tables that delete by IN (<ids>).
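A minimal sketch of the chunking suggested above. The `BatchChunker` class and its method names are hypothetical (not part of the PR); it only illustrates splitting a large id list into bounded sublists so each `WHERE id IN (...)` stays under parameter limits.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper illustrating chunked batch deletes; names are not from the PR.
public class BatchChunker {

  /** Splits ids into sublists of at most batchSize elements each. */
  public static List<List<String>> chunk(List<String> ids, int batchSize) {
    List<List<String>> batches = new ArrayList<>();
    for (int i = 0; i < ids.size(); i += batchSize) {
      batches.add(ids.subList(i, Math.min(i + batchSize, ids.size())));
    }
    return batches;
  }

  /** Issues one bounded DELETE ... WHERE id IN (batch) per chunk instead of one huge IN-list. */
  public static void deleteInChunks(List<String> ids, Consumer<List<String>> deleteBatch) {
    for (List<String> batch : chunk(ids, 1000)) {
      deleteBatch.accept(batch); // e.g. dao.deleteBatch(batch) in the real code
    }
  }
}
```

The same helper could back every IN-list delete in `deleteDependencyTables` so all tables share one chunk size.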
```java
    Entity.getEntityRepository(entityType).deleteTimeSeriesByFqnPrefix(fqnHashPrefix);
  } catch (Exception e) {
    LOG.debug("deleteTimeSeriesByFqnPrefix failed for type {}: {}", entityType, e.getMessage());
  }
}
}
```
deleteTimeSeriesByFqnPrefix(...) is a no-op in EntityRepository, and there are currently no repository overrides, so this loop won’t delete any time-series rows (e.g., profiler/data quality/query cost/workflow time-series). Prefix hard-delete will leave orphaned time-series data. Implement overrides in the repositories that own time-series tables to call the appropriate EntityTimeSeriesDAO.deleteByFqnHashPrefix(...) (or equivalent).
Suggested change:

```diff
-      Entity.getEntityRepository(entityType).deleteTimeSeriesByFqnPrefix(fqnHashPrefix);
-    } catch (Exception e) {
-      LOG.debug("deleteTimeSeriesByFqnPrefix failed for type {}: {}", entityType, e.getMessage());
-    }
-  }
-}
+      EntityRepository<?> repository = Entity.getEntityRepository(entityType);
+      if (hasTimeSeriesPrefixDeletionOverride(repository)) {
+        repository.deleteTimeSeriesByFqnPrefix(fqnHashPrefix);
+      } else {
+        LOG.warn(
+            "Repository {} for entity type {} does not override deleteTimeSeriesByFqnPrefix; "
+                + "prefix hard delete may leave orphaned time-series data",
+            repository.getClass().getName(),
+            entityType);
+      }
+    } catch (Exception e) {
+      LOG.debug("deleteTimeSeriesByFqnPrefix failed for type {}: {}", entityType, e.getMessage());
+    }
+  }
+}
+
+private boolean hasTimeSeriesPrefixDeletionOverride(EntityRepository<?> repository) {
+  try {
+    return repository
+            .getClass()
+            .getMethod("deleteTimeSeriesByFqnPrefix", String.class)
+            .getDeclaringClass()
+        != EntityRepository.class;
+  } catch (NoSuchMethodException e) {
+    return false;
+  }
+}
```
```java
/**
 * Orchestrates fast prefix-based hard deletion of an entity and all its descendants.
 *
 * <p>Uses 7 phases to atomically delete an entire subtree by FQN prefix rather than walking the
 * entity tree one-by-one. This is orders of magnitude faster for large hierarchies and eliminates
 * the race condition where concurrent ingestion creates orphaned entities during slow cascade
 * deletion.
 */
```
The class-level Javadoc says this deletion is "atomically" deleting the subtree, but the implementation isn’t wrapped in a single transaction and also catches/logs exceptions while continuing. This can leave the DB in a partially-deleted state while still sending a "complete" notification. Either adjust the documentation/notification semantics to reflect best-effort behavior, or introduce a transactional boundary and fail-fast semantics where feasible.
```java
public void deleteByAboutBatch(List<UUID> entityIds) {
  if (nullOrEmpty(entityIds)) {
    return;
  }
  List<String> ids = entityIds.stream().map(UUID::toString).toList();
  List<String> threadIds = listOrEmpty(dao.feedDAO().findByEntityIds(ids));
  for (String threadId : threadIds) {
    try {
      deleteThreadInternal(UUID.fromString(threadId));
    } catch (Exception ex) {
      // Continue deletion
    }
  }
```
deleteByAboutBatch silently swallows exceptions per-thread. This makes it very hard to diagnose partial deletions during prefix hard-delete and can leave threads/relationships behind without any signal. At minimum log at DEBUG/WARN with the threadId/exception, or aggregate failures and report them back to the caller/job notification.
```java
/**
 * Hard-deletes the given root entity and all descendants whose FQN starts with the root's FQN.
 * Works at any hierarchy level (service, database, schema, etc.).
 */
public void deletePrefixHard(EntityInterface rootEntity, String deletedBy) {
  String rootFqn = rootEntity.getFullyQualifiedName();
  String fqnHashPrefix = FullyQualifiedName.buildHash(rootFqn);
  DeletionLock lock = acquireLock(rootEntity, deletedBy);
  try {
    Map<String, List<UUID>> descendantsByType = collectDescendantIds(fqnHashPrefix);
    List<String> allIds = buildAllIds(rootEntity.getId(), descendantsByType);
    deleteDependencyTables(rootFqn, fqnHashPrefix, allIds);
    runEntityHooks(rootEntity, deletedBy, fqnHashPrefix);
    deleteEntityTables(descendantsByType, rootEntity);
    emitDeleteEvent(rootEntity);
  } finally {
    releaseLock(lock, rootEntity);
  }
}
```
This introduces a new deletion code path (prefix hard-delete) with non-trivial DB side effects across many tables, but there are no unit/integration tests validating correctness (e.g., no orphan rows in entity_relationship/field_relationship/time-series, and behavior vs existing recursive hard delete). Please add focused tests for PrefixDeletionService (mock DAO + verification) and at least one integration test covering database service subtree deletion.
```java
private DeletionLock acquireLock(EntityInterface entity, String deletedBy) {
  if (lockManager == null) {
    return null;
  }
  try {
    return lockManager.acquireDeletionLock(entity, deletedBy, true);
  } catch (Exception e) {
    LOG.warn(
        "Could not acquire deletion lock for {}: {}",
        entity.getFullyQualifiedName(),
        e.getMessage());
    return null;
  }
}
```
🚨 Bug: Lock failure silently ignored — deletion proceeds unprotected
The acquireLock method (lines 84-97) catches all exceptions (including EntityLockedException thrown on contention) and returns null. The caller deletePrefixHard (line 71) never checks whether the lock was actually acquired — it proceeds with all 7 deletion phases regardless. This completely defeats the stated goal of eliminating race conditions with concurrent ingestion.
If the lock cannot be acquired (e.g., another deletion is in progress, or the lock manager is unavailable), the deletion runs without protection, allowing ingestion to create new entities in the subtree while it's being deleted — exactly the orphan scenario this PR aims to fix.
Suggested fix:
```java
private DeletionLock acquireLock(EntityInterface entity, String deletedBy) {
  if (lockManager == null) {
    throw new IllegalStateException("Lock manager not initialized");
  }
  // Let EntityLockedException propagate to the caller
  return lockManager.acquireDeletionLock(entity, deletedBy, true);
}
```
🔴 Playwright Results — 1 failure(s), 30 flaky
✅ 3593 passed · ❌ 1 failed · 🟡 30 flaky · ⏭️ 207 skipped
Genuine Failures (failed on all attempts) ❌
The Java checkstyle failed. Please run … You can install the pre-commit hooks with …
Pull request overview
Copilot reviewed 22 out of 22 changed files in this pull request and generated 11 comments.
Comments suppressed due to low confidence (1)
bootstrap/sql/migrations/native/1.13.0/mysql/schemaChanges.sql:142
- These schema changes are being appended to the existing 1.13.0 migration. If a database was already upgraded past 1.13.0, the migration framework will not re-run 1.13.0, so the new columns/indexes will never be applied.
To ensure upgrades work correctly, the DDL should be added to the current/new migration version (e.g. 1.14.1) rather than modifying an older migration script.
```sql
-- Add FQN hash columns to entity_relationship to enable fast prefix-based bulk deletion.
-- This allows deleting all relationships for an entire entity subtree in a single indexed query
-- instead of walking the tree entity-by-entity.
ALTER TABLE entity_relationship
ADD COLUMN IF NOT EXISTS fromFQNHash VARCHAR(768) DEFAULT NULL,
ADD COLUMN IF NOT EXISTS toFQNHash VARCHAR(768) DEFAULT NULL;
CREATE INDEX IF NOT EXISTS idx_er_from_fqn_hash ON entity_relationship (fromFQNHash(768));
CREATE INDEX IF NOT EXISTS idx_er_to_fqn_hash ON entity_relationship (toFQNHash(768));
```
```java
private void deleteDependencyTables(String rootFqn, String fqnHashPrefix, List<String> allIds) {
  CollectionDAO dao = Entity.getCollectionDAO();
  dao.relationshipDAO().deleteAllByFqnHashPrefix(fqnHashPrefix);
  dao.fieldRelationshipDAO().deleteAllByPrefix(rootFqn);
  dao.entityExtensionDAO().deleteAllBatch(allIds);
  dao.tagUsageDAO().deleteTagLabelsByTargetPrefix(rootFqn);
  dao.usageDAO().deleteBatch(allIds);
  List<UUID> allUuids = allIds.stream().map(UUID::fromString).toList();
  Entity.getFeedRepository().deleteByAboutBatch(allUuids);
}
```
This deletion path builds allIds for the entire subtree and then issues multiple ... WHERE id IN (<ids>) calls (entity_extension, entity_usage, and thread lookup via deleteByAboutBatch). For large services (the stated 1M+ tables use case), these unchunked IN-lists can exceed database parameter limits and/or cause very large allocations.
Recommend chunking allIds consistently (similar to batchDeleteRelationships), or adding DAO methods that delete by FQN hash prefix directly to avoid materializing/passing millions of IDs through the application.
```java
@Transaction
public void deleteByAboutBatch(List<UUID> entityIds) {
  if (nullOrEmpty(entityIds)) {
    return;
  }
  List<String> ids = entityIds.stream().map(UUID::toString).toList();
  List<String> threadIds = listOrEmpty(dao.feedDAO().findByEntityIds(ids));
  for (String threadId : threadIds) {
    try {
      deleteThreadInternal(UUID.fromString(threadId));
    } catch (Exception ex) {
      // Continue deletion
    }
  }
}
```
deleteByAboutBatch still performs per-thread deletes in a loop and first fetches thread IDs with a single IN (<entityIds>) query. With prefix deletion, entityIds can be extremely large, which can exceed DB bind limits and makes this method a bottleneck.
Consider chunking entityIds for findByEntityIds, and then using the existing deleteThreadsInBatch(...) to delete threads/relationships in bulk instead of calling deleteThreadInternal(...) repeatedly.
```java
private static void backfillForEntityType(Handle handle, String entityType) {
  EntityRepository<?> repo = Entity.getEntityRepository(entityType);
  String hashCol = repo.getDao().getNameHashColumn();
  if (!FQNHASH_COL.equals(hashCol) && !NAMEHASH_COL.equals(hashCol)) {
    return;
  }
  String tableName = repo.getDao().getTableName();
  int offset = 0;
  int processed;
  do {
    processed = processEntityBatch(handle, tableName, hashCol, offset);
    offset += processed;
  } while (processed == BATCH_SIZE);
  LOG.info("Backfilled FQN hashes for entity type {}: {} rows", entityType, offset);
}

private static int processEntityBatch(Handle handle, String tableName, String hashCol, int offset) {
  String sql =
      "SELECT id, "
          + hashCol
          + " FROM "
          + tableName
          + " WHERE "
          + hashCol
          + " IS NOT NULL LIMIT :limit OFFSET :offset";
  List<Map<String, Object>> rows =
      handle.createQuery(sql).bind("limit", BATCH_SIZE).bind("offset", offset).mapToMap().list();
  for (Map<String, Object> row : rows) {
    backfillRow(handle, row, hashCol);
  }
  return rows.size();
```
This migration utility uses LIMIT/OFFSET without an ORDER BY. That makes pagination non-deterministic (rows can be skipped/duplicated) and OFFSET becomes very slow on large tables.
For a backfill intended to handle very large installations, prefer a deterministic keyset pagination (e.g., WHERE id > :lastId ORDER BY id LIMIT :limit) and consider restricting to entities that actually have fqnHash (since prefix deletion is based on FQN hashes).
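The keyset pagination recommended above can be sketched as follows. The real migration would run `WHERE id > :lastId ORDER BY id LIMIT :limit` through Jdbi; here the query is simulated over an in-memory sorted id list so the cursor logic is visible on its own. Class and method names are illustrative, not from the PR.

```java
import java.util.List;

// Sketch of keyset (cursor) pagination: the cursor is the last id seen, so
// pages are deterministic and no OFFSET scan is needed. Names are hypothetical.
public class KeysetBackfill {

  /** Returns the next page: ids strictly greater than lastId, in order, up to limit. */
  static List<String> nextPage(List<String> sortedIds, String lastId, int limit) {
    return sortedIds.stream()
        .filter(id -> lastId == null || id.compareTo(lastId) > 0)
        .limit(limit)
        .toList();
  }

  /** Walks all pages; rows never shift under the cursor the way they can with OFFSET. */
  static int backfillAll(List<String> sortedIds, int limit) {
    int processed = 0;
    String lastId = null;
    List<String> page;
    while (!(page = nextPage(sortedIds, lastId, limit)).isEmpty()) {
      processed += page.size(); // backfillRow(...) would run here per row
      lastId = page.get(page.size() - 1);
    }
    return processed;
  }
}
```

In SQL terms each page is `SELECT id, <hashCol> FROM <table> WHERE <hashCol> IS NOT NULL AND id > :lastId ORDER BY id LIMIT :limit`, with `:lastId` taken from the previous page's final row.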
```sql
-- Add FQN hash columns to entity_relationship to enable fast prefix-based bulk deletion.
-- This allows deleting all relationships for an entire entity subtree in a single indexed query
-- instead of walking the tree entity-by-entity.
ALTER TABLE entity_relationship
ADD COLUMN IF NOT EXISTS fromFQNHash VARCHAR(768) DEFAULT NULL,
ADD COLUMN IF NOT EXISTS toFQNHash VARCHAR(768) DEFAULT NULL;

CREATE INDEX IF NOT EXISTS idx_er_from_fqn_hash ON entity_relationship (fromFQNHash);
CREATE INDEX IF NOT EXISTS idx_er_to_fqn_hash ON entity_relationship (toFQNHash);
```
These schema changes are being appended to the existing 1.13.0 migration. If a database was already upgraded past 1.13.0, the migration framework will not re-run 1.13.0, so the new columns/indexes will never be applied.
To ensure upgrades work correctly, the DDL should be added to the current/new migration version (e.g. 1.14.1) rather than modifying an older migration script.
Suggested change:

```diff
--- Add FQN hash columns to entity_relationship to enable fast prefix-based bulk deletion.
--- This allows deleting all relationships for an entire entity subtree in a single indexed query
--- instead of walking the tree entity-by-entity.
-ALTER TABLE entity_relationship
-ADD COLUMN IF NOT EXISTS fromFQNHash VARCHAR(768) DEFAULT NULL,
-ADD COLUMN IF NOT EXISTS toFQNHash VARCHAR(768) DEFAULT NULL;
-CREATE INDEX IF NOT EXISTS idx_er_from_fqn_hash ON entity_relationship (fromFQNHash);
-CREATE INDEX IF NOT EXISTS idx_er_to_fqn_hash ON entity_relationship (toFQNHash);
```
```diff
@@ -0,0 +1 @@
+-- No schema changes in this version.
```
The PR description indicates the new entity_relationship columns/indexes are introduced in 1.14.1, but this 1.14.1 migration is marked as having no schema changes. As written, upgrading from 1.14.0 → 1.14.1 would not apply the required DDL.
Move the ALTER TABLE ... ADD COLUMN fromFQNHash/toFQNHash + index creation into the 1.14.1 schemaChanges files (or whichever version is intended to ship this feature).
```java
public class Migration extends MigrationProcessImpl {

  public Migration(MigrationFile migrationFile) {
    super(migrationFile);
  }
```
The backfill is currently wired into the v1130 migration path, but v1141 (1.14.1) migration has no runDataMigration() and therefore won't backfill fromFQNHash/toFQNHash when upgrading to 1.14.1. Combined with the DDL living in older native migrations, this risks leaving entity_relationship hashes unpopulated on many upgrade paths.
Consider moving the backfill invocation into the v1141 migration (and keeping older migrations immutable).
Suggested change:

```diff
 public class Migration extends MigrationProcessImpl {
+  private final MigrationFile migrationFile;
+
   public Migration(MigrationFile migrationFile) {
     super(migrationFile);
+    this.migrationFile = migrationFile;
   }
+
+  @Override
+  public void runDataMigration() {
+    new org.openmetadata.service.migration.postgres.v1130.Migration(migrationFile)
+        .runDataMigration();
+  }
```
```java
public Migration(MigrationFile migrationFile) {
  super(migrationFile);
}
```
The backfill is currently wired into the v1130 migration path, but v1141 (1.14.1) migration has no runDataMigration() and therefore won't backfill fromFQNHash/toFQNHash when upgrading to 1.14.1. Combined with the DDL living in older native migrations, this risks leaving entity_relationship hashes unpopulated on many upgrade paths.
Consider moving the backfill invocation into the v1141 migration (and keeping older migrations immutable).
Suggested change:

```diff
   }
 }
+
+  @Override
+  public void runDataMigration() {
+    super.runDataMigration();
+    executeSQL(
+        """
+        UPDATE entity_relationship
+        SET
+          fromFQNHash = MD5(fromFQN),
+          toFQNHash = MD5(toFQN)
+        WHERE
+          (fromFQNHash IS NULL OR fromFQNHash = '')
+          OR (toFQNHash IS NULL OR toFQNHash = '')
+        """);
+  }
```
```java
if (lockManager == null) {
  return null;
}
try {
  return lockManager.acquireDeletionLock(entity, deletedBy, true);
} catch (Exception e) {
  LOG.warn(
      "Could not acquire deletion lock for {}: {}",
      entity.getFullyQualifiedName(),
      e.getMessage());
  return null;
```
If the deletion lock cannot be acquired, the code currently logs and proceeds with the bulk delete anyway. That re-opens the ingestion race condition this feature is intended to eliminate (and could still create orphans while the delete is running).
Consider failing the operation when the deletion lock cannot be acquired (or retrying/acquiring in blocking mode), so callers get a clear failure instead of a best-effort delete.
Suggested change:

```diff
-if (lockManager == null) {
-  return null;
-}
-try {
-  return lockManager.acquireDeletionLock(entity, deletedBy, true);
-} catch (Exception e) {
-  LOG.warn(
-      "Could not acquire deletion lock for {}: {}",
-      entity.getFullyQualifiedName(),
-      e.getMessage());
-  return null;
+String entityFqn = entity.getFullyQualifiedName();
+if (lockManager == null) {
+  throw new IllegalStateException(
+      String.format("Cannot hard delete %s because the deletion lock manager is not available", entityFqn));
+}
+try {
+  DeletionLock deletionLock = lockManager.acquireDeletionLock(entity, deletedBy, true);
+  if (deletionLock == null) {
+    throw new IllegalStateException(
+        String.format("Cannot hard delete %s because the deletion lock could not be acquired", entityFqn));
+  }
+  return deletionLock;
+} catch (Exception e) {
+  throw new IllegalStateException(
+      String.format("Cannot hard delete %s because the deletion lock could not be acquired", entityFqn), e);
```
```java
try {
  MigrationUtil.backfillRelationshipFqnHashes(handle);
} catch (Exception e) {
  LOG.error(
      "Failed to backfill FQN hashes in entity_relationship during v1130 migration. "
          + "Fast prefix deletion may not work correctly for pre-existing relationships.",
      e);
}
```
Running the relationship FQN-hash backfill in the v1130 migration means it will not run for environments that are already beyond 1.13.0 (the common case for a 1.14.1 upgrade). This can leave fromFQNHash/toFQNHash NULL and break prefix deletion.
To make upgrades reliable, wire the backfill into the migration version that actually introduces the columns (e.g. v1141) rather than v1130.
Suggested change:

```diff
-try {
-  MigrationUtil.backfillRelationshipFqnHashes(handle);
-} catch (Exception e) {
-  LOG.error(
-      "Failed to backfill FQN hashes in entity_relationship during v1130 migration. "
-          + "Fast prefix deletion may not work correctly for pre-existing relationships.",
-      e);
-}
```
```java
try {
  MigrationUtil.backfillRelationshipFqnHashes(handle);
} catch (Exception e) {
  LOG.error(
      "Failed to backfill FQN hashes in entity_relationship during v1130 migration. "
          + "Fast prefix deletion may not work correctly for pre-existing relationships.",
      e);
}
```
Running the relationship FQN-hash backfill in the v1130 migration means it will not run for environments that are already beyond 1.13.0 (the common case for a 1.14.1 upgrade). This can leave fromFQNHash/toFQNHash NULL and break prefix deletion.
To make upgrades reliable, wire the backfill into the migration version that actually introduces the columns (e.g. v1141) rather than v1130.
Suggested change:

```diff
-try {
-  MigrationUtil.backfillRelationshipFqnHashes(handle);
-} catch (Exception e) {
-  LOG.error(
-      "Failed to backfill FQN hashes in entity_relationship during v1130 migration. "
-          + "Fast prefix deletion may not work correctly for pre-existing relationships.",
-      e);
-}
```
Adds 11 unit tests covering the correlated-subquery backfill logic introduced in MigrationUtil.backfillRelationshipFqnHashes():

- MigrationUtilTest: 7 tests verifying empty entity list no-ops, fqnHash/nameHash entity updates, unrecognised hash column skip, time-series entity exception swallowing, multi-entity failure isolation, and correct CAST SQL generation
- mysql/v1130/MigrationTest: 2 tests verifying runDataMigration delegates to backfillRelationshipFqnHashes and swallows errors
- postgres/v1130/MigrationTest: same 2 tests for Postgres variant

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
PrefixDeletionIT - three integration tests verifying the
DELETE /v1/services/databaseServices/prefix/{id} endpoint:
- single service with 5 tables is fully erased (service, database,
schema, all tables return 404 after deletion)
- service with 2×2 databases/schemas and 12 tables all erased
- empty service (no databases) is deleted cleanly
PrefixDeletionBenchmarkIT - disabled manual benchmark tagged
@benchmark that creates equal hierarchies and times:
- old: DELETE /{id}?hardDelete=true&recursive=true
- new: DELETE /prefix/{id}
Logs speedup factor; configurable table count via -Dtest.scenario=N.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Previously only DatabaseService had DELETE /prefix/{id}. Added the same
endpoint to DatabaseResource and DatabaseSchemaResource so callers can
bulk-hard-delete at any level of the service hierarchy:
DELETE /v1/services/databaseServices/prefix/{id} (existing)
DELETE /v1/databases/prefix/{id} (new)
DELETE /v1/databaseSchemas/prefix/{id} (new)
All three delegate to EntityResource.deletePrefixHardById() which runs
PrefixDeletionService.deletePrefixHard() — same FQN hash prefix logic,
same async execution, same auth checks.
Updated PrefixDeletionIT with two new integration tests that verify the
sibling-isolation guarantee: deleting a database leaves sibling databases
intact, and deleting a schema leaves sibling schemas and their tables intact.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Previously DELETE /prefix/{id} was only wired in DatabaseServiceResource
(and manually added to Database/DatabaseSchema). Adding it to each of the
60+ entity resource subclasses individually would be maintenance-heavy and
error-prone.
JAX-RS inherits endpoint annotations from parent classes when the
subclass method has no annotations of its own, so adding @DELETE
@Path("/prefix/{id}") once to EntityResource.deletePrefixHardById()
exposes the endpoint on every entity resource automatically:
DELETE /v1/databases/prefix/{id}
DELETE /v1/databaseSchemas/prefix/{id}
DELETE /v1/tables/prefix/{id}
DELETE /v1/services/databaseServices/prefix/{id}
DELETE /v1/glossaries/prefix/{id}
... (every EntityResource subclass)
Removed the now-redundant overrides from DatabaseServiceResource,
DatabaseResource, and DatabaseSchemaResource.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The prefix delete endpoint returns 202 immediately and processes the deletion in a background thread. Asserting right after the response would always race against the job and produce false failures.

Replace assertEntityGone() (which asserted synchronously) with awaitGone() backed by Awaitility — polls every 1s for up to 30s until the GET throws, matching the async job lifecycle.

Also separate the mixed descendant ID list into typed lists (dbIds, schemaIds, tableIds) so each is fetched via the correct SDK client.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
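The poll-until-404 shape this commit describes, stripped of the Awaitility dependency, looks roughly like the sketch below; the real test uses Awaitility (await().atMost(...).pollInterval(...)), and the `AwaitGone` class and its method here are hypothetical stand-ins.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.function.Supplier;

// Plain-Java sketch of awaitGone(): keep polling until the check passes
// (e.g. "GET on the deleted entity threw 404") or the deadline expires.
public class AwaitGone {

  /** Polls check every interval until it returns true or timeout elapses. */
  static boolean await(Supplier<Boolean> check, Duration timeout, Duration interval)
      throws InterruptedException {
    Instant deadline = Instant.now().plus(timeout);
    while (Instant.now().isBefore(deadline)) {
      if (check.get()) {
        return true; // entity is gone; async deletion finished
      }
      Thread.sleep(interval.toMillis());
    }
    return false; // deletion did not finish within the timeout
  }
}
```

The key point is that the assertion is time-bounded and repeated, so a 202 handoff followed by a slow background job no longer produces a false failure.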
…rvice is not null
LockManagerInitializer.initialize() was never called, leaving
PrefixDeletionService.getInstance() returning null. Any call to
DELETE /prefix/{id} would immediately NullPointerException in
EntityResource.deletePrefixHardById().
Call LockManagerInitializer.initialize() in OpenMetadataApplication.run()
after Entity.initializeRepositories() so that both CollectionDAO and all
entity repositories are available when the lock manager and
PrefixDeletionService are wired up.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…nflicts
The old code called DatabaseTestFactory.create(ns, ...) and
DatabaseSchemaTestFactory.create(ns, ...) inside loops — both always
produced ns.prefix("db") / ns.prefix("schema"), so the second iteration
hit a 409 ConflictException: Entity already exists.
Fix:
- Add DatabaseTestFactory.createWithName(ns, serviceFqn, baseName) and
DatabaseSchemaTestFactory.createWithName(ns, databaseFqn, baseName) so
callers can provide a distinct base name that gets namespaced.
- Rewrite every loop in PrefixDeletionIT and PrefixDeletionBenchmarkIT
to pass a loop-index-qualified base name ("db0", "db1", "sc00", ...).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…on time
- Default topology changed to 5 × 5 × 400 = 10,000 tables per service (~10,031 total entities), all configurable via -Dtest.databases/schemas/tables
- timeNewDelete now polls with Awaitility until the service is actually gone so the reported elapsed time covers real deletion, not just the 202 handoff
- Log messages improved to show total entity count and seeding time estimate
- Removed unused HashMap/Map imports, replaced with Duration/UUID/Awaitility

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace sequential triple-nested loop with a 3-phase parallel approach:
1. All databases created concurrently
2. All schemas created concurrently (after databases resolve)
3. All tables created concurrently (after schemas resolve)

Default 32 threads (tunable via -Dtest.seedThreads). At ~50ms/REST call and 32 threads, 10k tables seed in ~20s instead of ~8 min sequentially. Also logs per-entity average seed time for diagnostics.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
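The phased approach can be sketched with plain java.util.concurrent. This is a toy version with a tiny 3-database × 2-schema topology (class name, sizes, and the entity "creation" are hypothetical stand-ins for the REST calls); the key point is that invokeAll blocks until every task in a phase resolves, so children are only created after all parents exist:

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class PhasedSeedSketch {
    // Seed a toy hierarchy in two parallel phases (the PR uses three:
    // databases, then schemas, then tables).
    static Set<String> seed(int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        Set<String> created = ConcurrentHashMap.newKeySet();
        try {
            // Phase 1: all databases concurrently.
            runPhase(pool, IntStream.range(0, 3)
                .mapToObj(d -> (Callable<Void>) () -> { created.add("db" + d); return null; })
                .collect(Collectors.toList()));
            // Phase 2: all schemas concurrently, only after every database resolved.
            runPhase(pool, created.stream().flatMap(db ->
                IntStream.range(0, 2).mapToObj(s -> (Callable<Void>) () -> {
                    created.add(db + ".sc" + s); return null; }))
                .collect(Collectors.toList()));
        } finally {
            pool.shutdown();
        }
        return created;
    }

    static void runPhase(ExecutorService pool, List<Callable<Void>> tasks) {
        try {
            for (Future<Void> f : pool.invokeAll(tasks)) f.get(); // surface any task failure
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("seeded " + seed(8).size() + " entities"); // 3 dbs + 6 schemas = 9
    }
}
```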
…greSQL The rest of the codebase uses VARCHAR(36) for UUID columns in PostgreSQL. BindUUID binds UUID values as strings (via UUID.toString()), which works with VARCHAR(36) but fails with native UUID columns because PostgreSQL rejects the implicit text = uuid comparison: Hint: No operator matches the given name and argument types. Changing id and entityId to VARCHAR(36) makes the schema consistent with every other entity table and removes the type cast error. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… on PostgreSQL" This reverts commit 87aa7f3.
…greSQL The entity_deletion_lock table was created in 1.9.0 with native UUID columns for id and entityId. The rest of the codebase uses VARCHAR(36) for all UUID columns so that BindUUID (which binds via UUID.toString()) can compare them without an explicit type cast. Without this, PostgreSQL rejects the WHERE clause in DeletionLockDAO: Hint: No operator matches the given name and argument types. You might need to add explicit type casts. ALTER TABLE migrates existing rows in place — UUID::VARCHAR produces the standard 8-4-4-4-12 string representation. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
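The VARCHAR(36) choice works because java.util.UUID.toString() always emits the canonical 8-4-4-4-12 form: 32 hex digits plus 4 hyphens, exactly 36 characters. A quick demonstration (class name hypothetical):

```java
import java.util.UUID;

public class UuidStringDemo {
    // What a string-binding layer like BindUUID sends to the JDBC driver.
    static String bindString(UUID id) {
        return id.toString();
    }

    public static void main(String[] args) {
        UUID id = UUID.fromString("123e4567-e89b-12d3-a456-426614174000");
        String s = bindString(id);
        // 8-4-4-4-12 hex groups => 32 hex chars + 4 hyphens = 36 characters,
        // which is why VARCHAR(36) columns compare cleanly against it.
        System.out.println(s.length()); // 36
        System.out.println(s);          // 123e4567-e89b-12d3-a456-426614174000
    }
}
```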
PostgreSQL limits prepared statements to 65,535 parameters. With 10k+ entities each having multiple extension rows, deleteAllBatch was generating statements with 100k+ parameters and failing with: PSQLException: PreparedStatement can have at most 65,535 parameters. Wrap both EntityExtensionDAO.deleteAllBatch and UsageDAO.deleteBatch with default interface methods that chunk into 50,000-ID slices before delegating to the raw @SqlUpdate methods. 50k is safely below the limit while keeping the number of round trips minimal. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
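The chunking wrapper described above can be sketched as a plain default-method-style helper: slice the ID list and delegate each slice to the raw single-statement delete. Names are hypothetical; the raw delete is modeled as a function so the sketch is self-contained:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToIntFunction;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class BatchChunkSketch {
    static final int CHUNK = 50_000; // safely under PostgreSQL's 65,535-parameter cap

    // Slice a large ID list into CHUNK-sized views and delegate each slice to
    // the raw delete (in the PR, the @SqlUpdate method); returns total deleted.
    static int deleteAllBatch(List<String> ids, ToIntFunction<List<String>> rawDelete) {
        int deleted = 0;
        for (int i = 0; i < ids.size(); i += CHUNK) {
            deleted += rawDelete.applyAsInt(ids.subList(i, Math.min(i + CHUNK, ids.size())));
        }
        return deleted;
    }

    public static void main(String[] args) {
        List<String> ids = IntStream.range(0, 120_000)
            .mapToObj(Integer::toString).collect(Collectors.toList());
        List<Integer> sliceSizes = new ArrayList<>();
        int total = deleteAllBatch(ids, slice -> { sliceSizes.add(slice.size()); return slice.size(); });
        System.out.println(sliceSizes); // [50000, 50000, 20000]
        System.out.println(total);      // 120000
    }
}
```

50k per slice keeps each statement well under the limit while minimizing round trips, matching the trade-off the commit describes.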
The codebase pattern is: EntityResource provides the implementation with no JAX-RS annotations; each subclass explicitly declares the endpoint with @DELETE/@Path/@Operation so it appears in the Swagger UI and is routed correctly by JAX-RS.
- Strip @DELETE/@Path/@Operation from EntityResource.deletePrefixHardById so it becomes a plain implementation method (same as delete/deleteByIdAsync)
- Add explicit @DELETE /prefix/{id} override in DatabaseServiceResource that calls the base implementation

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…asses Add the hard-delete-by-FQN-prefix endpoint to all 69 EntityResource subclasses following the established pattern: each resource class explicitly declares @DELETE @Path("/prefix/{id}") with entity-specific @Operation metadata and delegates to EntityResource.deletePrefixHardById.

Resources covered:
- 13 service resources (database, storage, messaging, pipeline, ml, metadata, search, security, mcp, llm, drive, dashboard, api)
- Database hierarchy (database, schema, table, stored procedure)
- Data entities (topic, container, search index, pipeline, ml model, dashboard, chart, dashboard data model, api collection, api endpoint)
- Team/user/role (team, user, role, persona, policy, bot)
- Classification/taxonomy (classification, tag, glossary, glossary term, domain, data product)
- Tests/quality (test suite, test case, test definition, test connection definition, data contract, kpi, metric, query)
- Apps/workflows (app, marketplace, workflow, workflow definition, ingestion pipeline, event subscription, notification template)
- AI (ai application, ai governance policy, llm model, mcp server, prompt template)
- Drives (directory, file, spreadsheet, worksheet)
- Misc (type, report, doc store, data insight chart, system chart, web analytic event, learning resource)

Also adds missing imports (DELETE, PathParam, Parameter, ApiResponse, Schema) to the 5 files that lacked them.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
DataInsightSystemChartResource was missing java.util.UUID and TestConnectionDefinitionResource was missing jakarta.ws.rs.core.Response, both required by the new deletePrefixHardById endpoint. Spotted during mvn clean package compilation. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The 70 subclass overrides were calling deletePrefixHardById(uriInfo,
securityContext, id) which resolved to the overriding method itself
(infinite recursion), not the base class implementation. This caused:
java.lang.StackOverflowError
at DatabaseSchemaResource.deletePrefixHardById(...)
at DatabaseSchemaResource.deletePrefixHardById(...)
...
Changed all 70 call sites to super.deletePrefixHardById(...) so they
delegate to EntityResource's implementation as intended.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…r limit With 72k+ entity IDs from a large service hierarchy, findByEntityIds was generating a prepared statement with 72,157 parameters, exceeding PostgreSQL's 65,535 limit: PSQLException: PreparedStatement can have at most 65,535 parameters. Same fix as EntityExtensionDAO and UsageDAO: rename the raw @SqlQuery to findByEntityIdsChunk and wrap it with a default method that collects results across 50,000-ID slices. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
After bulk-deleting entities from the DB, call SearchRepository.deleteByEntityTypeFqnPrefix for the root entity type and every descendant type found during the scan. This removes all matching documents from each type's ES index using a single FQN prefix wildcard query per index, preventing deleted entities from continuing to appear in search results. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…letion Previously only the root entity's preDelete hook ran. This adds two new extension points to EntityRepository:
- preDeleteByFqnHashPrefix(fqnHashPrefix, deletedBy): called before DB deletion for each entity type; entities are still in the DB so external systems can be cleaned up. IngestionPipelineRepository overrides this to call pipelineServiceClient.deletePipeline() for every matching Airflow pipeline and removes pipeline status time-series records.
- postDeleteByFqnHashPrefix(fqnHashPrefix): called after DB deletion for post-deletion metadata work that does not require loading deleted entities.

PrefixDeletionService.runEntityHooks now invokes preDeleteByFqnHashPrefix for every entity type (alongside the existing deleteTimeSeriesByFqnPrefix loop), and a new runPostEntityHooks phase fires postDeleteByFqnHashPrefix for every type after the entity tables are cleared.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds two new extension points to EntityRepository:
- preDeleteByFqnHashPrefix(fqnHashPrefix, deletedBy): runs before DB deletion; entities still exist so external systems can be cleaned up
- postDeleteByFqnHashPrefix(fqnHashPrefix): runs after DB deletion for metadata-only post-deletion work

Concrete overrides added:
- IngestionPipelineRepository: calls Airflow pipelineServiceClient.deletePipeline() and removes pipeline status time-series records for each matching pipeline
- DataContractRepository: deletes the linked TestSuite for each matching contract that has quality expectations (prevents orphaned TestSuite entities)
- WorkflowDefinitionRepository: calls WorkflowHandler.deleteWorkflowDefinition() for each matching workflow definition
- WorkflowRepository: calls SecretsManagerFactory.deleteSecretsFromWorkflow() for each matching workflow (prevents orphaned secrets)

Also adds dao.entityExtensionTimeSeriesDao().deleteByFqnHashPrefix() to deleteDependencyTables so all entity_extension_time_series rows (pipeline status, data contract results, etc.) are bulk-cleaned in one pass.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
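The hook structure described here is a template-method pattern: the base repository ships no-op hooks and concrete repositories override only what they need. A minimal self-contained sketch (class names mirror the PR's extension points, but the bodies are hypothetical stand-ins for the real cleanup calls):

```java
public class HookSketch {
    // Base repository with no-op bulk-deletion hooks.
    static class EntityRepo {
        void preDeleteByFqnHashPrefix(String prefix, String deletedBy) {}
        void postDeleteByFqnHashPrefix(String prefix) {}
    }

    // A concrete repository overrides only the hook it needs.
    static class IngestionPipelineRepo extends EntityRepo {
        int externalCleanups = 0;

        @Override
        void preDeleteByFqnHashPrefix(String prefix, String deletedBy) {
            // Entities still exist at this point, so external cleanup (e.g.
            // deleting the Airflow DAG) can still look them up.
            externalCleanups++;
            System.out.println("cleaning external pipelines under " + prefix);
        }
    }

    public static void main(String[] args) {
        EntityRepo[] repos = { new EntityRepo(), new IngestionPipelineRepo() };
        // Orchestrator phase 1: pre-delete hooks for every entity type.
        for (EntityRepo r : repos) r.preDeleteByFqnHashPrefix("abc123", "admin");
        // ... bulk DB deletion would happen here ...
        // Orchestrator phase 2: post-delete hooks for every entity type.
        for (EntityRepo r : repos) r.postDeleteByFqnHashPrefix("abc123");
    }
}
```

Because the orchestrator iterates every repository and calls the same two methods, adding cleanup for a new entity type only requires overriding a hook, not touching the deletion service.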
…tion MySQL 8.x does not support ADD COLUMN IF NOT EXISTS (that is a MariaDB extension). Replace with the SET @SQL / PREPARE / EXECUTE / DEALLOCATE pattern already used elsewhere in this migration file to conditionally add fromFQNHash and toFQNHash columns to entity_relationship. CREATE INDEX IF NOT EXISTS is supported by MySQL 8.0+ so those lines are left unchanged. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
```java
default List<UUID> findIdsByFqnHashPrefix(String fqnHashPrefix) {
  if (!"fqnHash".equals(getNameHashColumn())) {
    return List.of();
  }
  return findIdsByFqnHashPrefixInternal(getTableName(), "fqnHash", fqnHashPrefix + ".%").stream()
      .map(UUID::fromString)
      .toList();
}
```
findIdsByFqnHashPrefix currently returns IDs only when getNameHashColumn() equals "fqnHash", but many entity tables use nameHash while still storing the FQN hash via @BindFQN (i.e., FullyQualifiedName.buildHash). This means prefix deletion will miss descendants for those entity types (e.g., teams/sub-teams) even though the hashes are prefix-searchable. Consider using getNameHashColumn() as the column to LIKE against (or allow both fqnHash and nameHash).
```java
// Set it on EntityRepository
EntityRepository.setLockManager(lockManager);

// Initialize PrefixDeletionService with the same lock manager
PrefixDeletionService.initialize(lockManager);

initialized = true;
LOG.info("Hierarchical lock manager initialized successfully");
```
PrefixDeletionService.initialize(lockManager) is only invoked on the successful lock-manager path. If initialization fails, PrefixDeletionService.getInstance() can remain null and prefix-delete endpoints will NPE. To keep prefix deletion working even without locking, initialize PrefixDeletionService in the failure/skip paths as well (e.g., with a null lock manager) or make getInstance() lazily create a no-lock instance.
```java
public void deleteByEntityTypeFqnPrefix(String entityType, String fqnPrefix) {
  if (!checkIfIndexingIsSupported(entityType)) {
    return;
  }
  if (!getSearchClient().isClientAvailable()) {
    return;
  }
```
deleteByEntityTypeFqnPrefix returns early when the search client is unavailable, which can leave stale documents with no retry path. Other delete methods enqueue to SearchIndexRetryQueue when ES/OS is down; consider doing the same here (at least for the root delete request) so prefix deletions eventually clean up search indexes.
```java
operationId = "deleteApiServicePrefixHard",
summary = "Hard-delete a API service and all descendants by FQN prefix",
description =
    "Bulk hard-delete this API service and all descendants whose FQN starts with this "
        + "entity's FQN. Significantly faster than recursive delete for large hierarchies.",
```
Grammar: the OpenAPI summary/description uses "Hard-delete a API service"; since "API" starts with a vowel sound, this should be "an API service" (same for the description string).
```java
operationId = "deleteAppPrefixHard",
summary = "Hard-delete a app and all descendants by FQN prefix",
description =
    "Bulk hard-delete this app and all descendants whose FQN starts with this "
        + "entity's FQN. Significantly faster than recursive delete for large hierarchies.",
```
Grammar: the OpenAPI summary/description uses "Hard-delete a app"; this should be "an app" (and likewise in the description) for correct English in generated API docs.
```java
operationId = "deleteIngestionPipelinePrefixHard",
summary = "Hard-delete a ingestion pipeline and all descendants by FQN prefix",
description =
    "Bulk hard-delete this ingestion pipeline and all descendants whose FQN starts with this "
        + "entity's FQN. Significantly faster than recursive delete for large hierarchies.",
```
Grammar: the OpenAPI summary/description uses "Hard-delete a ingestion pipeline"; this should be "an ingestion pipeline" (and likewise in the description) for correct English in generated API docs.
```java
operationId = "deleteEventSubscriptionPrefixHard",
summary = "Hard-delete a event subscription and all descendants by FQN prefix",
description =
    "Bulk hard-delete this event subscription and all descendants whose FQN starts with this "
        + "entity's FQN. Significantly faster than recursive delete for large hierarchies.",
```
Grammar: the OpenAPI summary/description uses "Hard-delete a event subscription"; this should be "an event subscription" (and likewise in the description) for correct English in generated API docs.
The internal migration runner handles duplicate index checks. Plain CREATE INDEX is simpler and works across all MySQL 8.x versions. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Remove the information_schema conditional workarounds. The internal migration runner handles idempotency, so plain DDL is sufficient. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Pull request overview
Copilot reviewed 104 out of 104 changed files in this pull request and generated 6 comments.
Comments suppressed due to low confidence (2)
openmetadata-service/src/main/java/org/openmetadata/service/initialization/LockManagerInitializer.java:57
- If CollectionDAO is unavailable (or initialization throws), PrefixDeletionService is never initialized, but the new /prefix/{id} endpoints still exist and will NPE when invoked. Consider initializing PrefixDeletionService even when the lock manager can't be constructed (e.g., with a null lock manager), or exposing an availability check so the REST layer can return a clear error instead of accepting the job and failing later.
```java
try {
  LOG.info("Initializing hierarchical lock manager for entity deletion optimization");
  // Get the collection DAO
  var collectionDAO = Entity.getCollectionDAO();
  if (collectionDAO == null) {
    LOG.warn("CollectionDAO not available, skipping lock manager initialization");
    return;
  }
  // Initialize the lock manager
  HierarchicalLockManager lockManager =
      new HierarchicalLockManager(collectionDAO.deletionLockDAO());
  // Set it on EntityRepository
  EntityRepository.setLockManager(lockManager);
  // Initialize PrefixDeletionService with the same lock manager
  PrefixDeletionService.initialize(lockManager);
  initialized = true;
  LOG.info("Hierarchical lock manager initialized successfully");
} catch (Exception e) {
  LOG.error("Failed to initialize hierarchical lock manager: {}", e.getMessage(), e);
  // Continue without locking for backward compatibility
}
```
bootstrap/sql/migrations/native/1.13.0/mysql/schemaChanges.sql:142
- Adding new DDL to an older, already-shipped migration version (1.13.0) is unsafe for upgrades: databases that have already applied 1.13.0 will not re-run this file, so they'll never get the new fromFQNHash/toFQNHash columns/indexes required by prefix deletion. Please introduce this DDL in the latest migration version (and keep 1.13.0 immutable) so upgrades are deterministic.
```sql
-- Add FQN hash columns to entity_relationship to enable fast prefix-based bulk deletion.
-- This allows deleting all relationships for an entire entity subtree in a single indexed query
-- instead of walking the tree entity-by-entity.
ALTER TABLE entity_relationship
    ADD COLUMN fromFQNHash VARCHAR(768) DEFAULT NULL,
    ADD COLUMN toFQNHash VARCHAR(768) DEFAULT NULL;
CREATE INDEX idx_er_from_fqn_hash ON entity_relationship (fromFQNHash(768));
CREATE INDEX idx_er_to_fqn_hash ON entity_relationship (toFQNHash(768));
```
```java
if (!"fqnHash".equals(getNameHashColumn())) {
  return List.of();
}
return findIdsByFqnHashPrefixInternal(getTableName(), "fqnHash", fqnHashPrefix + ".%").stream()
```
findIdsByFqnHashPrefix currently bails out unless getNameHashColumn() is exactly "fqnHash". Several entity tables use nameHash while still hashing the entity FQN (e.g., team_entity has only nameHash), so descendant collection for prefix deletion will silently miss those entity types. Consider querying against getNameHashColumn() (for both fqnHash and nameHash) or otherwise documenting/enforcing that only fqnHash-backed entities are eligible for prefix deletion.
Suggested change:
```java
String nameHashColumn = getNameHashColumn();
if (!"fqnHash".equals(nameHashColumn) && !"nameHash".equals(nameHashColumn)) {
  return List.of();
}
return findIdsByFqnHashPrefixInternal(getTableName(), nameHashColumn, fqnHashPrefix + ".%")
    .stream()
```
```java
import jakarta.ws.rs.*;
import jakarta.ws.rs.DELETE;
import jakarta.ws.rs.PathParam;
import jakarta.ws.rs.core.*;
```
Redundant imports: this file already imports jakarta.ws.rs.*, so explicitly importing jakarta.ws.rs.DELETE and jakarta.ws.rs.PathParam is unnecessary and can violate checkstyle/spotless rules. Please remove the redundant explicit imports.
```java
import jakarta.ws.rs.*;
import jakarta.ws.rs.DELETE;
import jakarta.ws.rs.PathParam;
import jakarta.ws.rs.core.*;
```
Redundant imports: this file already imports jakarta.ws.rs.*, so explicitly importing jakarta.ws.rs.DELETE and jakarta.ws.rs.PathParam is unnecessary and can violate checkstyle/spotless rules. Please remove the redundant explicit imports.
```java
operationId = "deleteApiServicePrefixHard",
summary = "Hard-delete a API service and all descendants by FQN prefix",
description =
    "Bulk hard-delete this API service and all descendants whose FQN starts with this "
        + "entity's FQN. Significantly faster than recursive delete for large hierarchies.",
```
Grammar in the OpenAPI summary: "Hard-delete a API service..." should be "Hard-delete an API service...".
```java
@Operation(
    operationId = "deleteIngestionPipelinePrefixHard",
    summary = "Hard-delete a ingestion pipeline and all descendants by FQN prefix",
    description =
        "Bulk hard-delete this ingestion pipeline and all descendants whose FQN starts with this "
            + "entity's FQN. Significantly faster than recursive delete for large hierarchies.",
```
Grammar in the OpenAPI summary: "Hard-delete a ingestion pipeline..." should be "Hard-delete an ingestion pipeline...".
```sql
-- Add FQN hash columns to entity_relationship to enable fast prefix-based bulk deletion.
-- This allows deleting all relationships for an entire entity subtree in a single indexed query
-- instead of walking the tree entity-by-entity.
ALTER TABLE entity_relationship
    ADD COLUMN IF NOT EXISTS fromFQNHash VARCHAR(768) DEFAULT NULL,
    ADD COLUMN IF NOT EXISTS toFQNHash VARCHAR(768) DEFAULT NULL;

CREATE INDEX IF NOT EXISTS idx_er_from_fqn_hash ON entity_relationship (fromFQNHash);
CREATE INDEX IF NOT EXISTS idx_er_to_fqn_hash ON entity_relationship (toFQNHash);
```
Adding new DDL to an older, already-shipped migration version (1.13.0) is unsafe for upgrades: databases that have already applied 1.13.0 will not re-run this file, so they’ll never get the new fromFQNHash/toFQNHash columns/indexes required by prefix deletion. Please introduce this DDL in the latest migration version (and keep 1.13.0 immutable) so upgrades are deterministic.
Code Review 🚫 Blocked — 6 resolved / 8 findings

Fast FQN prefix-based bulk deletion implementation fixes race conditions but is blocked by a critical lock failure that silently ignores contention and allows unprotected deletion to proceed, plus a minor redundant search index operation for the root entity type.

🚨 Bug: Lock failure silently ignored — deletion proceeds unprotected
📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/PrefixDeletionService.java:84-97
📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/PrefixDeletionService.java:68-81

If the lock cannot be acquired (e.g., another deletion is in progress, or the lock manager is unavailable), the deletion runs without protection, allowing ingestion to create new entities in the subtree while it's being deleted — exactly the orphan scenario this PR aims to fix. Suggested fix

💡 Performance: Redundant search index delete for root entity type
📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/PrefixDeletionService.java:175-186

Not a correctness bug (the call is idempotent), but it's a wasted round-trip to the search cluster. Suggested fix

✅ 6 resolved
- Performance: Migration backfill does row-by-row UPDATEs on entity_relationship
- Bug: Async prefix delete test has no wait — assertions are racy
- Bug: Test uses tables().get() for databases and schemas, masking failures
- Bug: SLF4J does not support {:.2f} — speedup prints as literal text
- Bug: Prefix deletion misses relationships with NULL FQN hashes
- ...and 1 more resolved from earlier reviews
Summary
service.database.schema.table). A single indexed `LIKE 'prefix.%'` query deletes an entire subtree atomically. Works at any hierarchy level (deleting a Schema uses the same path as deleting a full DatabaseService).

Changes
Schema
- `entity_relationship`: added `fromFQNHash` and `toFQNHash` columns with indexes (MySQL + Postgres)
- `1.14.1`: schema DDL + Java backfill migration that populates hashes for all existing rows

DAO Layer
- `EntityRelationshipDAO`: 8-arg `insert()` with COALESCE (existing callers unaffected via 6-arg default overload), backfill methods, `deleteAllByFqnHashPrefix()`
- `EntityDAO`: `findIdsByFqnHashPrefix()` and `deleteBatch()` default methods (all entity DAOs inherit automatically)
- `EntityTimeSeriesDAO`: `deleteByFqnHashPrefix()` default method
- `CollectionDAO`: `UsageDAO.deleteBatch()`, `FeedDAO.findByEntityIds()`

Repository Layer
- `EntityRepository`: FQN-bearing `addRelationship()` overload, `addServiceRelationship()` now populates FQN hashes, `deleteTimeSeriesByFqnPrefix()` hook, `getLockManager()`/`callPreDelete()`/`invalidateEntity()` bridge methods
- `TableRepository`, `DatabaseSchemaRepository`: `storeRelationships()` passes FQN strings
- `FeedRepository`: `deleteByAboutBatch()` for bulk thread deletion

Orchestration
- `PrefixDeletionService` (new): 7-phase hard-delete orchestration
  - `HierarchicalLockManager` deletion lock (closes ingestion race window in <1ms)
  - `findIdsByFqnHashPrefix()` across all entity repos
  - `entity_relationship`, `field_relationship`, `entity_extension`, `tag_usage`, `entity_usage`, threads
  - `preDelete` on root, `deleteTimeSeriesByFqnPrefix` per repo
- `LockManagerInitializer`: now also initializes `PrefixDeletionService` on startup

REST Endpoint
- `EntityResource`: `deletePrefixHardById()` protected base method (async, 202 Accepted, WebSocket notification on completion)
- `DatabaseServiceResource`: `DELETE /api/v1/services/databaseServices/prefix/{id}` endpoint

Implementation Notes
- `CONTAINS` relationships (written via `addServiceRelationship` and `storeRelationships`) get FQN hashes on insert to avoid adding 2 extra DB lookups per row during bulk ingestion. The migration backfills all existing data.
- The `entity_relationship` insert uses `COALESCE(:fromFQNHash, fromFQNHash)` so re-ingestion never overwrites already-populated hashes with null.

Test plan
- `mvn test-compile -pl openmetadata-service -DskipTests` — compiles clean
- `mvn spotless:apply -pl openmetadata-service` — no formatting changes
- `DELETE /prefix/{id}`, verify all entities return 404 and `entity_relationship` has no rows with the service's FQN hash prefix