Fixes #4003: bulk + async restore for large entity hierarchies#27997
Conversation
EntityRepository.restoreEntity walked descendants synchronously, taking 4+ minutes on a 12k-table database and exceeding typical proxy timeouts. restoreChildren now groups CONTAINS children by type and dispatches one bulkRestoreSubtree per type, batching DB writes, version history, change events, and cache invalidation; the existing ES cascade handles descendant index updates in one update_by_query. Adds an async option (?async=true) on the deep-hierarchy restore endpoints that returns 202 Accepted with a job id and runs the restore on AsyncService, emitting WebSocket notifications on restoreEntityChannel. Java SDK adds .restore().async().execute() fluent builders on Tables/Databases plus restoreServerAsync on EntityServiceBase; Python SDK mirrors this with restore_request().with_async().execute() and restore_async() helpers on BaseEntity, exposing a new AsyncJobResponse type. Tests: EntityRepositoryRestoreTest verifies the per-type grouping and bulk dispatch path; RestoreFluentAPITest covers the Java SDK fluent behavior; RestoreHierarchyIT exercises sync and async restore against a real DB→schemas→tables tree end-to-end; test_restore_async.py covers the Python SDK paths. Fixes #4003 Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
The Python checkstyle failed. Please run You can install the pre-commit hooks with |
There was a problem hiding this comment.
Pull request overview
This PR addresses restore timeouts for large entity hierarchies by replacing per-entity recursive restores with a bulk restore path, and adds an optional server-side async restore mode that returns 202 Accepted and notifies completion via WebSockets. It also updates the Java and Python SDKs to expose the async restore option and adds unit/integration tests covering the new behavior.
Changes:
- Implement bulk subtree restore (
bulkRestoreSubtree) and type-grouped child restore dispatch to reduce DB/ES work per restore. - Add
?async=truerestore option on deep-hierarchy endpoints, backed byAsyncServiceand a newrestoreEntityChannelWebSocket notification channel. - Update Java/Python SDKs with async restore helpers + fluent builders, and add unit/integration tests for sync/async restore flows.
Reviewed changes
Copilot reviewed 23 out of 23 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/EntityRepository.java | Adds grouped restoreChildren and new bulkRestoreSubtree bulk restore implementation plus extension hook. |
| openmetadata-service/src/main/java/org/openmetadata/service/resources/EntityResource.java | Adds restoreEntity(..., async) overload and server-side async restore execution + 202 response. |
| openmetadata-service/src/main/java/org/openmetadata/service/util/WebsocketNotificationHandler.java | Adds restore completion/failure WebSocket notification helpers. |
| openmetadata-service/src/main/java/org/openmetadata/service/socket/WebSocketManager.java | Introduces RESTORE_ENTITY_CHANNEL constant. |
| openmetadata-service/src/main/java/org/openmetadata/service/util/RestoreEntityResponse.java | New 202 response DTO for async restore requests. |
| openmetadata-service/src/main/java/org/openmetadata/service/util/RestoreEntityMessage.java | New WebSocket payload DTO for restore status notifications. |
| openmetadata-service/src/main/java/org/openmetadata/service/resources/databases/DatabaseResource.java | Wires ?async=true into database restore endpoint + OpenAPI 202 response. |
| openmetadata-service/src/main/java/org/openmetadata/service/resources/databases/DatabaseSchemaResource.java | Wires ?async=true into schema restore endpoint + OpenAPI 202 response. |
| openmetadata-service/src/main/java/org/openmetadata/service/resources/databases/TableResource.java | Wires ?async=true into table restore endpoint + OpenAPI 202 response. |
| openmetadata-service/src/main/java/org/openmetadata/service/resources/databases/StoredProcedureResource.java | Wires ?async=true into stored procedure restore endpoint + OpenAPI 202 response. |
| openmetadata-service/src/main/java/org/openmetadata/service/resources/services/database/DatabaseServiceResource.java | Wires ?async=true into database service restore endpoint + OpenAPI 202 response. |
| openmetadata-service/src/main/java/org/openmetadata/service/resources/dashboards/DashboardResource.java | Wires ?async=true into dashboard restore endpoint + OpenAPI 202 response. |
| openmetadata-service/src/main/java/org/openmetadata/service/resources/datamodels/DashboardDataModelResource.java | Wires ?async=true into data model restore endpoint + OpenAPI 202 response. |
| openmetadata-service/src/main/java/org/openmetadata/service/resources/storages/ContainerResource.java | Wires ?async=true into container restore endpoint + OpenAPI 202 response. |
| openmetadata-sdk/src/main/java/org/openmetadata/sdk/services/EntityServiceBase.java | Adds restoreServerAsync(...) calling PUT /restore?async=true. |
| openmetadata-sdk/src/main/java/org/openmetadata/sdk/models/AsyncJobResponse.java | New Java SDK model for 202 async job response. |
| openmetadata-sdk/src/main/java/org/openmetadata/sdk/fluent/Tables.java | Adds fluent restore builder with .async().execute() for server-side async restore. |
| openmetadata-sdk/src/main/java/org/openmetadata/sdk/fluent/Databases.java | Adds fluent restore builder with .async().execute() for server-side async restore. |
| ingestion/src/metadata/sdk/entities/base.py | Adds Python AsyncJobResponse + restore fluent operation + restore_async. |
| openmetadata-service/src/test/java/org/openmetadata/service/jdbi3/EntityRepositoryRestoreTest.java | Unit tests for grouped restore children + bulk restore no-op behaviors. |
| openmetadata-sdk/src/test/java/org/openmetadata/sdk/fluent/RestoreFluentAPITest.java | Unit tests validating fluent restore routes to sync vs server-async SDK calls. |
| ingestion/tests/unit/sdk/test_restore_async.py | Python unit tests for restore_async and fluent restore request behavior. |
| openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/RestoreHierarchyIT.java | Integration test validating sync restore and async restore (202 + eventual restored state). |
Comments suppressed due to low confidence (1)
openmetadata-service/src/main/java/org/openmetadata/service/resources/EntityResource.java:790
repository.restoreEntity(...)can returnnullwhen the entity is not in the deleted state (see EntityRepository.restoreEntity catch). The sync restore path immediately dereferencesresponse.getEntity()which will throw NPE and return 500. Handle the null case explicitly (e.g., return 404/400 with a clear message) before callingrestoreFromSearch/addHref.
OperationContext operationContext =
new OperationContext(entityType, MetadataOperation.EDIT_ALL);
authorizer.authorize(securityContext, operationContext, getResourceContextById(id));
PutResponse<T> response =
repository.restoreEntity(securityContext.getUserPrincipal().getName(), id);
repository.restoreFromSearch(response.getEntity());
addHref(uriInfo, response.getEntity());
LOG.info(
…cade Two follow-up improvements to the bulk restore introduced for #4003. Batched findTo per tree level: bulkRestoreSubtree previously issued one findTo per parent during recursion, which at the schemas → tables level of a 12k-table database meant 12k DB round trips just to enumerate children. The new bulkRestoreContainedChildren helper does one findToBatchAllTypes per tree level regardless of fan-out, then groups the results by child type and dispatches to each repo's bulkRestoreSubtree. DashboardRepository's chart-restore logic moves from the now-bypassed restoreChildren override to the existing restoreAdditionalChildren extension hook so it still runs both for direct dashboard restores and when dashboards are descendants of a larger restore. Symmetric bulk soft-delete cascade: deleteByName/deleteById had the same per-entity recursion that this PR fixed for restore — soft- deleting a database with 12k tables ran 12k recursive Entity.deleteEntity calls, each writing one row + one ES update + one change event. New bulkSoftDeleteSubtree mirrors bulkRestoreSubtree: one batched findToBatchAllTypes per level, deferred-store DB writes, batched version history, batched change events, batched cache invalidation; per-descendant ES writes are skipped because the existing deleteFromSearch cascade flips the deleted flag on descendant indexes in one update_by_query. deleteChildren(List, hardDelete=false, ...) now dispatches to the bulk path; hard-delete keeps the existing batchDeleteChildren path. New softDeleteAdditionalChildren extension hook mirrors restoreAdditionalChildren; DashboardRepository's chart soft-delete migrates onto it for the same reason. Tests: extends EntityRepositoryRestoreTest with cases that verify findToBatchAllTypes is invoked exactly once per level (not once per parent) for both bulk operations, plus the existing grouping/dispatch shape for the soft-delete entry point. Extends RestoreHierarchyIT with a recursive soft-delete cascade assertion. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
The Java checkstyle failed. Please run You can install the pre-commit hooks with |
|
The Python checkstyle failed. Please run You can install the pre-commit hooks with |
🟡 Playwright Results — all passed (12 flaky)✅ 4146 passed · ❌ 0 failed · 🟡 12 flaky · ⏭️ 90 skipped
🟡 12 flaky test(s) (passed on retry)
How to debug locally# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip # view trace |
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…eadsInBatch mohityadav766 flagged that the new deleteByAbout(List<UUID>) overload was a fake batch — it just looped over the single-entity deleteByAbout(UUID), which itself looped over deleteThreadInternal(threadId) one thread at a time. So a DatabaseService with N descendants × M threads each would still produce N×M individual DELETEs against thread_entity / entity_relationship / field_relationship. Switch both overloads to a single batched path: - Add CollectionDAO.FeedDAO.findByEntityIds(tableName, List<String>) so we pull every thread id for the whole input batch in one IN-list query. - Reuse the existing deleteThreadsInBatch(List<UUID>) helper (which already fires one bulk DELETE per table — thread_entity, entity_relationship, field_relationship) instead of looping deleteThreadInternal. - Collapse the single-entity overload into deleteByAbout(List.of(id)) so both callers go through the same code. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ram limits Reviewer flagged that the new findByEntityIds + deleteThreadsInBatch path passes the full id list straight into an IN-clause, where JDBI's @BindList expands every element into its own bind parameter. For the 12k-descendant hierarchies this PR specifically targets, that blows past SQL Server's ~2100-parameter ceiling and bloats MySQL's max_allowed_packet budget. Chunk both paths to 500 ids per query (matches the existing EntityRepository.RELATION_DELETE_BATCH_SIZE used for the same reason on the relationship side): - deleteByAbout(List<UUID>) walks entityIds in 500-id chunks against findByEntityIds, accumulates thread ids across chunks, then hands them to deleteThreadsInBatch. - deleteThreadsInBatch(List<UUID>) walks threadIds in 500-id chunks against deleteAllByThreadIds / deleteAllByPrefixes / deleteByIds and sums the per-chunk delete counts. A per-chunk find failure no longer aborts the whole cleanup — log it and move on, matching the original "skip if thread storage is unavailable" contract. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot reviewer flagged that the deleteByAbout(List<UUID>) refactor lost the per-thread try/catch that previously made legacy feed cleanup best-effort. A malformed thread id (UUID.fromString throws IllegalArgumentException) or a DAO/runtime failure inside deleteThreadsInBatch would now propagate up the caller's hard-delete @transaction and roll the whole cascade back — even though the old per-thread loop deliberately swallowed those so legacy feed issues never blocked entity deletion. Re-add both guards: - Parse thread ids defensively: skip + LOG.warn on each malformed id rather than throwing IllegalArgumentException out of the cleanup. - Wrap deleteThreadsInBatch in try/catch so a bulk DELETE failure logs at warn level and lets the parent hard-delete commit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot reviewer flagged EntityDAO.deleteByIds(List<UUID>) as an unbounded IN-clause: JDBI's @BindList expands every id into a separate bind parameter, and the bulk hard-delete path now walks 12k+ entity hierarchies, so a single DELETE statement would blow past SQL Server's ~2100-parameter ceiling and MySQL's max_allowed_packet budget. Mirror the findEntitiesByIds chunking already used in this file — 30k ids per chunk, short-circuit when the list fits in one chunk, sum per-chunk delete counts otherwise. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Address two Copilot quality flags: 1. The 30k IN-list chunk size was duplicated as a magic number in four EntityDAO methods (findReferencesByFqns, deleteByIds, findEntitiesByIds, findEntityByNames). Hoist to a single MAX_IN_LIST_CHUNK_SIZE constant on the interface so all four stay in sync if the limit needs tuning. 2. EntityDAO.updateFqn interpolated user-supplied prefixes into raw SQL with only apostrophe + double-quote escaping. MySQL's default mode treats `\` as a string-literal escape char, and Postgres does too when standard_conforming_strings is off, so an unescaped `\n` in newPrefix would be parsed as a newline before the REGEXP_REPLACE / REPLACE ever saw it. Name validation currently forbids these characters, but defense-in-depth is cheap. Added ListFilter.escapeBackslashAndApostrophe (backslash first, then apostrophe — order matters, otherwise the `\\` we just inserted gets re-doubled). escape() now composes through this helper, so the existing LIKE-aware path also picks up backslash escaping. updateFqn's MySQL replacement string and both Postgres prefixes route through the new helper before any further (LIKE / double-quote) escape pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…fore bulk restore/soft-delete updaters Copilot reviewer flagged that bulkRestoreSubtree (and by extension bulkSoftDeleteSubtree) builds EntityUpdaters from entities loaded via raw storage JSON (loadForBulk -> dao.findEntitiesByIds -> no setFields call). Entities whose updater rewrites relationships unconditionally — e.g. DashboardRepository.DashboardUpdater.entitySpecificUpdate which does deleteFrom(... HAS ...) + addRelationship for every charts/dataModels entry — would diff a null original.charts against a null updated.charts, fire the compareAndUpdate lambda (shouldCompare returns true for PUT/SOFT_DELETE with patchedFields=null), wipe the HAS rows, and re-add nothing. Net: every chart link disappears as a side effect of restoring or soft-deleting the parent — well before restoreAdditionalChildren / softDeleteAdditionalChildren walk the relation set. This is exactly the failure mode the single-entity restoreEntity already documents and guards against by calling setFieldsInternal(original, putFields, ALL) before building its updater. The bulk replacement needs the same contract. - New helper hydrateRelationsForBulkUpdater(List<T>) loops setFieldsInternal with Include.ALL. Per-entity rather than setFieldsInBulk because the bulk fetchers (e.g. DashboardRepository.batchFetchCharts) hard-code NON_DELETED when batching HAS lookups, which would still hide cascade-deleted charts from the restore path. Restore batches are typically one subtree level, so the extra DB hits are acceptable. - Hooked into bulkRestoreSubtree right before buildBulkUpdaters, and into applyBulkSoftDelete with the same rationale. EntityRepositoryRestoreTest 13/13, AsyncServiceTest 11/11 still pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- EntityRepository.hydrateRelationsForBulkUpdater: log hydration failures at warn level with entity id/type instead of swallowing them silently, so any subsequent HAS-row wipe is correlatable in production logs (gitar-bot empty-catch flag). - EntityDAO.MAX_IN_LIST_CHUNK_SIZE Javadoc: drop the misleading SQL Server reference (not a supported connection type — its 2100-parameter ceiling would need a much smaller constant) and document the MySQL / PostgreSQL constraints that actually apply to this codebase (Copilot doc accuracy flag). - FeedRepository.FEED_IN_BATCH_SIZE comment: same — replace SQL Server hedging with MySQL/PostgreSQL framing and call out why the per-statement budget is tighter than the EntityDAO constant (three IN-list statements per chunk). - EntityResource.restoreEntity null branch: drop the redundant find(id, ALL) probe — EntityRepository.restoreEntity now calls find(id, ALL) up front and throws EntityNotFoundException for missing ids, so a null response can only mean "exists but not deleted" → 400. Updated the comment to match (Copilot stale-comment flag). EntityRepositoryRestoreTest 13/13, AsyncServiceTest 11/11 still pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
bulkHardDeleteSubtree was walking the cascade, deleting DB rows, but never
firing the per-entity Elasticsearch removal that the legacy Entity.deleteEntity
path triggered via delete()'s top-level deleteFromSearch dispatch. The bulk
replacement is the only path that walks cascaded children now, so a missing
call leaves stale ES docs that survive the hard delete.
Concrete symptom: Playwright Domains.spec.ts:533 ("Should clear assets from
data products after deletion of data product in Domain") consistently failed
on this branch. The test creates Domain + DataProduct(PW_DataProduct_Sales),
recursively hard-deletes the Domain, then re-creates Domain + DataProduct
with the same names. The UI search for PW_DataProduct_Sales then returned
TWO rows: the new DataProduct from the re-create plus a stale ES doc for the
"deleted" cascade child whose DB row was actually gone. The test died on
selectDataProduct's strict-mode locator match.
Add deleteFromSearch(entity, true) alongside postDelete inside the
end-of-walk loop. Mirrors the per-entity contract — legacy delete() called
postDelete then deleteFromSearch in that order, the bulk path now does the
same per cascaded entity.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…lus regression tests
Copilot reviewer flagged that EntityDAO.updateFqn's MySQL path used
escapeBackslashAndApostrophe(newPrefix), which only covers the SQL
string-literal layer. The value is then fed to REGEXP_REPLACE's
replacement argument, which has its own escape semantics — `\X` is a
backreference / escape sequence, so a backslash that survived the SQL layer
gets re-interpreted by the regex engine. For an input like "foo\1bar",
"\1" would be treated as a capture-group backreference rather than a
literal "\1" in the rewritten FQN; for an input like "foo\bar", the regex
engine would consume the backslash as the start of an undefined escape.
Add ListFilter.escapeForMySqlRegexReplacement(s) that composes both layers:
Step 1 — regex replacement: \ → \\ (regex engine emits literal \)
Step 2 — SQL string-literal: \\ → \\\\, ' → ''
Net: one input \ → four \ in the SQL statement text → two \ for the regex
engine → one literal \ in the replacement output. Apostrophes only matter
for the SQL layer, so they're left to step 2's existing single-pass
doubling (composing escapeBackslashAndApostrophe twice would re-escape
apostrophes — chose explicit two-step composition to avoid that pitfall).
EntityDAO.updateFqn's MySQL replacement now routes through the new helper.
The Postgres path stays on escapeBackslashAndApostrophe + escapeDoubleQuotes
because PG embeds the value in a static REPLACE call, not a regex
replacement — the regex layer doesn't apply there.
Test coverage in ListFilterTest:
- test_escapeBackslashAndApostrophe_* — existing primitive
- test_escape_alsoDoublesBackslashesViaBackslashAndApostrophe — regression
guard that escape() picks up the backslash hardening transparently
- test_escapeForMySqlRegexReplacement_* — new helper, including
a "\1 backreference look-alike" case to guard the original concern
directly
All 22 ListFilterTest cases pass. 46/46 across
EntityRepositoryRestoreTest + AsyncServiceTest + ListFilterTest.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… generic EntityRestorer The fluent restore API previously existed only on Tables and Databases. User flagged the gap — restore (sync + server-side async via PUT /restore?async=true) is supported by the server for every entity-type, and the Java SDK base service (EntityServiceBase.restore / restoreServerAsync) is already shared across all of them. The fluent layer was the only thing missing. Rather than copy-paste TableRestorer/AsyncTableRestorer for each entity, hoist the pattern into a single generic helper: - common/EntityRestorer<T> — sync .execute() returns T, .async() switches mode - common/AsyncEntityRestorer<T> — .execute() returns AsyncJobResponse Tables and Databases now wire the generic helper too (drops the entity-specific restorer pair on each — ~80 lines removed). restore() now lives on 26 additional fluent classes' Finder builders: AIApplications, AIGovernancePolicies, Charts, Classifications, Containers, DashboardDataModels, Dashboards, DataContracts, DataProducts, DatabaseSchemas, Domains, Glossaries, GlossaryTerms, LLMServices, McpServers, Metrics, MlModels, Pipelines, PromptTemplates, Queries, SearchIndexes, StoredProcedures, Tags, Teams, Topics, Users. Each is a 3-line wrap-and-delegate; future restore-capable entity-types are a one-method addition. Tests in RestoreFluentAPITest extended from 4 → 12: kept the original Tables / Databases sync+async pairs, added per-fluent route-through assertions for DatabaseSchemas, Dashboards, Pipelines, Topics, MlModels, Containers, Glossaries, Domains. They all go through the same EntityRestorer<T>, so the representative sample is enough to catch a typo in any single fluent's wire-up without re-asserting the same plumbing 26 times. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`findByEntityIds` now selects DISTINCT ids — currently a no-op given the PK + single-valued entityId column, but cheap insurance against future joins or schema shifts that could fan out duplicates. `FeedRepository.deleteByAbout` accumulates into a LinkedHashSet so a caller passing an entityIds list with duplicates (or any future change that allows per-chunk overlap) can't inflate the downstream IN lists in `deleteThreadsInBatch` (3 DELETEs per 500-id chunk). Addresses Copilot review comments 3276652759 and 3276652802 on #27997. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Code Review ✅ Approved 14 resolved / 14 findingsImplements bulk-restore and async entity hierarchy processing to resolve timeout issues, while hardening MySQL regex escaping in FQN updates. All identified architectural and safety concerns have been successfully addressed. ✅ 14 resolved✅ Bug: bulkRestoreSubtree calls restoreChildren outside @transaction boundary
✅ Bug: uriInfo used in async lambda after request scope ends
✅ Quality: Fluent restorer classes duplicate identical pattern across entities
✅ Quality: bulkRestore/SoftDeleteContainedChildren are near-identical duplicates
✅ Quality: bulkSoftDeleteSubtree exceeds 15-line method limit (83 lines)
...and 9 more resolved from earlier reviews OptionsDisplay: compact → Showing less information. Comment with these commands to change:
Was this helpful? React with 👍 / 👎 | Gitar |
|



Describe your changes:
Fixes #4003
EntityRepository.restoreEntitywalked the descendant tree synchronously inside one HTTP request — on a database with 12k tables / 1.1k stored procedures the user reported 4m37s response times that exceeded typical proxy / ALB idle timeouts. This PR replaces the per-entity recursion with an iterative bulk path and adds a server-side async option for hierarchies large enough that the synchronous response would still hit timeouts.Type of change:
High-level design:
Bulk synchronous path.
restoreChildrennow groups CONTAINS children by entity type and dispatches a singlebulkRestoreSubtree(ids, updatedBy)per type rather than loopingEntity.restoreEntityper descendant.bulkRestoreSubtreereuses the existing deferred-store bulk infrastructure (updateMany,entityExtensionDAO.insertMany,invalidateMany,insertChangeEventsBatch) for one batched DB write per type. Per-descendant ES writes go throughEntityLifecycleEventDispatcher.onEntitiesUpdatedsoSearchIndexHandler.updateEntitiesIndexpicks up HAS-style children (charts) that the top-level cascade query doesn't reach. Subclass extension hooksrestoreAdditionalChildren/softDeleteAdditionalChildrenare provided for repos that link non-CONTAINS related entities.Symmetric bulk soft-delete.
bulkSoftDeleteSubtreemirrors the restore path — same shape, same hook contract — and is dispatched fromdeleteChildren(List, hardDelete=false, …).Async option. New
EntityResource.restoreEntityAsync(...)returns 202 Accepted with aRestoreEntityResponse(jobId, message)and runs the restore onAsyncService.AsyncServiceis now a one-line wrapper aroundExecutors.newThreadPerTaskExecutor(virtual)— back-pressure under load comes from the JDBI connection pool (with its own timeout), not from a custom semaphore wrapper. Completion / failure is broadcast on theRESTORE_ENTITY_CHANNELWebSocket channel; the user id is resolved on the request thread before submitting so the notification doesn't depend on the request-scopedSecurityContext. The single-argrestoreEntity(uriInfo, sec, id)helper reads?async=truedirectly fromuriInfo, so SDK callers get universal async support regardless of which Resource subclass forwarded the parameter.SDK updates. Java SDK adds an
AsyncJobResponsemodel andEntityServiceBase.restoreServerAsync(id); new fluent builders onTablesandDatabasesgiveTables.find(id).restore().execute()(sync, returnsTable) andTables.find(id).restore().async().execute()(returnsAsyncJobResponse) with type-safe terminal operations. Python SDK mirrors this withTables.restore_async(id)and the fluentTables.restore_request(id).with_async().execute()plus anAsyncJobResponsedataclass. Both SDKs validate the response carries a non-nulljobIdand raise a clear error if not (in case an older server ignores the param).Ingestion integration. Legacy
OpenMetadataclient getsdelete_async/restore_asynchelpers.DeleteEntitycarries adispatch_asyncflag andMetadataRestSink.delete_entityroutes to the async client method when set.delete_entity_from_source/delete_entity_by_nameread theOM_INGESTION_DELETE_ASYNCenv var by default so operators can flip an ingestion run to fire-and-forget without per-connector schema changes.Tests:
Use cases covered
?async=true, observe 202 + jobId, then poll until the database flips todeleted=false.restoreChildrencorrectly groups mixed-type children (DatabaseSchema + StoredProcedure) and dispatches onebulkRestoreSubtreeper type with no per-entityEntity.restoreEntitycalls.bulkRestoreSubtreeis a no-op for null / empty IDs; when entities exist but none are deleted, the additional-children hook still fires (re-entered cascade reconciliation).Tables.find(id).restore().async().execute()andDatabases.find(id).restore().async().execute()route torestoreServerAsync; without.async()they go to the sync path.Tables.restore_async(id)appends?async=true, returns anAsyncJobResponse; the fluentrestore_request(id).with_async().execute()does the same.OpenMetadata.delete_async/restore_asyncURL shape coverage.Unit tests
openmetadata-service/src/test/java/.../jdbi3/EntityRepositoryRestoreTest.javaopenmetadata-service/src/test/java/.../util/AsyncServiceTest.javaopenmetadata-sdk/src/test/java/.../fluent/RestoreFluentAPITest.javaingestion/tests/unit/sdk/test_restore_async.pyingestion/tests/unit/test_ometa_restore.pyBackend integration tests
openmetadata-integration-tests/for new/changed API endpoints.openmetadata-integration-tests/src/test/java/.../tests/RestoreHierarchyIT.javaIngestion integration tests
Playwright (UI) tests
Manual testing performed
mvn -pl openmetadata-service,openmetadata-sdk -DskipTests compile— clean build.mvn -pl openmetadata-integration-tests test-compile— clean build.mvn -pl openmetadata-service test -Dtest='EntityRepositoryRestoreTest,AsyncServiceTest'— 20/20 pass.mvn -pl openmetadata-sdk test -Dtest=RestoreFluentAPITest— 4/4 pass.UI screen recording / screenshots:
Not applicable.
Checklist:
Fixes <issue-number>: <short explanation>Fixes #<issue-number>above.RestoreHierarchyITbuilds the multi-level hierarchy and exercises both the bulk-sync and async paths).🤖 Generated with Claude Code
Summary by Gitar
LinkedHashSetinFeedRepository.deleteByAboutto defensively de-duplicate thread IDs across chunks during legacy feed cleanup.DISTINCTtoCollectionDAO.findByEntityIdsas a defensive measure to ensure unique thread ID retrieval regardless of future schema changes.This will update automatically on new commits.