FIXES #20341 : Add reverse ingestion Workflow cleanup to Data Retention app #26300
Siddhanttimeline wants to merge 16 commits into open-metadata:main
On each run, the app deletes reverse-ingestion workflows (workflowType = 'REVERSE_INGESTION') whose status is Successful or Failed and whose updatedAt is older than the configured retention period, in batches of 10,000.
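The selection rule above can be sketched as a plain Java predicate. This is only an illustration: the `WorkflowRow` record, the `isEligible` helper, and the "Running" status used in the demo are hypothetical names, and it assumes `updatedAt` is stored as epoch milliseconds. The actual PR applies these filters in SQL rather than in application code.

```java
import java.util.Set;
import java.util.concurrent.TimeUnit;

class RetentionEligibilitySketch {
  // Hypothetical minimal view of a workflow row; fields mirror the filters in the description.
  record WorkflowRow(String workflowType, String status, long updatedAt) {}

  // Only finished workflows are candidates for deletion.
  static final Set<String> TERMINAL_STATUSES = Set.of("Successful", "Failed");

  // True when the row matches all three filters: type, terminal status, and age.
  static boolean isEligible(WorkflowRow row, long cutoffTs) {
    return "REVERSE_INGESTION".equals(row.workflowType())
        && row.status() != null
        && TERMINAL_STATUSES.contains(row.status())
        && row.updatedAt() < cutoffTs;
  }

  public static void main(String[] args) {
    // Assumption: retention is expressed in days and timestamps in epoch milliseconds.
    long cutoff = System.currentTimeMillis() - TimeUnit.DAYS.toMillis(30);
    WorkflowRow oldFailed = new WorkflowRow("REVERSE_INGESTION", "Failed", cutoff - 1);
    WorkflowRow oldRunning = new WorkflowRow("REVERSE_INGESTION", "Running", cutoff - 1);
    System.out.println(isEligible(oldFailed, cutoff));  // true
    System.out.println(isEligible(oldRunning, cutoff)); // false
  }
}
```

Restricting deletion to terminal statuses means a workflow that is still in progress is never removed, however old its last update is.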
Hi there 👋 Thanks for your contribution! The OpenMetadata team will review the PR shortly! Once it has been labeled as Let us know if you need any help!
…' into feature/workflow-retention-20341
🔴 Playwright Results — 1 failure(s), 20 flaky
✅ 3642 passed · ❌ 1 failed · 🟡 20 flaky · ⏭️ 111 skipped
Genuine Failures (failed on all attempts) ❌
…' into feature/workflow-retention-20341
Code Review ✅ Approved 3 resolved / 3 findings
Integrates reverse ingestion workflow cleanup into the Data Retention app, resolving potential replication failures, missing configuration fields, and redundant test files.
✅ 3 resolved
✅ Edge Case: MySQL DELETE with ORDER BY may fail under row-based replication
✅ Quality: Empty test file added with no actual tests
✅ Bug: NPE on existing configs missing new retention period fields

Describe your changes:
Fixes #20341
Data Retention – Workflow Cleanup for Reverse Metadata
Summary
This change extends the Data Retention application to automatically clean up Workflow entities created by reverse metadata (reverse ingestion) workflows, preventing unbounded growth of these records.
Changes
Retention configuration
Schema updates
- Adds reverseIngestionWorkflowRetentionPeriod (days).
- Marks reverseIngestionWorkflowRetentionPeriod as required, with a default of 30 days.
Default app configuration
- auditLogRetentionPeriod: 90
- reverseIngestionWorkflowRetentionPeriod: 30
Backend cleanup logic
DataRetention app
- Injects CollectionDAO.WorkflowDAO into DataRetention.
- Tracks cleanup stats under the reverse_ingestion_workflows key.
- Reads the retention period from config.getReverseIngestionWorkflowRetentionPeriod().
- Adds cleanReverseIngestionWorkflows(int retentionPeriod), which uses executeWithStatsTracking("reverse_ingestion_workflows", ...) to batch-delete from automations_workflow where:
  - workflowType = 'REVERSE_INGESTION'
  - status IN ('Successful', 'Failed')
  - updatedAt is older than the cutoff.
DAO changes
- CollectionDAO.WorkflowDAO: int deleteReverseIngestionWorkflowsBeforeCutoff(long cutoffTs, int limit);
- MySQL: deletes from automations_workflow filtered by:
  - workflowType = 'REVERSE_INGESTION'
  - status IN ('Successful', 'Failed')
  - updatedAt < :cutoffTs
  - ORDER BY updatedAt LIMIT :limit for batching.
- PostgreSQL: uses a ctid subselect from automations_workflow with the same filters on:
  - workflowtype, status, updatedat
  - ORDER BY updatedat LIMIT :limit for safe batched deletion.
I worked on ... because ...
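The batched deletion described under "Backend cleanup logic" and "DAO changes" could look roughly like the sketch below. It is not the PR's code: the nested `WorkflowDAO` interface is a hypothetical stand-in that declares only the method signature quoted above, `cutoffMillis` is an invented helper, and the sketch assumes `updatedAt` is an epoch-millisecond timestamp and that the loop stops once a batch deletes nothing. The real `executeWithStatsTracking` wrapper may handle looping and stats differently.

```java
import java.util.concurrent.TimeUnit;

class ReverseIngestionCleanupSketch {
  // Batch size taken from the PR summary ("in batches of 10,000").
  static final int BATCH_SIZE = 10_000;

  // Hypothetical stand-in for CollectionDAO.WorkflowDAO; only the method named in the PR.
  interface WorkflowDAO {
    int deleteReverseIngestionWorkflowsBeforeCutoff(long cutoffTs, int limit);
  }

  // Assumption: retention period is in days, timestamps are epoch milliseconds.
  static long cutoffMillis(long nowMillis, int retentionDays) {
    return nowMillis - TimeUnit.DAYS.toMillis(retentionDays);
  }

  // Deletes in bounded batches until a batch removes nothing; returns the total deleted.
  static int cleanReverseIngestionWorkflows(WorkflowDAO dao, long nowMillis, int retentionDays) {
    long cutoff = cutoffMillis(nowMillis, retentionDays);
    int total = 0;
    int deleted;
    do {
      deleted = dao.deleteReverseIngestionWorkflowsBeforeCutoff(cutoff, BATCH_SIZE);
      total += deleted;
    } while (deleted > 0);
    return total;
  }

  public static void main(String[] args) {
    // In-memory fake DAO that pretends 25,000 rows are eligible for deletion.
    int[] remaining = {25_000};
    WorkflowDAO fake = (cutoffTs, limit) -> {
      int n = Math.min(limit, remaining[0]);
      remaining[0] -= n;
      return n;
    };
    int total = cleanReverseIngestionWorkflows(fake, System.currentTimeMillis(), 30);
    System.out.println(total); // prints 25000
  }
}
```

Bounding each statement with a limit keeps individual transactions short, which is the usual reason for batching deletes on a table that may have accumulated many rows.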
Type of change:
Checklist:
Fixes <issue-number>: <short explanation>