
Fixes #26737: Remove stale dbt automated tags #27368

Open
mohitjeswani01 wants to merge 7 commits into open-metadata:main from mohitjeswani01:fix/26737-dbt-automated-tag-removal

Conversation

@mohitjeswani01

@mohitjeswani01 mohitjeswani01 commented Apr 14, 2026

Description

Fixes #26737

When tags were removed from a dbt schema.yml, the corresponding AUTOMATED tags (labelType=Automated, appliedBy="ingestion-bot") on tables and columns were never cleaned up, so they persisted indefinitely.

I changed TableRepository.addDataModel so that:

  • For DBT data models only, it drops AUTOMATED tags applied by ingestion-bot that are no longer present in the incoming dbt tags.
  • The same logic is applied at both table and column level.
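
The rule described in the two bullets above can be illustrated with a minimal, self-contained sketch. This is not the code from the PR: the `Tag` record, the `isDbtModel` flag, and the `removeStaleAutomatedTags` signature are hypothetical simplifications chosen for illustration, and the real `TableRepository.addDataModel` works on OpenMetadata's own tag types.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

/** Hypothetical, simplified sketch of the cleanup rule; not the actual TableRepository code. */
final class StaleDbtTagSketch {

  /** Simplified stand-in for a tag label: fully qualified name, label type, and who applied it. */
  record Tag(String tagFqn, String labelType, String appliedBy) {}

  static final String AUTOMATED = "Automated";
  static final String INGESTION_BOT = "ingestion-bot";

  /**
   * Returns the existing tags minus those that are AUTOMATED, applied by ingestion-bot,
   * and absent from the incoming dbt tags. Non-dbt models are returned unchanged.
   */
  static List<Tag> removeStaleAutomatedTags(
      boolean isDbtModel, List<Tag> existing, List<Tag> incomingDbtTags) {
    if (!isDbtModel) {
      return new ArrayList<>(existing); // assumption: cleanup is scoped to dbt data models only
    }
    Set<String> incomingFqns =
        incomingDbtTags.stream().map(Tag::tagFqn).collect(Collectors.toSet());
    List<Tag> kept = new ArrayList<>();
    for (Tag tag : existing) {
      boolean botAutomated =
          AUTOMATED.equals(tag.labelType()) && INGESTION_BOT.equals(tag.appliedBy());
      boolean stale = botAutomated && !incomingFqns.contains(tag.tagFqn());
      if (!stale) {
        kept.add(tag); // manual tags and still-present dbt tags are left untouched
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    List<Tag> existing =
        List.of(
            new Tag("PII.Sensitive", AUTOMATED, INGESTION_BOT), // dropped from schema.yml
            new Tag("Tier.Tier1", AUTOMATED, INGESTION_BOT), // still in schema.yml
            new Tag("Glossary.Term", "Manual", "admin")); // applied by a user
    List<Tag> incoming = List.of(new Tag("Tier.Tier1", AUTOMATED, INGESTION_BOT));
    System.out.println(removeStaleAutomatedTags(true, existing, incoming));
  }
}
```

In this sketch the same method would be called once with the table's tags and once per column with that column's tags, which mirrors the "table and column level" behaviour described above.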

I also added TableRepositoryDataModelTagTest to cover the main table/column scenarios and to make sure non‑dbt and non‑ingestion‑bot tags are left untouched.

Type of change:

  • Bug fix

Checklist:

  • I have read the CONTRIBUTING document.
  • My PR title is Fixes #26737: Remove stale dbt automated tags
  • I have commented on my code, particularly in hard-to-understand areas.
  • For JSON Schema changes: I updated the migration scripts or explained why it is not needed.
  • I have added a test that covers the exact scenario we are fixing. For complex issues, comment the issue number in the test for future reference.

Summary by Gitar

  • Refactored Table Profiler Configuration:
    • Added support for StaticSamplingConfig within TableRepository.addTableProfilerConfig.
  • Cache management:
    • Added explicit invalidateCacheForEntity call in addDataModel to ensure fresh state after tag merges.
  • Removed unused code:
    • Removed Suggestion related methods and imports from TableRepository.

This will update automatically on new commits.

Copilot AI review requested due to automatic review settings April 14, 2026 22:06
@github-actions
Contributor

Hi there 👋 Thanks for your contribution!

The OpenMetadata team will review the PR shortly! Once it has been labeled as safe to test, the CI workflows
will start executing and we'll be able to make sure everything is working as expected.

Let us know if you need any help!

Contributor

Copilot AI left a comment


Pull request overview

Fixes a dbt ingestion edge case where previously-applied AUTOMATED tags from ingestion-bot were not removed from tables/columns when the tags were deleted from dbt schema.yml.

Changes:

  • Added DBT-only cleanup in TableRepository.addDataModel to remove stale AUTOMATED tags applied by ingestion-bot at both table and column level.
  • Added a new unit test class covering stale-tag removal vs. preservation behavior for DBT vs non-DBT models.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/TableRepository.java | Adds DBT-scoped stale automated-tag cleanup during data model application for tables and columns. |
| openmetadata-service/src/test/java/org/openmetadata/service/jdbi3/TableRepositoryDataModelTagTest.java | Adds tests for the new stale-tag removal logic and related scenarios. |


@mohitjeswani01
Author

Hi @harshach

I’ve addressed all feedback from gitar-bot and Copilot (removed the duplicated helper in tests, wired tests to call TableRepository.removeStaleDbtAutomatedTags directly, cleaned up imports, and fixed the Javadoc).

Could you please add the safe-to-test label to trigger the CI 🙏

@PubChimps PubChimps added the "safe to test" label (Add this label to run secure Github workflows on PRs) Apr 16, 2026
@github-actions
Contributor

The Java checkstyle failed.

Please run mvn spotless:apply in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Java code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

@mohitjeswani01
Author

Hi @PubChimps 👋

I see Issue #26737 has been closed as completed with my PR #27368 linked.

If there are any changes required on my end, please let me know and I'll
address them.

I'm actively monitoring CI and ready to fix any failures right away.
Thank you! 🙏

@github-actions
Contributor

github-actions Bot commented Apr 16, 2026

🔴 Playwright Results — 1 failure(s), 17 flaky

✅ 3964 passed · ❌ 1 failed · 🟡 17 flaky · ⏭️ 86 skipped

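| Shard | Passed | Failed | Flaky | Skipped |
| --- | --- | --- | --- | --- |
| ✅ Shard 1 | 299 | 0 | 0 | 4 |
| 🔴 Shard 2 | 740 | 1 | 4 | 8 |
| 🟡 Shard 3 | 752 | 0 | 3 | 7 |
| 🟡 Shard 4 | 756 | 0 | 3 | 18 |
| 🟡 Shard 5 | 685 | 0 | 2 | 41 |
| 🟡 Shard 6 | 732 | 0 | 5 | 8 |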

Genuine Failures (failed on all attempts)

Features/Glossary/GlossaryWorkflow.spec.ts › should start term as Draft when glossary has reviewers (shard 2)
Error: expect(locator).toHaveText(expected) failed

Locator:  locator('[data-row-key*="DraftTerm1777482228857"]').locator('.status-badge')
Expected: "Draft"
Received: "In Review"
Timeout:  15000ms

Call log:
  - Expect "toHaveText" with timeout 15000ms
  - waiting for locator('[data-row-key*="DraftTerm1777482228857"]').locator('.status-badge')
    18 × locator resolved to <div class="status-badge inReview" data-testid=""PW%'038d8c8c.Silly32dc9d60".DraftTerm1777482228857-status">…</div>
       - unexpected value "In Review"

🟡 17 flaky test(s) (passed on retry)
  • Features/ActivityAPI.spec.ts › Activity event is created when description is updated (shard 2, 1 retry)
  • Features/ActivityAPI.spec.ts › Activity event is created when owner is added (shard 2, 1 retry)
  • Features/ColumnBulkOperations.spec.ts › should show no results when searching for nonexistent column (shard 2, 1 retry)
  • Features/DataQuality/ColumnLevelTests.spec.ts › Column Values Missing Count To Be Equal (shard 2, 1 retry)
  • Features/RTL.spec.ts › Verify Following widget functionality (shard 3, 1 retry)
  • Flow/ObservabilityAlerts.spec.ts › Alert operations for a user with and without permissions (shard 3, 1 retry)
  • Flow/PersonaFlow.spec.ts › Set default persona for team should work properly (shard 3, 1 retry)
  • Pages/CustomProperties.spec.ts › Timestamp (shard 4, 1 retry)
  • Pages/DataContractsSemanticRules.spec.ts › Validate Owner Rule Not_In (shard 4, 1 retry)
  • Pages/DataContractsSemanticRules.spec.ts › Validate DataProduct Rule Any_In (shard 4, 1 retry)
  • Pages/EntityDataConsumer.spec.ts › Tier Add, Update and Remove (shard 5, 1 retry)
  • Pages/EntityDataSteward.spec.ts › User as Owner Add, Update and Remove (shard 5, 1 retry)
  • Pages/Lineage/DataAssetLineage.spec.ts › Column lineage for dashboard -> table (shard 6, 1 retry)
  • Pages/Lineage/LineageFilters.spec.ts › Verify Impact Analysis service filter selection (shard 6, 1 retry)
  • Pages/ServiceEntity.spec.ts › Announcement create, edit & delete (shard 6, 1 retry)
  • Pages/ServiceEntity.spec.ts › Tier Add, Update and Remove (shard 6, 1 retry)
  • Pages/Users.spec.ts › Check permissions for Data Steward (shard 6, 1 retry)

📦 Download artifacts

How to debug locally
# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip    # view trace

Copilot AI review requested due to automatic review settings April 16, 2026 21:03
@mohitjeswani01 mohitjeswani01 force-pushed the fix/26737-dbt-automated-tag-removal branch from c5198aa to f3c5f1f Compare April 16, 2026 21:03
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

@mohitjeswani01
Author

@PubChimps I addressed the bot comments and also ran mvn spotless:apply. Please let me know if anything else is needed from my side 🙏

@github-actions
Contributor

The Java checkstyle failed.

Please run mvn spotless:apply in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Java code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

Copilot AI review requested due to automatic review settings April 16, 2026 21:28
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.

Copilot AI review requested due to automatic review settings April 29, 2026 15:05
@gitar-bot

gitar-bot Bot commented Apr 29, 2026

Code Review ✅ Approved 1 resolved / 1 findings

Stale dbt automated tags removed and tests updated to eliminate duplicate production logic. No issues found.

✅ 1 resolved
Quality: Test duplicates production logic instead of calling it

📄 openmetadata-service/src/test/java/org/openmetadata/service/jdbi3/TableRepositoryDataModelTagTest.java:30-43
The test class copies removeStaleDbtAutomatedTags and the merge+removal orchestration into private helper methods rather than invoking the actual TableRepository code. If the production logic is later modified (e.g., an additional condition is added), the test copy will drift silently—tests will still pass while the real behavior is untested.

Since the test is in the same package (org.openmetadata.service.jdbi3), the private method is still not accessible directly. However, you could either:

  1. Change the method visibility to package-private (remove private) so the test can call it directly, or
  2. Use reflection in the test to invoke the private method, or
  3. Extract the logic into a static utility method in a helper class that both TableRepository and the test can call.
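
To make the trade-off between these options concrete, here is a small sketch under stated assumptions. All class and method names (`TagCleanupHelper`, `RepositorySketch`, `VisibilityDemo`, `dropStale`, `countStale`) are hypothetical and exist only for illustration, and the cleanup rule is reduced to "keep only tags that are still incoming" for brevity. It shows option 3 (a shared static helper that production code delegates to) alongside option 2 (reaching a private method through reflection).

```java
import java.lang.reflect.Method;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

/** Hypothetical helper (option 3): the rule lives in one static method shared by all callers. */
final class TagCleanupHelper {
  private TagCleanupHelper() {}

  // Package-private and static, so a test in the same package can call it without reflection.
  // Simplification: every existing tag is treated as bot-applied.
  static List<String> dropStale(List<String> existingFqns, Set<String> incomingFqns) {
    return existingFqns.stream().filter(incomingFqns::contains).collect(Collectors.toList());
  }
}

/** Hypothetical stand-in for the repository: it delegates instead of re-implementing the rule. */
final class RepositorySketch {
  List<String> applyDataModel(List<String> existingFqns, Set<String> incomingFqns) {
    return TagCleanupHelper.dropStale(existingFqns, incomingFqns);
  }

  // Private method that a test could only reach through reflection (option 2).
  private static int countStale(List<String> existingFqns, Set<String> incomingFqns) {
    return existingFqns.size() - TagCleanupHelper.dropStale(existingFqns, incomingFqns).size();
  }
}

final class VisibilityDemo {
  /** Option 2: invoke the private static method reflectively; brittle if it is renamed. */
  static int countStaleViaReflection(List<String> existing, Set<String> incoming) {
    try {
      Method m = RepositorySketch.class.getDeclaredMethod("countStale", List.class, Set.class);
      m.setAccessible(true);
      return (Integer) m.invoke(null, existing, incoming);
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("reflection lookup failed", e);
    }
  }

  public static void main(String[] args) {
    List<String> existing = List.of("PII.Sensitive", "Tier.Tier1");
    Set<String> incoming = Set.of("Tier.Tier1");
    // Direct call: the test exercises exactly the code that production uses.
    System.out.println(TagCleanupHelper.dropStale(existing, incoming));
    // Reflective call: works, but the method name is only checked at run time.
    System.out.println(countStaleViaReflection(existing, incoming));
  }
}
```

The direct call is checked by the compiler, so a rename or signature change breaks the test at build time. The reflective lookup fails only at run time, which is one reason options 1 and 3 tend to be easier to maintain.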

Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.


@mohitjeswani01
Author

@PubChimps May I get a review here, or is anything required from my side? Thank you 🙏


Labels

safe to test Add this label to run secure Github workflows on PRs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

dbt ingestion cannot remove previously applied non-mutually-exclusive tags (follow-up to #26054)

3 participants