Fixes #28065: [openlineage] Add job ownership support in ingestion pipeline #28381
Hi there 👋 Thanks for your contribution! The OpenMetadata team will review the PR shortly! Once it has been labeled as Let us know if you need any help! |
75512f1 to
12d0abd
Compare
21bfb20 to
bf415d5
Compare
bf415d5 to
8574a04
Compare
8574a04 to
50206df
Compare
The Python checkstyle failed. Please run You can install the pre-commit hooks with |
cb6474d to
b8f107b
Compare
…ntityType filter is set

On the filtered users/teams list path (ownsEntityType / directOwnsOnly), fetchAndSetFieldsExcept iterated only fieldFetchers and never called fetchAndSetRelationshipFieldsInBulk. Any field handled solely by the batched relationship fetch was therefore dropped (Team.domains came back null), and the relationship fields that the bulk layer normally batches fell back to per-field N+1 fetches.

Run the batched relationship fetch inside fetchAndSetFieldsExcept and skip both the excluded fields and the relationship-handled fields. fetchAndSetFields now delegates to it with an empty exclusion set, so the default path is unchanged.

Add a TeamResourceIT regression test: listing teams with fields=owns,domains&ownsEntityType=pipeline must keep domains populated and restrict owns to pipelines.
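The control flow described in this commit message can be sketched in Python for readers who do not want to open the Java source. This is only an illustration, not the `EntityRepository` code: the function names mirror the Java ones in snake case, while `RELATIONSHIP_FIELDS`, `bulk_fetch_relationships`, and the dictionary-based entities are hypothetical stand-ins.

```python
from typing import Callable, Dict, List, Set

Entity = Dict[str, object]

# Hypothetical: fields the batched relationship layer can fill in one pass.
RELATIONSHIP_FIELDS: Set[str] = {"domains", "owners"}


def bulk_fetch_relationships(entities: List[Entity], fields: Set[str]) -> None:
    """Stand-in for the batched fetch: fills each field for all entities at once."""
    for field in fields:
        for entity in entities:
            entity[field] = [f"{field}-of-{entity['name']}"]


def fetch_and_set_fields_except(
    entities: List[Entity],
    requested: Set[str],
    excluded: Set[str],
    field_fetchers: Dict[str, Callable[[List[Entity]], None]],
) -> None:
    # 1. Run the batched relationship fetch, minus the excluded fields.
    bulk_fields = (requested & RELATIONSHIP_FIELDS) - excluded
    bulk_fetch_relationships(entities, bulk_fields)
    # 2. Run per-field fetchers, skipping excluded fields and any field
    #    the bulk layer has already handled.
    for field, fetcher in field_fetchers.items():
        if field not in requested:
            continue
        if field in excluded or field in RELATIONSHIP_FIELDS:
            continue
        fetcher(entities)


def fetch_and_set_fields(
    entities: List[Entity],
    requested: Set[str],
    field_fetchers: Dict[str, Callable[[List[Entity]], None]],
) -> None:
    # Default path: the same routine with an empty exclusion set.
    fetch_and_set_fields_except(entities, requested, set(), field_fetchers)
```

In this sketch a caller on the filtered path would pass `excluded={"owns"}` (assuming it fills `owns` itself with the entity-type filter applied), and `domains` would still be populated by the bulk step.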
|
@jsingh-yelp pushed a Java-side fix to this branch ( Java bug fixed: the new Two Python issues worth a look:
cc @ulixius9 please check when you get time. |
|
|
Yes, append mode rebuilds existing owners from the cached Group teams and users. Non-Group team owners are excluded, but that matches the current OpenMetadata validation rules. Code ref: https://github.com/open-metadata/OpenMetadata/blob/main/openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/EntityRepository.java#L7383
This is intentional: in append mode we preload user/team ownership once so that existing pipeline owners can be preserved while merging in owners from OpenLineage events. This avoids an owner lookup on every message, which in my view should make the overall pipeline a lot faster. We can revisit a lazy/on-demand resolver if first-event cache loading becomes too expensive. |
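A minimal sketch of the preload-then-merge idea from this reply, assuming simplified data shapes. The dictionary layout (`teamType`, `owns`, `name`) and both function names are hypothetical and are not taken from the PR's Python code.

```python
from typing import Dict, Iterable, List, Tuple

OwnerRef = Tuple[str, str]  # (owner type, owner name), e.g. ("user", "jdoe")


def build_pipeline_owner_cache(
    teams: Iterable[dict], users: Iterable[dict]
) -> Dict[str, List[OwnerRef]]:
    """Build a pipeline FQN -> owners map once, before any event is processed."""
    cache: Dict[str, List[OwnerRef]] = {}
    for team in teams:
        # Only Group teams are kept as owners; other team types are skipped.
        if team.get("teamType") != "Group":
            continue
        for fqn in team.get("owns", []):
            cache.setdefault(fqn, []).append(("team", team["name"]))
    for user in users:
        for fqn in user.get("owns", []):
            cache.setdefault(fqn, []).append(("user", user["name"]))
    return cache


def merge_owners(existing: List[OwnerRef], incoming: List[OwnerRef]) -> List[OwnerRef]:
    """Append mode: keep existing owners, add new ones, drop duplicates in order."""
    return list(dict.fromkeys([*existing, *incoming]))
```

With a cache built this way, each OpenLineage event needs only a dictionary lookup plus `merge_owners`, rather than a request per message. The trade-off is the one-time cost of loading all teams and users up front.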
|
cc: @mohittilala Could you please review this PR? I see you made some recent OpenLineage changes.
|
cc: @ulixius9, @mohittilala Could I please get a review on this PR?
Code Review ✅ Approved 10 resolved / 10 findings
Integrates OpenLineage job ownership into pipeline ingestion by enabling owner resolution and configurable update modes, addressing previous build and logic issues including missing API field propagation, incorrect cache initialization, and unhandled null configurations.
✅ 10 resolved
✅ Edge Case: Owner name with colon prefix (e.g. ":jdoe") silently resolves
✅ Performance: Owner cache loads ALL teams/users, not just pipeline-owning ones
✅ Quality: New test file uses unittest.TestCase instead of pytest
✅ Bug: ownsEntityType filter added unconditionally even when null
✅ Bug: Missing base



Describe your changes:
Fixes #28065
Change Summary:
Type of change:
High-level design:
N/A — small change.
Tests:
Checklist:
Summary by Gitar
- `OwnershipResolver` to handle `None` values for `include_owners` and `ownership_update_mode` configuration parameters.
- `_ensure_pipeline_owner_cache` using `_owner_cache_loaded` flag instead of relying on nullability of dictionaries.
- `EntityRepository` methods to pass `ListFilter` into `listInternal`, `serializeJsons`, and `setFieldsInBulk` to improve context-aware entity processing.
- `None` value handling for `include_owners` and default `replace` behavior for `ownership_update_mode` in `test_openlineage_ownership.py`.
This will update automatically on new commits.
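The two Python-side points in this summary can be illustrated with a short sketch. It is not the PR's `OwnershipResolver`: the class and function below are hypothetical, and treating a `None` value for `include_owners` as "disabled" is an assumption. Only the default `replace` mode and the `_owner_cache_loaded` flag name come from the summary.

```python
from typing import Callable, Dict, List, Optional, Tuple


def normalize_ownership_config(
    include_owners: Optional[bool], ownership_update_mode: Optional[str]
) -> Tuple[bool, str]:
    """Map unset (None) config values to explicit defaults."""
    include = bool(include_owners)            # assumption: None means owners are off
    mode = ownership_update_mode or "replace"  # summary: default mode is replace
    return include, mode


class PipelineOwnerCache:
    """Loads ownership once; an explicit flag marks the cache as loaded."""

    def __init__(self, loader: Callable[[], Dict[str, List[str]]]):
        self._loader = loader
        self._owners: Dict[str, List[str]] = {}
        self._owner_cache_loaded = False

    def ensure_loaded(self) -> None:
        # Checking `if not self._owners` would reload every time the loader
        # legitimately returns an empty dict; the flag avoids that.
        if self._owner_cache_loaded:
            return
        self._owners = self._loader()
        self._owner_cache_loaded = True

    def owners_for(self, pipeline_fqn: str) -> List[str]:
        self.ensure_loaded()
        return self._owners.get(pipeline_fqn, [])
```

The flag matters because an empty dictionary is a valid loaded state (no pipeline has owners yet), so emptiness or nullability alone cannot distinguish "not loaded" from "loaded and empty".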