fix(ingest/powerbi): apply lowercase normalization to all upstream lineage URN code paths #16879

Merged
aviraj-gour merged 3 commits into master from powerbi/lineage_case on Apr 8, 2026

@aviraj-gour
Contributor

Summary

  • Fix inconsistent URN casing for PowerBI upstream lineage to BigQuery (and other platforms) when convert_lineage_urns_to_lowercase=True (the default)
  • The standard M-Query accessor path correctly lowercased URNs via make_urn(), but the native SQL query path (Value.NativeQuery(...)) skipped lowercasing: the shared SQL parser returns URNs in their original casing for BigQuery, which is listed as case-sensitive in PLATFORMS_WITH_CASE_SENSITIVE_TABLES
  • Apply lineage_urn_to_lowercase() at the consumption point in extract_lineage() and make_fine_grained_lineage_class(), catching URNs from all code paths in one place
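
The idea of normalizing at the consumption point can be illustrated with a minimal sketch. This is not the PR's code: `lowercase_dataset_name` and `normalize_upstream_urns` are hypothetical helpers, URNs are assumed to be plain strings in the usual `urn:li:dataset:(platform,name,env)` shape, and the real `lineage_urn_to_lowercase()` may have a different signature and behavior.

```python
from typing import Iterable, List

DATASET_PREFIX = "urn:li:dataset:("


def lowercase_dataset_name(urn: str) -> str:
    """Lowercase only the name part of a dataset URN string (hypothetical helper).

    The platform and environment parts are left untouched, and strings that do
    not look like dataset URNs are returned unchanged.
    """
    if not (urn.startswith(DATASET_PREFIX) and urn.endswith(")")):
        return urn
    inner = urn[len(DATASET_PREFIX):-1]
    if inner.count(",") < 2:
        return urn
    platform, rest = inner.split(",", 1)
    name, env = rest.rsplit(",", 1)
    return f"{DATASET_PREFIX}{platform},{name.lower()},{env})"


def normalize_upstream_urns(urns: Iterable[str], to_lowercase: bool = True) -> List[str]:
    """Apply one normalization step to URNs regardless of which parser produced them."""
    if not to_lowercase:
        return list(urns)
    return [lowercase_dataset_name(urn) for urn in urns]
```

Because the normalization runs where the URNs are consumed, a URN that arrives from the native SQL path with its original casing ends up identical to the one produced by the accessor path.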

github-actions Bot added the "ingestion" label ("PR or Issue related to the ingestion of metadata") on Apr 1, 2026
@github-actions
Contributor

github-actions Bot commented Apr 1, 2026

Linear: ING-2149

@codecov

codecov Bot commented Apr 1, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ All tests successful. No failed tests found.

@datahub-connector-tests

datahub-connector-tests Bot commented Apr 1, 2026

Connector Tests Results

All connector tests passed for commit 53b5baa

To skip connector tests, add the skip-connector-tests label (org members only).

Autogenerated by the connector-tests CI pipeline.

maggiehays added the "needs-review" label ("Label for PRs that need review from a maintainer.") on Apr 1, 2026
Contributor

@askumar27 askumar27 left a comment

Good bug fix!

# remove_special_characters must run first to expand #(lf) → \n before
# remove_drop_statement applies line-anchored patterns (USE, GO, SET, etc.)
query = native_sql_parser.remove_special_characters(query)
query = native_sql_parser.remove_drop_statement(query)

Contributor

Good catch on the ordering!
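
Why the ordering matters can be shown with a small, self-contained sketch. The two functions below are hypothetical stand-ins, not the actual `native_sql_parser` helpers, and the regular expression is an assumed example of a line-anchored pattern. `#(lf)` is the M-Query escape for a line feed.

```python
import re


def expand_special_characters(query: str) -> str:
    """Expand M-Query escape tokens into real characters (hypothetical stand-in)."""
    return query.replace("#(lf)", "\n").replace("#(tab)", "\t")


# Line-anchored pattern: matches USE/GO/SET statements only at the start of a line.
_LINE_ANCHORED = re.compile(
    r"^\s*(USE|GO|SET)\b[^\n]*\n?", re.IGNORECASE | re.MULTILINE
)


def strip_line_anchored_statements(query: str) -> str:
    """Remove whole lines that begin with USE, GO, or SET (hypothetical stand-in)."""
    return _LINE_ANCHORED.sub("", query)


raw = "SELECT 1#(lf)GO#(lf)SELECT id FROM orders"

# Expand first: GO now sits at the start of its own line and is removed.
expanded_first = strip_line_anchored_statements(expand_special_characters(raw))

# Strip first: the raw text is a single line starting with SELECT, so GO survives.
stripped_first = expand_special_characters(strip_line_anchored_statements(raw))
```

With this input, `expanded_first` contains only the two SELECT lines, while `stripped_first` still contains the `GO` line, because the line-anchored pattern never saw a line break before it.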

Contributor

@askumar27 askumar27 left a comment

We will need to improve test coverage for this fix.

else column_ref.column,
)
for column_ref in cll_info.upstreams
if column_ref.column

Contributor

ColumnRef.column is typed as str (non-optional), so if empty values occur, it indicates an upstream parsing bug that should be surfaced, not masked.
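
The trade-off in that comment can be sketched as follows. The `ColumnRef` dataclass here is a simplified, hypothetical stand-in for the real type, and both functions are illustrative only; the thread does not show which behavior the final code adopted.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class ColumnRef:
    """Simplified stand-in: column is a non-optional str, as in the comment."""
    table: str
    column: str


def upstream_columns_masking(refs: Iterable[ColumnRef]) -> List[str]:
    """Silently skip empty column names, which hides an upstream parsing bug."""
    return [ref.column for ref in refs if ref.column]


def upstream_columns_strict(refs: Iterable[ColumnRef]) -> List[str]:
    """Raise on an empty column name so the upstream bug is surfaced."""
    columns = []
    for ref in refs:
        if not ref.column:
            raise ValueError(f"empty column name in upstream reference for {ref.table!r}")
        columns.append(ref.column)
    return columns
```

The masking variant returns a shorter list with no indication that anything was dropped, whereas the strict variant fails loudly at the point where the invalid value is first consumed.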

aviraj-gour force-pushed the powerbi/lineage_case branch from 0fe9b7a to 53b5baa on April 8, 2026 13:59
aviraj-gour merged commit 20b4590 into master on Apr 8, 2026
49 checks passed
aviraj-gour deleted the powerbi/lineage_case branch on April 8, 2026 16:34
alokr-dhub pushed a commit that referenced this pull request on Apr 23, 2026

Labels

ingestion (PR or Issue related to the ingestion of metadata), pending-submitter-merge

3 participants