Feat(lineage): add support for UNPIVOT #7729

Merged: georgesittas merged 1 commit into main from jo/unpivot_lineage on Jun 10, 2026

@georgesittas (Collaborator)

Fixes #7727

Copilot AI (Contributor) left a comment

Pull request overview

Adds lineage tracing support for UNPIVOT so that lineage() resolves UNPIVOT-generated value/name columns back to the source columns listed in the IN (...) clause (fixing the phantom-column behavior described in #7727).

Changes:

  • Extend lineage pivot-handling logic to cover UNPIVOT by mapping UNPIVOT output columns to their IN (...) source columns.
  • Refactor existing PIVOT column mapping logic into a shared helper (_pivot_column_mapping).
  • Add lineage tests for UNPIVOT, including coverage for UNPIVOT over a CTE.
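
As a rough illustration of the first change, the sketch below models the UNPIVOT output-to-source mapping with plain strings. It is not sqlglot's API: `expand_unpivot_column` and the columns `math`, `reading`, and `student_id` are hypothetical, and only `score` and `metric_name` come from the tests described in this PR.

```python
# Toy model only: column names are plain strings, not sqlglot expressions.
# Assumed query: SELECT * FROM t UNPIVOT(score FOR metric_name IN (math, reading))


def expand_unpivot_column(column: str, mapping: dict[str, list[str]]) -> list[str]:
    """Return the source columns behind `column`.

    Columns produced by UNPIVOT expand to the IN (...) source columns;
    any other column passes through unchanged.
    """
    return mapping.get(column, [column])


# Both the value column and the name column derive from the IN-list columns.
unpivot_mapping = {
    "score": ["math", "reading"],
    "metric_name": ["math", "reading"],
}

score_sources = expand_unpivot_column("score", unpivot_mapping)
passthrough_sources = expand_unpivot_column("student_id", unpivot_mapping)
```

Without such a mapping, a tracer would look for a column named `score` on the input relation, which is the phantom-column behavior described in #7727.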

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| tests/test_lineage.py | Adds regression tests asserting that score/metric_name trace to the IN (...) source columns, including through a CTE. |
| sqlglot/lineage.py | Implements shared (UN)PIVOT output-to-source column mapping and uses it during lineage expansion (including a CTE tracing workaround). |

Comment thread sqlglot/lineage.py
Comment on lines +469 to +488
def _pivot_column_mapping(pivot: exp.Pivot) -> dict[str, list[exp.Column]]:
    """Map each (UN)PIVOT output column name to the source columns it's derived from."""
    mapping: dict[str, list[exp.Column]] = {}

    if pivot.unpivot:
        # UNPIVOT(val FOR name IN (a, b)): both the value column(s) and the name column
        # are derived from the IN-list source columns
        unpivot_columns = [
            col
            for field in pivot.fields
            for e in field.expressions
            for col in e.find_all(exp.Column)
        ]
        for value_column in pivot.expressions:
            for identifier in value_column.find_all(exp.Identifier):
                mapping[identifier.name] = unpivot_columns
        for field in pivot.fields:
            if isinstance(field, exp.In):
                mapping[field.this.name] = unpivot_columns
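
For readers unfamiliar with the expression tree, the UNPIVOT branch in this excerpt can be approximated with a stdlib-only toy. `ToyUnpivot` and `toy_unpivot_mapping` are hypothetical stand-ins for `exp.Pivot` and `_pivot_column_mapping`, and the example column names are assumptions; the real helper walks sqlglot expressions rather than strings.

```python
from dataclasses import dataclass


@dataclass
class ToyUnpivot:
    """Hypothetical flattened view of UNPIVOT(values FOR name IN (in_columns))."""

    value_columns: list[str]  # e.g. ["score"]
    name_column: str          # e.g. "metric_name"
    in_columns: list[str]     # e.g. ["math", "reading"]


def toy_unpivot_mapping(unpivot: ToyUnpivot) -> dict[str, list[str]]:
    """Map every UNPIVOT output column to the IN-list source columns."""
    sources = list(unpivot.in_columns)
    # Each value column is derived from all IN-list columns.
    mapping = {value: sources for value in unpivot.value_columns}
    # The name column is derived from the same IN-list columns.
    mapping[unpivot.name_column] = sources
    return mapping


example = ToyUnpivot(
    value_columns=["score"],
    name_column="metric_name",
    in_columns=["math", "reading"],
)
example_mapping = toy_unpivot_mapping(example)
```

As in the excerpt, the value and name columns share one source list, so every output column of the UNPIVOT resolves to the same set of inputs.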

@georgesittas merged commit 74bef2f into main on Jun 10, 2026
9 checks passed
@georgesittas deleted the jo/unpivot_lineage branch on June 10, 2026 at 14:00
@github-actions (Contributor)

SQLGlot Integration Test Results

✅ All tests passed

Comparing:

  • this branch (sqlglot:jo/unpivot_lineage @ sqlglot 5e5491b)
  • baseline (main @ sqlglot 9e4b3d1)

Overall

main: 192441 total, 153536 passed (pass rate: 79.8%)

sqlglot:jo/unpivot_lineage: 180247 total, 142391 passed (pass rate: 79.0%)

Transitions:
No change

Dialect pair changes: 0 previous results not found, 3 current results not found

Development

Successfully merging this pull request may close these issues.

lineage() does not trace UNPIVOT value/name columns to their source columns (returns a phantom column on the input relation)

3 participants