[Migration] Migration to Datafusion 54 - Test #137

Open
fred1268 wants to merge 13 commits into branch-54 from branch-54-test

Conversation

@fred1268

Test migration to Datafusion 54

bcmyers and others added 5 commits June 10, 2026 15:12
apache#22453) (#126)

## Which issue does this PR close?

- Closes #.

## Rationale for this change

When the Substrait consumer encounters an `Aggregate` with two identical
measures (e.g. `sum(a)` appearing twice), planning fails with `Schema
contains duplicate unqualified field name`. Substrait carries column
names at the plan root rather than on the measures themselves, so the
measures reach `Aggregate` schema construction without aliases, and
two identical expressions produce two identical field names. PR apache#20539
fixed the `NameTracker` to deduplicate names in the consumer, but
the tracker was applied only to grouping expressions, not to measures.

The planner sees:

```
field 1: (qualifier: None, name: "sum(data.a)")
field 2: (qualifier: None, name: "sum(data.a)")
```

which is rejected when constructing the Aggregate's output schema.
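
To make the failure mode concrete, here is a minimal standard-library sketch of the kind of check that rejects such a schema. The `Field` struct and `first_duplicate_unqualified` function are hypothetical names invented for illustration; they are not DataFusion's types or its actual validation code.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified stand-in for a schema field:
/// an optional qualifier plus a name. Not a DataFusion type.
#[derive(Debug, Clone, PartialEq)]
pub struct Field {
    pub qualifier: Option<String>,
    pub name: String,
}

/// Returns the first unqualified field name that appears more than once,
/// or `None` if all unqualified names are distinct.
pub fn first_duplicate_unqualified(fields: &[Field]) -> Option<String> {
    let mut seen: HashSet<&str> = HashSet::new();
    for field in fields.iter().filter(|f| f.qualifier.is_none()) {
        // `insert` returns false when the name was already present.
        if !seen.insert(field.name.as_str()) {
            return Some(field.name.clone());
        }
    }
    None
}
```

With the two fields shown above, both unqualified and both named `sum(data.a)`, this check reports `sum(data.a)` as the duplicate.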

## What changes are included in this PR?

Run the aggregate measures through the same `NameTracker` as the grouping
expressions in `from_aggregate_rel`.
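
The idea can be sketched with a small standard-library example. `SimpleNameTracker`, `unique_name`, `dedupe_aggregate_names`, and the `_N` suffix scheme are all assumptions made for illustration; the real `NameTracker` in the Substrait consumer may choose names differently.

```rust
use std::collections::HashSet;

/// Hypothetical name tracker; not DataFusion's actual `NameTracker`.
#[derive(Default)]
pub struct SimpleNameTracker {
    seen: HashSet<String>,
}

impl SimpleNameTracker {
    /// Returns `name` unchanged the first time it is seen, otherwise a
    /// variant with a numeric suffix (an arbitrary scheme for this sketch).
    pub fn unique_name(&mut self, name: &str) -> String {
        if self.seen.insert(name.to_string()) {
            return name.to_string();
        }
        let mut i = 1;
        loop {
            let candidate = format!("{}_{}", name, i);
            if self.seen.insert(candidate.clone()) {
                return candidate;
            }
            i += 1;
        }
    }
}

/// Runs grouping expression names and measure names through ONE shared
/// tracker, so a measure cannot collide with a grouping name or with
/// another measure.
pub fn dedupe_aggregate_names(groupings: &[&str], measures: &[&str]) -> Vec<String> {
    let mut tracker = SimpleNameTracker::default();
    groupings
        .iter()
        .chain(measures.iter())
        .map(|name| tracker.unique_name(name))
        .collect()
}
```

Sharing a single tracker across both lists is the point of the sketch: deduplicating only the grouping names, as before the fix, leaves identical measures with identical names.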

## Are these changes tested?

Yes -- added a roundtrip test, `aggregate_identical_measures`. Without
the fix, it fails with `Error: SchemaError(DuplicateUnqualifiedField {
name: "sum(data.a)" }, Some(""))`.

## Are there any user-facing changes?

No.

(cherry picked from commit 097efae)
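## Which issue does this PR close?

- Part of apache#21172

## Rationale for this change

Substrait support wasn't implemented in the core lambda support to
reduce PR size.

## What changes are included in this PR?

Substrait consuming and producing of higher-order functions, lambdas and
lambda variables.

## Are these changes tested?

Unit tests added to
`datafusion/substrait/tests/cases/roundtrip_logical_plan.rs`.

## Are there any user-facing changes?

None.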

---------

(cherry picked from commit 9a6f67e)
(cherry picked from commit 1ac2df1)

Co-authored-by: gstvg <28798827+gstvg@users.noreply.github.com>
Co-authored-by: Raz Luvaton <16746759+rluvaton@users.noreply.github.com>
Co-authored-by: Ben Bellick <36523439+benbellick@users.noreply.github.com>
@datadog-prod-us1-5

datadog-prod-us1-5 Bot commented Jun 10, 2026

Pipelines

⚠️ Warnings

🚦 1 Pipeline job failed

Rust | Verify Vendored Code (GitHub Actions)

🔗 Commit SHA: 93ba65a

@fred1268 fred1268 changed the title Branch 54 test [Migration] Migration to Datafusion 54 - Test Jun 10, 2026
5 participants