fix: support aggregation with non-equality conditions in scalar subqueries #24053
mergify[bot] merged 7 commits into matrixorigin:main
…eries (matrixorigin#23942)

When a scalar subquery contains an aggregate function with non-equality correlated predicates (e.g. <, >=), the planner rejected it with NYI.

The root cause: pullupThroughAgg adds inner expressions from non-eq predicates to GROUP BY, producing multiple rows per outer row, which breaks SINGLE JOIN semantics.

Fix: bypass the inner AGG, use LEFT JOIN with all predicates applied directly to the raw inner rows, then add a new AGG on top that groups by outer columns. This way the aggregate operates on all matching rows correctly.

Also converts starcount to count(inner_col) so that NULL rows from LEFT JOIN are not counted.
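The rewrite described in this commit message can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the planner code: the tables, the predicate `b < a`, and every name below (`innerRow`, `leftJoin`, `sumByOuter`, `naiveSum`) are hypothetical, and the sketch groups by outer row position as a stand-in for "group by outer columns". It checks that LEFT JOIN followed by re-aggregation gives the same result as evaluating the subquery once per outer row.

```go
package main

import "fmt"

// innerRow is a row of a hypothetical inner table t2(b, v).
type innerRow struct{ b, v int }

// joined is one LEFT JOIN output row. inner is nil when the outer row had
// no match and the inner side was NULL-padded.
type joined struct {
	outerIdx int
	inner    *innerRow
}

// naiveSum evaluates the subquery once per outer value a:
//
//	(SELECT SUM(v) FROM t2 WHERE t2.b < a)
//
// ok is false when nothing matches, standing in for SQL NULL.
func naiveSum(a int, inner []innerRow) (sum int, ok bool) {
	for _, r := range inner {
		if r.b < a {
			sum += r.v
			ok = true
		}
	}
	return sum, ok
}

// leftJoin applies the non-equality predicate b < a to the raw inner rows
// and keeps unmatched outer rows with a nil inner side.
func leftJoin(outer []int, inner []innerRow) []joined {
	var out []joined
	for i, a := range outer {
		matched := false
		for j := range inner {
			if inner[j].b < a {
				out = append(out, joined{i, &inner[j]})
				matched = true
			}
		}
		if !matched {
			out = append(out, joined{i, nil})
		}
	}
	return out
}

// sumByOuter plays the role of the new AGG on top of the join, grouped by
// outer row position (an assumption made for this sketch).
func sumByOuter(rows []joined, nOuter int) (sums []int, valid []bool) {
	sums = make([]int, nOuter)
	valid = make([]bool, nOuter)
	for _, r := range rows {
		if r.inner == nil {
			continue // NULL-padded row contributes nothing to SUM
		}
		sums[r.outerIdx] += r.inner.v
		valid[r.outerIdx] = true
	}
	return sums, valid
}

func main() {
	outer := []int{1, 5, 10}
	inner := []innerRow{{2, 10}, {4, 20}, {7, 30}}
	sums, valid := sumByOuter(leftJoin(outer, inner), len(outer))
	for i, a := range outer {
		want, ok := naiveSum(a, inner)
		fmt.Println(a, sums[i], valid[i], want, ok)
	}
}
```

Because the predicate is applied to raw inner rows before aggregation, each outer row ends up with exactly one aggregated row, which is the property the per-row evaluation has by construction.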
Review Summary by Qodo

Support aggregation with non-equality conditions in scalar subqueries

Walkthroughs

Description
• Support aggregation with non-equality conditions in scalar subqueries
• Bypass inner AGG and use LEFT JOIN with re-aggregation on top
• Replace starcount with count(inner_col) to handle NULL rows correctly
• Convert NULL aggregate results to 0 for COUNT functions

Diagram
flowchart LR
A["Scalar Subquery<br/>with Non-Eq Agg"] --> B["Bypass Inner AGG"]
B --> C["LEFT JOIN<br/>All Predicates"]
C --> D["New AGG<br/>Group by Outer Cols"]
D --> E["Replace starcount<br/>with count"]
E --> F["Handle NULL<br/>Results"]
File Changes
1. pkg/sql/plan/build_test.go
Pull request overview
This PR fixes planning of scalar correlated aggregate subqueries that include non-equality predicates (e.g. <, >=), which previously failed with a NYI error due to a non-semantics-preserving pullupThroughAgg rewrite.
Changes:
- Adds a dedicated rewrite path for scalar aggregate subqueries with non-equality correlated predicates by bypassing the inner AGG, using a LEFT JOIN, and re-aggregating grouped by outer columns.
- Adjusts COUNT semantics under LEFT JOIN by rewriting `starcount` to `count(inner.Row_ID)` and ensuring COUNT returns `0` (not `NULL`) on no-match.
- Updates/extends tests and expected results to cover the new supported behavior and to reflect the new NYI message for derived-table outer bindings.
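The COUNT adjustment in the list above matters because a LEFT JOIN emits one NULL-padded row for every outer row without a match. A minimal sketch of the difference, assuming a hypothetical `joinedRow` type whose `innerID` stands in for the inner row identifier (the helper names are illustrative, not from the PR):

```go
package main

import "fmt"

// joinedRow is one LEFT JOIN output row; innerID is nil when the outer row
// had no match and the inner side was NULL-padded.
type joinedRow struct {
	outerKey int
	innerID  *int
}

// countStar counts every joined row per outer key, like count(*).
func countStar(rows []joinedRow) map[int]int {
	m := make(map[int]int)
	for _, r := range rows {
		m[r.outerKey]++
	}
	return m
}

// countInner counts only rows with a non-NULL inner column, like count(col).
// Every outer key still gets an entry, so a no-match group reports 0.
func countInner(rows []joinedRow) map[int]int {
	m := make(map[int]int)
	for _, r := range rows {
		if _, seen := m[r.outerKey]; !seen {
			m[r.outerKey] = 0
		}
		if r.innerID != nil {
			m[r.outerKey]++
		}
	}
	return m
}

func main() {
	id1, id2 := 101, 102
	rows := []joinedRow{
		{outerKey: 1, innerID: nil}, // outer row 1 matched nothing
		{outerKey: 2, innerID: &id1},
		{outerKey: 2, innerID: &id2},
	}
	fmt.Println(countStar(rows)[1], countInner(rows)[1])
	fmt.Println(countStar(rows)[2], countInner(rows)[2])
}
```

For the unmatched outer key, counting every row reports 1 while counting the inner column reports 0, which is the value the original correlated `COUNT` would have produced.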
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| pkg/sql/plan/flatten_subquery.go | Implements new flattening/rewrite logic for scalar aggregate subqueries with non-equality correlated predicates, plus helper functions. |
| pkg/sql/plan/build_test.go | Adds a regression plan-build test that should now pass for a non-eq correlated scalar aggregate subquery. |
| test/distributed/cases/dml/select/subquery.result | Updates golden outputs to reflect newly supported correlated scalar aggregates (and related formatting/precision expectations). |
| test/distributed/cases/hint/hint_cte.result | Updates expected NYI output message for derived-table outer bindings. |
…ery flatten

pullupThroughAgg appends pulled-up inner expressions to aggNode.GroupBy, so the previous len(aggNode.GroupBy) > 1 guard mistakenly rejected queries with two correlated predicates (one eq + one non-eq) even when the user did not write GROUP BY. Use subCtx.groups, which only holds the user's original GROUP BY, to make the guard precise.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
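The guard problem described in this commit can be reproduced with mock types. Everything below is a hypothetical stand-in rather than the real planner code: `mockAggNode`, `mockBindCtx`, and `mockPullup` only mimic the behavior the message describes, and the threshold used in `newGuard` is an assumption, since the message does not state the exact new condition.

```go
package main

import "fmt"

// mockAggNode and mockBindCtx are hypothetical stand-ins for the planner's
// aggregate node and subquery bind context.
type mockAggNode struct{ GroupBy []string }
type mockBindCtx struct{ groups []string }

// mockPullup mimics the described behavior: pulled-up inner expressions
// are appended to the aggregate node's GroupBy list.
func mockPullup(n *mockAggNode, innerExprs ...string) {
	n.GroupBy = append(n.GroupBy, innerExprs...)
}

// oldGuard is the check quoted in the commit message. It also sees the
// expressions appended by the pull-up.
func oldGuard(n *mockAggNode) bool { return len(n.GroupBy) > 1 }

// newGuard inspects only the user's own GROUP BY list.
// The "> 0" threshold is an assumption made for this sketch.
func newGuard(ctx *mockBindCtx) bool { return len(ctx.groups) > 0 }

func main() {
	// A subquery with no user-written GROUP BY and two correlated
	// predicates, one equality and one non-equality.
	ctx := &mockBindCtx{}
	agg := &mockAggNode{}
	mockPullup(agg, "t2.k", "t2.b")

	fmt.Println("old guard rejects:", oldGuard(agg))
	fmt.Println("new guard rejects:", newGuard(ctx))
}
```

The old check trips because the pull-up itself added two entries, while the list that records only the user's GROUP BY stays empty.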
Merge Queue Status
This pull request spent 56 minutes 24 seconds in the queue, including 56 minutes 7 seconds running CI.
What type of PR is this?
Which issue(s) this PR fixes:
issue #23942
What this PR does / why we need it:
When a scalar subquery contains an aggregate function with non-equality correlated predicates (e.g. `<`, `>=`), the planner rejected it with a NYI error.

Root cause: `pullupThroughAgg` adds inner expressions from non-eq predicates to GROUP BY, producing multiple rows per outer row, which breaks SINGLE JOIN semantics.

Fix: Bypass the inner AGG, use LEFT JOIN with all predicates applied directly to the raw inner rows, then add a new AGG on top that groups by outer columns. This way the aggregate operates on all matching rows correctly. Also converts `starcount` to `count(inner_col)` so that NULL rows from LEFT JOIN are not counted.
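The root cause can be seen on a toy data set. The sketch below is illustrative only: the table contents, the predicate `b < a`, and the helpers `groupByB` and `matchesPerOuter` are assumptions, not code from the PR. It groups the inner rows by the inner expression `b`, as the pulled-up GROUP BY would, and counts how many grouped rows then match a single outer value.

```go
package main

import "fmt"

// innerRow is a row of a hypothetical inner table t2(b, v).
type innerRow struct{ b, v int }

// groupByB mimics an AGG whose GROUP BY has gained the inner expression b:
// it yields one partial SUM(v) per distinct b.
func groupByB(inner []innerRow) map[int]int {
	sums := make(map[int]int)
	for _, r := range inner {
		sums[r.b] += r.v
	}
	return sums
}

// matchesPerOuter counts how many grouped rows satisfy b < a for one
// outer value a.
func matchesPerOuter(a int, groups map[int]int) int {
	n := 0
	for b := range groups {
		if b < a {
			n++
		}
	}
	return n
}

func main() {
	inner := []innerRow{{2, 10}, {4, 20}, {7, 30}}
	groups := groupByB(inner)
	// A SINGLE JOIN expects at most one match per outer row.
	fmt.Println(matchesPerOuter(5, groups))
}
```

With the outer value 5, two partial groups (`b = 2` and `b = 4`) satisfy the predicate, so the join sees two rows where a scalar subquery must supply one. Aggregating after the join, as the fix does, avoids this.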