feat!: add id-based outer reference resolution for DAG plans #1031
Merged
9 commits
65a49a8 Add id-based outer reference resolution for DAG plans (yongchul)
4c94434 Apply suggestions from code review (yongchul)
cc1f7cd Changes to oneof. update example. resolved PR feedback. (yongchul)
649c5ef Consistently use gte 1 (yongchul)
e005437 Merge branch 'main' into correlated_reference (yongchul)
bf65fdc Apply suggestions from code review (yongchul)
0471b8c Address PR feedback (yongchul)
11560b9 Apply suggestions from code review (yongchul)
0981644 rename id to rel_anchor. id_reference to rel_reference. (yongchul)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,75 +1,98 @@ | ||
| # Subqueries | ||
|
|
||
| Subqueries are scalar expressions composed of another query. | ||
|
|
||
| ## Forms | ||
|
|
||
| ### Scalar | ||
|
|
||
| Scalar subqueries are subqueries that return one row and one column. | ||
|
|
||
| | Property | Description | Required | | ||
| | -------- | -------------- | -------- | | ||
| | Input | Input relation | Yes | | ||
|
|
||
| ### `IN` predicate | ||
|
|
||
| An `IN` subquery predicate checks that the left expression is contained in the | ||
| right subquery. | ||
|
|
||
| #### Examples | ||
|
|
||
| ```sql | ||
| SELECT * | ||
| FROM t1 | ||
| WHERE x IN (SELECT * FROM t2) | ||
| ``` | ||
|
|
||
| ```sql | ||
| SELECT * | ||
| FROM t1 | ||
| WHERE (x, y) IN (SELECT a, b FROM t2) | ||
| ``` | ||
|
|
||
| | Property | Description | Required | | ||
| | -------- | ------------------------------------------- | -------- | | ||
| | Needles | Expressions whose existence will be checked | Yes | | ||
| | Haystack | Subquery to check | Yes | | ||
|
|
||
| ### Set predicates | ||
|
|
||
| A set predicate is a predicate over a set of rows in the form of a subquery. | ||
|
|
||
| `EXISTS` and `UNIQUE` are common SQL spellings of these kinds of predicates. | ||
|
|
||
| | Property | Description | Required | | ||
| | --------- | ------------------------------------------ | -------- | | ||
| | Operation | The operation to perform over the set | Yes | | ||
| | Tuples | Set of tuples to check using the operation | Yes | | ||
|
|
||
| ### Set comparisons | ||
|
|
||
| A set comparison subquery is a subquery comparison using `ANY` or `ALL` operations. | ||
|
|
||
| #### Examples | ||
|
|
||
| ```sql | ||
| SELECT * | ||
| FROM t1 | ||
| WHERE x < ANY(SELECT y FROM t2) | ||
| ``` | ||
|
|
||
| | Property | Description | Required | | ||
| | --------------------- | ---------------------------------------------- | -------- | | ||
| | Reduction operation | The kind of reduction to use over the subquery | Yes | | ||
| | Comparison operation | The kind of comparison operation to use | Yes | | ||
| | Expression | Left-hand side expression to check | Yes | | ||
| | Subquery | Subquery to check | Yes | | ||
|
|
||
|
|
||
|
|
||
| ## Outer References in Subqueries | ||
|
|
||
| Subqueries may contain *outer references*, which are field references that reach | ||
| outside the subquery boundary to access records from an enclosing relation. | ||
| The `OuterReference` root type provides two resolution fields: | ||
|
|
||
| * `steps_out`: Resolves the reference by counting subquery boundaries | ||
| upward. This works correctly when the plan is a tree (each relation has a | ||
| single parent). | ||
|
|
||
| * `rel_reference`: Resolves the reference by naming the binding relation | ||
| via its plan-wide unique `RelCommon.rel_anchor`. Must be used instead of | ||
| `steps_out` when an outer reference appears inside a relation shared via | ||
| `ReferenceRel` and that shared relation can be reached through multiple | ||
| paths with different subquery depths, making `steps_out` ambiguous. | ||
|
|
||
|
|
||
| Exactly one of these fields must be set. See | ||
| [Field References — Outer References](field_references.md#outer-references) | ||
| for details. | ||
|
|
||
| === "Protobuf Representation" | ||
|
|
||
| ```proto | ||
| %%% proto.message.Expression.Subquery %%% | ||
| ``` |
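The "exactly one of these fields must be set" rule above can be illustrated with a small check. This is a minimal Python sketch, not code from the project: the helper name `outer_reference_kind` is hypothetical, and the two parameters only mirror the `steps_out` and `rel_reference` fields described in the text.

```python
from typing import Optional


def outer_reference_kind(steps_out: Optional[int] = None,
                         rel_reference: Optional[int] = None) -> str:
    """Return the name of the single resolution field that is set.

    Hypothetical helper: raises ValueError when neither or both fields
    are set, mirroring the "exactly one" rule from the documentation.
    """
    candidates = (("steps_out", steps_out), ("rel_reference", rel_reference))
    set_fields = [name for name, value in candidates if value is not None]
    if len(set_fields) != 1:
        raise ValueError(
            f"exactly one of steps_out/rel_reference must be set, got {set_fields}"
        )
    return set_fields[0]


kind_tree = outer_reference_kind(steps_out=1)      # offset-based resolution
kind_dag = outer_reference_kind(rel_reference=7)   # id-based resolution
```

A consumer that accepts both resolution styles would likely branch on this result before walking the plan.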
39 changes: 39 additions & 0 deletions
site/examples/proto-textformat/field_reference/outer_reference_rel_reference.textproto
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,39 @@ | ||
| # Outer reference using rel_reference (id-based resolution) | ||
| # | ||
| # Scenario: A shared relation (via ReferenceRel) contains a correlated | ||
| # filter that references a column from an enclosing relation. Because the | ||
| # shared relation can be reached through multiple paths of different | ||
| # depths, offset-based resolution (steps_out) would be ambiguous. | ||
| # rel_reference resolves this by naming the binding relation directly. | ||
| # | ||
| # Plan structure: | ||
| # | ||
| # PlanRel.relations[0].rel: (shared relation "x") | ||
| # FilterRel(col > outer_ref(rel_reference=7, position 0)) | ||
| # └── ReadRel(tableB) | ||
| # | ||
| # PlanRel.relations[1].root: | ||
| # ProjectRel [rel_anchor=7] <-- binding relation | ||
| # ├── ReadRel(tableA) | ||
| # └── Subquery.Scalar | ||
| # └── SetRel(MINUS_PRIMARY) | ||
| # ├── ProjectRel | ||
| # │ └── Subquery.Scalar | ||
| # │ └── ReferenceRel(0) (depth 2 from binding) | ||
| # └── ReferenceRel(0) (depth 1 from binding) | ||
| # | ||
| # Both ReferenceRel nodes point to the same shared relation, but they | ||
| # sit at different depths. rel_reference = 7 unambiguously resolves | ||
| # to the ProjectRel whose RelCommon.rel_anchor = 7, regardless of which | ||
| # path is taken. | ||
| # | ||
| # message Expression.FieldReference | ||
|
|
||
| outer_reference: { | ||
| rel_reference: 7 # Refers to the relation with RelCommon.rel_anchor = 7 | ||
| } | ||
| direct_reference: { | ||
| struct_field: { | ||
| field: 0 # First column of the binding relation (tableA.a) | ||
| } | ||
| } |
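To make the ambiguity in this example concrete, the plan structure from the comment can be modeled in a few lines of Python. This is only a sketch under stated assumptions: the `Rel` class and the helpers `reference_depths` and `index_anchors` are hypothetical illustrations, not part of any Substrait library. It shows that the two `ReferenceRel` nodes sit at different subquery depths, while a lookup by `rel_anchor` reaches the same binding relation from either path.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Rel:
    """Hypothetical, simplified relation node for illustration only."""
    kind: str
    rel_anchor: Optional[int] = None
    inputs: list["Rel"] = field(default_factory=list)      # ordinary child relations
    subqueries: list["Rel"] = field(default_factory=list)  # children behind a subquery boundary


def reference_depths(rel, depth=0):
    """Yield the subquery depth of every ReferenceRel reachable from `rel`."""
    if rel.kind == "ReferenceRel":
        yield depth
    for child in rel.inputs:
        yield from reference_depths(child, depth)
    for child in rel.subqueries:
        yield from reference_depths(child, depth + 1)  # crossing a boundary adds one


def index_anchors(rel, index=None):
    """Map each rel_anchor to its relation, rejecting duplicate anchors."""
    index = {} if index is None else index
    if rel.rel_anchor is not None:
        if rel.rel_anchor in index:
            raise ValueError(f"duplicate rel_anchor {rel.rel_anchor}")
        index[rel.rel_anchor] = rel
    for child in rel.inputs + rel.subqueries:
        index_anchors(child, index)
    return index


# The root plan from the comment above: ProjectRel [rel_anchor=7] is the binding relation.
binding = Rel(
    "ProjectRel",
    rel_anchor=7,
    inputs=[Rel("ReadRel")],
    subqueries=[
        Rel("SetRel", inputs=[
            Rel("ProjectRel", subqueries=[Rel("ReferenceRel")]),
            Rel("ReferenceRel"),
        ]),
    ],
)

depths = sorted(reference_depths(binding))
resolved = index_anchors(binding)[7]
print(depths)               # [1, 2]: no single steps_out value fits both paths
print(resolved is binding)  # True: the anchor lookup does not depend on the path
```

Because the shared relation is stored once but reached at depths 1 and 2, any single `steps_out` value would be wrong for one of the paths, whereas the anchor lookup is path-independent.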
31 changes: 31 additions & 0 deletions
site/examples/proto-textformat/field_reference/outer_reference_steps_out.textproto
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,31 @@ | ||
| # Outer reference using steps_out (offset-based resolution) | ||
| # | ||
| # Scenario: A correlated scalar subquery where the inner filter references | ||
| # a column from the outer query. | ||
| # | ||
| # SQL equivalent: | ||
| # SELECT * | ||
| # FROM orders -- outer relation | ||
| # WHERE amount > ( | ||
| # SELECT AVG(amount) -- scalar subquery | ||
| # FROM orders AS o2 | ||
| # WHERE o2.customer_id = orders.customer_id -- outer reference | ||
| # ) | ||
| # | ||
| # The outer reference `orders.customer_id` is one subquery boundary up, | ||
| # so steps_out = 1. The referenced field is at position 0 (customer_id) | ||
| # in the outer relation's output. | ||
| # | ||
| # steps_out works here because the plan is a tree (each relation has | ||
| # exactly one parent), so the path to the binding relation is unambiguous. | ||
| # | ||
| # message Expression.FieldReference | ||
|
|
||
| outer_reference: { | ||
| steps_out: 1 # One subquery boundary up to the enclosing relation | ||
| } | ||
| direct_reference: { | ||
| struct_field: { | ||
| field: 0 # First column of the outer relation (customer_id) | ||
| } | ||
| } |
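Offset-based resolution in a tree-shaped plan can be sketched as indexing into the chain of enclosing relations. The Python below is an illustration only: `Scope` and `resolve_steps_out` are hypothetical names, the `amount` column position is an assumption (the example only states that `customer_id` is at position 0), and the `steps_out >= 1` guard is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass
class Scope:
    """Hypothetical stand-in for one enclosing relation's output columns."""
    name: str
    columns: list[str]


def resolve_steps_out(enclosing, steps_out, field):
    """Resolve an outer reference by counting subquery boundaries upward.

    `enclosing` is ordered innermost-first: enclosing[0] is the relation
    just outside the nearest subquery boundary.
    """
    if steps_out < 1:
        raise ValueError("steps_out must be >= 1 (assumed constraint)")
    if steps_out > len(enclosing):
        raise ValueError("steps_out reaches past the outermost relation")
    scope = enclosing[steps_out - 1]
    return f"{scope.name}.{scope.columns[field]}"


# The outer `orders` relation from the SQL above; only position 0 is given by the example.
outer_orders = Scope("orders", ["customer_id", "amount"])
resolved = resolve_steps_out([outer_orders], steps_out=1, field=0)
print(resolved)  # orders.customer_id
```

This works only because a tree gives each subquery exactly one chain of enclosing relations; with a shared relation there would be several chains to choose from.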