Commit f7b939c
authored
feat: clarify post_join_filter. add residual_expressions to hash join and merge join (#1044)
# Background
JoinRel, HashJoinRel, and MergeJoinRel have a field
named`post_join_filter`, causing confusion. The confusing point is
whether `post_join_filter` is part of the join predicate (i.e., two rows
are **matched** when it evaluates to `true`) or not (i.e., as the name
says, this is **post** join filter, thus `filter` evaluates after the
join operation).
Substrait
[FAQ](https://substrait.io/faq/#what-is-the-purpose-of-the-post-join-filter-field-on-join-relations)
calls out that the `post_join_filter` is part of join predicate but
@yongchul was deeply discontent about the explanation and the naming.
After lengthy discussion in the Slack channel, #807 was created. After
more back-and-forth, the original intent of `post_join_filter` was
indeed [**post** join filter, not part of join
predicate](#807 (comment)).
Also, it appears that Calcite has precisely the same name and same
semantic, post join filter. Following code point shows placing the
filter on top of the join.
https://github.com/apache/calcite/blob/0a39568b167592ded8db1128b5838982ffe264f3/core/src/main/java/org/apache/calcite/rel/rules/LoptOptimizeJoinRule.java#L553-L559
# What this changes do?
TL;DR; Make `post_join_filter` as `post_join_filter` again.
1. Documents `post_join_filter` is not join condition but semantically
filter on top of join.
2. Introduce `residual_expression` to HashJoinRel and MergeJoinRel so
that they can express non-equality join predicate -- I guess this was
the confusion who tried to shove non-equality join predicate to
`HashJoin` and `MergeJoin` somewhere and distorted the meaning without
much thought.
3. Drop `equi` from hash join and merge join documentation as with
`residual_expression` they are capable of supporting arbitrary join
predicate.
4. Fix the FAQ.
# Breaking change?
This is debatable. According to previous FAQ, `post_join_filter` was
`residual_expression` introduced in this PR. If we take the FAQ as part
of the "spec", then this is a breaking change.
If we don't consider FAQ as a spec, then this is not a breaking change
but clarification and adding a new feature -- HashJoinRel, MergeJoinRel
now supports arbitrary join predicate.
# AI disclaimer
The PR is assisted by Claud Opus 4.6.
<!-- Reviewable:start -->
- - -
This change is [<img src="https://reviewable.io/review_button.svg"
height="34" align="absmiddle"
alt="Reviewable"/>](https://reviewable.io/reviews/substrait-io/substrait/1044)
<!-- Reviewable:end -->1 parent a44716d commit f7b939c
4 files changed
Lines changed: 54 additions & 19 deletions
File tree
- proto/substrait
- site/docs
- relations
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
261 | 261 | | |
262 | 262 | | |
263 | 263 | | |
264 | | - | |
| 264 | + | |
| 265 | + | |
265 | 266 | | |
266 | 267 | | |
267 | 268 | | |
268 | 269 | | |
| 270 | + | |
| 271 | + | |
269 | 272 | | |
| 273 | + | |
| 274 | + | |
| 275 | + | |
270 | 276 | | |
271 | 277 | | |
272 | 278 | | |
| |||
837 | 843 | | |
838 | 844 | | |
839 | 845 | | |
840 | | - | |
841 | | - | |
| 846 | + | |
| 847 | + | |
| 848 | + | |
842 | 849 | | |
843 | | - | |
| 850 | + | |
| 851 | + | |
844 | 852 | | |
845 | 853 | | |
846 | 854 | | |
| |||
864 | 872 | | |
865 | 873 | | |
866 | 874 | | |
| 875 | + | |
| 876 | + | |
| 877 | + | |
867 | 878 | | |
868 | 879 | | |
869 | 880 | | |
870 | 881 | | |
871 | | - | |
| 882 | + | |
| 883 | + | |
872 | 884 | | |
| 885 | + | |
| 886 | + | |
| 887 | + | |
| 888 | + | |
| 889 | + | |
| 890 | + | |
| 891 | + | |
873 | 892 | | |
874 | 893 | | |
875 | 894 | | |
| |||
896 | 915 | | |
897 | 916 | | |
898 | 917 | | |
899 | | - | |
900 | | - | |
| 918 | + | |
| 919 | + | |
| 920 | + | |
901 | 921 | | |
902 | 922 | | |
903 | 923 | | |
| |||
923 | 943 | | |
924 | 944 | | |
925 | 945 | | |
| 946 | + | |
| 947 | + | |
| 948 | + | |
926 | 949 | | |
927 | 950 | | |
928 | 951 | | |
| 952 | + | |
| 953 | + | |
| 954 | + | |
| 955 | + | |
| 956 | + | |
| 957 | + | |
| 958 | + | |
929 | 959 | | |
930 | 960 | | |
931 | 961 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
6 | 6 | | |
7 | 7 | | |
8 | 8 | | |
9 | | - | |
| 9 | + | |
10 | 10 | | |
11 | | - | |
| 11 | + | |
| 12 | + | |
| 13 | + | |
| 14 | + | |
12 | 15 | | |
13 | 16 | | |
14 | 17 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
237 | 237 | | |
238 | 238 | | |
239 | 239 | | |
240 | | - | |
241 | | - | |
| 240 | + | |
| 241 | + | |
242 | 242 | | |
243 | 243 | | |
244 | 244 | | |
| |||
247 | 247 | | |
248 | 248 | | |
249 | 249 | | |
250 | | - | |
| 250 | + | |
251 | 251 | | |
252 | 252 | | |
253 | 253 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
2 | 2 | | |
3 | 3 | | |
4 | 4 | | |
5 | | - | |
| 5 | + | |
6 | 6 | | |
7 | | - | |
| 7 | + | |
8 | 8 | | |
9 | 9 | | |
10 | 10 | | |
| |||
14 | 14 | | |
15 | 15 | | |
16 | 16 | | |
17 | | - | |
| 17 | + | |
18 | 18 | | |
19 | 19 | | |
20 | 20 | | |
| |||
23 | 23 | | |
24 | 24 | | |
25 | 25 | | |
26 | | - | |
| 26 | + | |
| 27 | + | |
27 | 28 | | |
28 | 29 | | |
29 | 30 | | |
| |||
47 | 48 | | |
48 | 49 | | |
49 | 50 | | |
50 | | - | |
| 51 | + | |
51 | 52 | | |
52 | | - | |
| 53 | + | |
53 | 54 | | |
54 | 55 | | |
55 | 56 | | |
| |||
67 | 68 | | |
68 | 69 | | |
69 | 70 | | |
70 | | - | |
| 71 | + | |
| 72 | + | |
71 | 73 | | |
72 | 74 | | |
73 | 75 | | |
| |||
0 commit comments