Commit a869ab3

Author: kongfanshen
fix: place JOIN_RIGHT_SEMI at end of JoinType enum to preserve ABI

The initial backport (commit for aa86129e1) inserted JOIN_RIGHT_SEMI in its upstream position, immediately before JOIN_RIGHT_ANTI. In the Greenplum/Cloudberry tree this shifts the integer values of the GPDB-specific join codes that follow it: JOIN_RIGHT_ANTI, JOIN_UNIQUE_OUTER/INNER and especially JOIN_DEDUP_SEMI / JOIN_DEDUP_SEMI_REVERSE. Some value-dependent code relies on those stable integer values; shifting them corrupts MPP motion planning.

Concretely, the regression query

    with ctetable as not materialized (select 1 as f1)
    select * from ctetable c1
    where f1 in (select c3.f1 from ctetable c2
                 full join ctetable c3 on true);

(which the baseline plans as a plain Hash Semi Join returning 1 row) started producing a degenerate plan with "Gather Motion 0:1" / "Redistribute Motion 1:0" and crashed with SIGSEGV in setupCdbProcessList() during dispatch.

Isolation showed the crash reproduces with ONLY the enum change applied (clean full rebuild) and disappears entirely once the pre-existing enum values are kept stable.

Move JOIN_RIGHT_SEMI to the end of the enum so every previously-defined value is unchanged. JOIN_RIGHT_SEMI is still a fully executor-supported join type; only its numeric position differs from upstream.

Verified: the crashing query now returns 1 row under both optimizers, and Hash Right Semi Join is still chosen (and correct) for the small-build-side case.
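
The value shift described above can be reproduced in isolation. The following is a minimal sketch in C, not code from the tree: the three enums are hypothetical, heavily abbreviated stand-ins for JoinType, their integer values are not the real ones, and the `old_names` lookup table is an invented example of "value-dependent code" (the commit does not say which code depends on the values).

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical, heavily abbreviated stand-ins for JoinType. The real enum
 * has many more members and different integer values.
 */

/* Layout before the backport. */
enum before { BEFORE_RIGHT_ANTI, BEFORE_DEDUP_SEMI, BEFORE_DEDUP_SEMI_REVERSE };

/* New member inserted in the upstream position, just before RIGHT_ANTI. */
enum inserted { INS_RIGHT_SEMI, INS_RIGHT_ANTI, INS_DEDUP_SEMI, INS_DEDUP_SEMI_REVERSE };

/* New member appended at the end instead. */
enum appended { APP_RIGHT_ANTI, APP_DEDUP_SEMI, APP_DEDUP_SEMI_REVERSE, APP_RIGHT_SEMI };

/* Invented value-dependent code: a table indexed by the OLD integer values. */
static const char *const old_names[] = { "RightAnti", "DedupSemi", "DedupSemiReverse" };

/* Look up a name by raw integer value; returns NULL when out of range. */
const char *old_name_for(int value)
{
    size_t n = sizeof(old_names) / sizeof(old_names[0]);

    if (value < 0 || (size_t) value >= n)
        return NULL;
    return old_names[value];
}

/* Count pre-existing members whose value changes with mid-enum insertion. */
int shifted_by_insertion(void)
{
    return ((int) INS_RIGHT_ANTI != (int) BEFORE_RIGHT_ANTI)
         + ((int) INS_DEDUP_SEMI != (int) BEFORE_DEDUP_SEMI)
         + ((int) INS_DEDUP_SEMI_REVERSE != (int) BEFORE_DEDUP_SEMI_REVERSE);
}

/* Count pre-existing members whose value changes when appending. */
int shifted_by_append(void)
{
    return ((int) APP_RIGHT_ANTI != (int) BEFORE_RIGHT_ANTI)
         + ((int) APP_DEDUP_SEMI != (int) BEFORE_DEDUP_SEMI)
         + ((int) APP_DEDUP_SEMI_REVERSE != (int) BEFORE_DEDUP_SEMI_REVERSE);
}
```

In this sketch every pre-existing member moves by one under insertion, so `old_name_for(INS_DEDUP_SEMI)` silently resolves to the wrong entry and `old_name_for(INS_DEDUP_SEMI_REVERSE)` falls off the end of the table. With the appended layout all old values, and therefore all old lookups, are unchanged.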
1 parent 5f33f44 commit a869ab3

1 file changed

Lines changed: 13 additions & 2 deletions

src/include/nodes/nodes.h

@@ -1026,7 +1026,6 @@ typedef enum JoinType
     JOIN_LASJ_NOTIN,            /* Left Anti Semi Join with Not-In semantics:
                                    If any NULL values are produced by inner side,
                                    return no join results. Otherwise, same as LASJ */
-    JOIN_RIGHT_SEMI,            /* 1 copy of each RHS row that has match(es) */
     JOIN_RIGHT_ANTI,            /* 1 copy of each RHS row that has no match */
 
     /*
@@ -1046,7 +1045,19 @@ typedef enum JoinType
      * moving the larger of the two relations.
      */
     JOIN_DEDUP_SEMI,            /* inner join, LHS path must be made unique afterwards */
-    JOIN_DEDUP_SEMI_REVERSE     /* inner join, RHS path must be made unique afterwards */
+    JOIN_DEDUP_SEMI_REVERSE,    /* inner join, RHS path must be made unique afterwards */
+
+    /*
+     * JOIN_RIGHT_SEMI (backported from upstream commit aa86129e1) is an
+     * executor-supported join type, but it is deliberately placed at the END
+     * of this enum rather than next to JOIN_RIGHT_ANTI where upstream puts it.
+     * Inserting it in the middle would shift the integer values of the
+     * GPDB-specific JOIN_DEDUP_SEMI/REVERSE (and JOIN_UNIQUE_*) codes, which
+     * breaks value-dependent code elsewhere and corrupts MPP motion planning
+     * (observed as a SIGSEGV during dispatch on semijoins over non-partitioned
+     * loci). Appending here keeps every pre-existing value stable.
+     */
+    JOIN_RIGHT_SEMI             /* 1 copy of each RHS row that has match(es) */
 
     /*
      * We might need additional join types someday.
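
The invariant this change relies on (pre-existing values stay put, new members go last) could also be checked at compile time. The sketch below is not part of this commit: `DemoJoinType` and its members are hypothetical, abbreviated names, and the pinned numbers are illustrative rather than the real JoinType values. It uses the C11 `static_assert` macro from `<assert.h>`.

```c
#include <assert.h>

/*
 * Hypothetical abbreviated enum. The numeric values are illustrative only
 * and are NOT the real JoinType values.
 */
typedef enum DemoJoinType
{
    DEMO_JOIN_RIGHT_ANTI,
    DEMO_JOIN_DEDUP_SEMI,
    DEMO_JOIN_DEDUP_SEMI_REVERSE,
    DEMO_JOIN_RIGHT_SEMI        /* appended last, as in the diff above */
} DemoJoinType;

/*
 * Pin the values that other code is assumed to depend on. Inserting a new
 * member ahead of any of these would fail the build instead of silently
 * shifting the values.
 */
static_assert(DEMO_JOIN_RIGHT_ANTI == 0, "DEMO_JOIN_RIGHT_ANTI must keep its value");
static_assert(DEMO_JOIN_DEDUP_SEMI == 1, "DEMO_JOIN_DEDUP_SEMI must keep its value");
static_assert(DEMO_JOIN_DEDUP_SEMI_REVERSE == 2,
              "DEMO_JOIN_DEDUP_SEMI_REVERSE must keep its value");
```

A guard like this only documents and enforces the ordering; it does not identify which code depends on the values.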
