Commit 2232ddd
feat: Support RIGHT/FULL joins in NLJ memory-limited execution

Previously, RIGHT/FULL/RIGHT SEMI/RIGHT ANTI/RIGHT MARK joins were
excluded from the memory-limited (multi-pass) fallback path because
they need to track which right rows have been matched across all
left chunks. Under memory pressure, these joins would fail with an
out-of-memory error instead of spilling.

Now all join types support the fallback:

- A global right bitmap (Vec<BooleanBuffer>, indexed by right batch
  sequence number) accumulates matches across all left chunk passes.
  ReplayableStreamSource guarantees consistent batch boundaries
  across passes, so batch sequence numbers are stable.
- In memory-limited mode, EmitRightUnmatched merges the current
  batch's bitmap into the global accumulator (bitwise OR) instead
  of emitting unmatched rows immediately.
- After the last left chunk, a new state EmitGlobalRightUnmatched
  replays the right side one more time and uses the accumulated
  bitmap to emit unmatched right rows correctly.

Single-pass behavior is unchanged: the global bitmap path is only
active when is_memory_limited() is true.
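
The multi-pass bookkeeping described above can be sketched in plain
Rust. This is a simplified illustration, not the code in this commit:
GlobalRightBitmap, merge, unmatched_rows and right_unmatched_multi_pass
are hypothetical names, Vec<bool> stands in for BooleanBuffer, and an
equality check on i32 keys stands in for the join filter.

```rust
/// Hypothetical stand-in for the global right bitmap: one Vec<bool> per
/// right batch, indexed by that batch's sequence number.
#[derive(Debug, Default)]
pub struct GlobalRightBitmap {
    per_batch: Vec<Vec<bool>>,
}

impl GlobalRightBitmap {
    /// OR the matches observed for `batch_seq` during one left-chunk pass
    /// into the accumulator (the memory-limited EmitRightUnmatched step).
    pub fn merge(&mut self, batch_seq: usize, matched: &[bool]) {
        if self.per_batch.len() <= batch_seq {
            self.per_batch.resize_with(batch_seq + 1, Vec::new);
        }
        let acc = &mut self.per_batch[batch_seq];
        if acc.is_empty() {
            acc.resize(matched.len(), false);
        }
        // Relies on the replayed right side yielding identical batch
        // boundaries on every pass.
        assert_eq!(acc.len(), matched.len(), "batch boundaries changed");
        for (a, m) in acc.iter_mut().zip(matched) {
            *a |= *m;
        }
    }

    /// Row indices of `batch_seq` that no left chunk ever matched.
    pub fn unmatched_rows(&self, batch_seq: usize, num_rows: usize) -> Vec<usize> {
        match self.per_batch.get(batch_seq) {
            Some(acc) if !acc.is_empty() => (0..acc.len()).filter(|&i| !acc[i]).collect(),
            // Nothing was ever merged for this batch: every row is unmatched.
            _ => (0..num_rows).collect(),
        }
    }
}

/// Toy driver: replay the right batches once per left chunk, accumulate
/// matches, then replay once more to emit the unmatched right values.
pub fn right_unmatched_multi_pass(
    left_chunks: &[Vec<i32>],
    right_batches: &[Vec<i32>],
) -> Vec<i32> {
    let mut global = GlobalRightBitmap::default();
    for chunk in left_chunks {
        for (seq, batch) in right_batches.iter().enumerate() {
            let matched: Vec<bool> = batch.iter().map(|r| chunk.contains(r)).collect();
            global.merge(seq, &matched);
        }
    }
    // Final replay, analogous to the EmitGlobalRightUnmatched state.
    let mut out = Vec::new();
    for (seq, batch) in right_batches.iter().enumerate() {
        for row in global.unmatched_rows(seq, batch.len()) {
            out.push(batch[row]);
        }
    }
    out
}
```

With left chunks [1, 2] and [3] and right batches [1, 3, 5] and [6, 2],
the sketch returns [5, 6]. Emitting unmatched rows after each pass
would instead report 3 as unmatched after the first chunk and 1 and 2
after the second, which is why the per-pass bitmaps are merged and
emission is deferred to the final replay.
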
Co-authored-by: Claude Code
1 parent d59bc72 commit 2232ddd
2 files changed
Lines changed: 454 additions & 57 deletions
File tree
- datafusion
- physical-plan/src/joins
- sqllogictest/test_files