fix(presto): iterate over copy in eliminate_semi_and_anti_joins#7455
Merged
georgesittas merged 1 commit intotobymao:mainfrom Apr 6, 2026
Merged
Conversation
When a SELECT has multiple SEMI or ANTI joins, eliminate_semi_and_anti_joins iterates over the joins list while calling join.pop(), which mutates the list mid-iteration. This causes every other SEMI/ANTI join to be skipped, leaving bare ANTI JOIN / SEMI JOIN syntax in the generated SQL — invalid for dialects like Presto/Trino that rely on this transform. Fix: iterate over list(...) so removals don't shift unvisited elements. Same approach as PR tobymao#4364 which fixed the identical pattern in unnest_to_explode. Made-with: Cursor
Contributor
Author
|
@georgesittas This is the same iterate-and-mutate pattern you fixed in #4364 for |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
When a
SELECThas multipleSEMIorANTIjoins,eliminate_semi_and_anti_joinsonly rewrites every other one. The remaining joins leak into the generated SQL as bareANTI JOIN/SEMI JOIN— invalid syntax for dialects like Presto/Trino.Root cause: the transform iterates over
expression.args.get("joins")while callingjoin.pop()inside the loop. After popping index i, the next element shifts into position i, but the iterator has already advanced to i+1 — so that element is never visited.Fix: iterate over
list(...)so removals don't shift unvisited elements. Same approach as #4364 which fixed the identical pattern inunnest_to_explode.Minimal reproduction
Before (bug):
t4leaks as bareANTI JOINAfter (fix): both rewritten to
NOT EXISTSChanges
sqlglot/transforms.py: iterate overlist(expression.args.get("joins") or [])instead of the original listtests/test_transforms.py: add tests for single, double, and triple ANTI/SEMI joinsMade with Cursor