expr: preserve operand errors in non-strict AND/OR reduce#37299
Open
def- wants to merge 2 commits into
Open
Conversation
antiguru
reviewed
Jun 25, 2026
13 tasks
Closes: CLU-137
Fixes two user-visible bugs in how `reduce` treats an erroring operand of
a non-strict AND/OR:
* A materialized view or query whose `mz_now()` temporal filter sits under
an OR of ANDs, e.g.
WHERE (a AND ... AND mz_now() < t) OR (b AND ... AND mz_now() < t)
could panic the compute worker (or fail to plan) with "Unsupported
temporal predicate". The shared `mz_now() < t` conjunct was left buried
inside the OR instead of being factored out, so temporal-filter
extraction failed.
* A query that should short-circuit, such as `WHERE col AND (1/0 = 1)`,
could spuriously fail at runtime (e.g. "division by zero") even on rows
where `col` is false. AND/OR are non-strict: `false AND <error>` is
`false` and `true OR <error>` is `true`, so such rows must be filtered,
not errored.
Mechanism: the generic variadic fold in `reduce` replaced a call with any
operand's literal error unconditionally, which is wrong for a function that
is not strict in errors. Fold only for the strict variadics, excluding
AND/OR and ErrorIfNull (which can absorb an operand's error at runtime).
Keeping the erroring operand then reaches `undistribute_and_or`, which
recombines operands across AND/OR's short-circuit boundary. That only
preserves error semantics for operands common to every disjunct, so skip
undistribution otherwise. A shared temporal predicate (`mz_now() < t`,
whose cast can error) is common to every disjunct, so it is still factored
out and stays extractable, unlike the reverted MaterializeInc#37049's blunt
`could_error()` skip.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Motivation
Closes: CLU-137
Fixes two user-visible bugs in how
reducetreats an erroring operand of a non-strict AND/OR:A materialized view or query whose
mz_now()temporal filter sits under an OR of ANDs, e.g.could panic the compute worker (or fail to plan) with
Unsupported temporal predicate. The sharedmz_now() < tconjunct was left buried inside the OR instead of being factored out, so temporal-filter extraction failed. This is the CLU-137 panic.A query that should short-circuit, such as
WHERE col AND (1/0 = 1), could spuriously fail at runtime (e.g.division by zero) even on rows wherecolis false. AND/OR are non-strict:false AND <error>isfalseandtrue OR <error>istrue, so such rows must be filtered, not errored.Description
The generic variadic fold in
reducereplaced a call with any operand's literal error unconditionally, which is wrong for a function that is not strict in errors. Fold only for the strict variadics, excluding AND/OR andErrorIfNull(which can absorb an operand's error at runtime — via a dominatingfalse/trueoperand, or a non-null first argument respectively).Keeping the erroring operand then reaches
undistribute_and_or, which recombines operands across AND/OR's short-circuit boundary. That only preserves error semantics for operands common to every disjunct, so undistribution is skipped otherwise. A shared temporal predicate (mz_now() < t, whose cast can error) is common to every disjunct, so it is still factored out and stays extractable — unlike the reverted #37049's bluntcould_error()skip, which disabled this factoring and caused the CLU-137 panic.Note: pure factoring
(a AND b) OR (a AND c) → a AND (b OR c)is not error-preserving in general (for NULLa,btrue,cerroring, the original errors while the factored form yields NULL), so the guard keys on whether a could-error operand is common to all disjuncts, not on the "factoring vs absorption" distinction.Verification
src/expr/src/scalar.rs(test_reduce_and_or_preserves_operand_errors,test_reduce_and_or_undistributes_common_error) — verified red on the reverted base.test/sqllogictest/error_semantics.slt: AND/OR short-circuit-over-error queries — verified red without the fix (division by zeroon rows that should be filtered).test/sqllogictest/temporal.slt: the CLU-137mz_now()OR-of-ANDs materialized view — verified red under the reverted blunt guard (Unsupported temporal predicate).mz-expr+mz-transformunit suites and an slt sweep (predicate_pushdown,predicate_reduction,fold_constants) pass with no plan changes.