Skip to content

expr: preserve operand errors in non-strict AND/OR reduce#37299

Open
def- wants to merge 2 commits into
MaterializeInc:mainfrom
def-:pr-clu-137
Open

expr: preserve operand errors in non-strict AND/OR reduce#37299
def- wants to merge 2 commits into
MaterializeInc:mainfrom
def-:pr-clu-137

Conversation

@def-

@def- def- commented Jun 25, 2026

Copy link
Copy Markdown
Contributor

Motivation

Closes: CLU-137

Fixes two user-visible bugs in how reduce treats an erroring operand of a non-strict AND/OR:

  • A materialized view or query whose mz_now() temporal filter sits under an OR of ANDs, e.g.

    WHERE (a AND ... AND mz_now() < t) OR (b AND ... AND mz_now() < t)

    could panic the compute worker (or fail to plan) with Unsupported temporal predicate. The shared mz_now() < t conjunct was left buried inside the OR instead of being factored out, so temporal-filter extraction failed. This is the CLU-137 panic.

  • A query that should short-circuit, such as WHERE col AND (1/0 = 1), could spuriously fail at runtime (e.g. division by zero) even on rows where col is false. AND/OR are non-strict: false AND <error> is false and true OR <error> is true, so such rows must be filtered, not errored.

Description

The generic variadic fold in reduce replaced a call with any operand's literal error unconditionally, which is wrong for a function that is not strict in errors. Fold only for the strict variadics, excluding AND/OR and ErrorIfNull (which can absorb an operand's error at runtime — via a dominating false/true operand, or a non-null first argument respectively).

Keeping the erroring operand then reaches undistribute_and_or, which recombines operands across AND/OR's short-circuit boundary. That only preserves error semantics for operands common to every disjunct, so undistribution is skipped otherwise. A shared temporal predicate (mz_now() < t, whose cast can error) is common to every disjunct, so it is still factored out and stays extractable — unlike the reverted #37049's blunt could_error() skip, which disabled this factoring and caused the CLU-137 panic.

Note: pure factoring (a AND b) OR (a AND c) → a AND (b OR c) is not error-preserving in general (for NULL a, b true, c erroring, the original errors while the factored form yields NULL), so the guard keys on whether a could-error operand is common to all disjuncts, not on the "factoring vs absorption" distinction.

Verification

  • Unit tests in src/expr/src/scalar.rs (test_reduce_and_or_preserves_operand_errors, test_reduce_and_or_undistributes_common_error) — verified red on the reverted base.
  • test/sqllogictest/error_semantics.slt: AND/OR short-circuit-over-error queries — verified red without the fix (division by zero on rows that should be filtered).
  • test/sqllogictest/temporal.slt: the CLU-137 mz_now() OR-of-ANDs materialized view — verified red under the reverted blunt guard (Unsupported temporal predicate).
  • mz-expr + mz-transform unit suites and an slt sweep (predicate_pushdown, predicate_reduction, fold_constants) pass with no plan changes.

@def- def- requested review from antiguru and ggevay June 25, 2026 13:09
@def- def- requested a review from a team as a code owner June 25, 2026 13:09
Comment thread src/expr/src/scalar/reduce/variadic.rs Outdated
@def- def- requested a review from antiguru June 25, 2026 17:34

@antiguru antiguru left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine to me, but I'd like to get another look from @ggevay or @michael.greenberg.

Comment thread src/expr/src/scalar.rs Outdated
@def- def- requested a review from mgree June 25, 2026 20:08
Closes: CLU-137

Fixes two user-visible bugs in how `reduce` treats an erroring operand of
a non-strict AND/OR:

* A materialized view or query whose `mz_now()` temporal filter sits under
  an OR of ANDs, e.g.

      WHERE (a AND ... AND mz_now() < t) OR (b AND ... AND mz_now() < t)

  could panic the compute worker (or fail to plan) with "Unsupported
  temporal predicate". The shared `mz_now() < t` conjunct was left buried
  inside the OR instead of being factored out, so temporal-filter
  extraction failed.

* A query that should short-circuit, such as `WHERE col AND (1/0 = 1)`,
  could spuriously fail at runtime (e.g. "division by zero") even on rows
  where `col` is false. AND/OR are non-strict: `false AND <error>` is
  `false` and `true OR <error>` is `true`, so such rows must be filtered,
  not errored.

Mechanism: the generic variadic fold in `reduce` replaced a call with any
operand's literal error unconditionally, which is wrong for a function that
is not strict in errors. Fold only for the strict variadics, excluding
AND/OR and ErrorIfNull (which can absorb an operand's error at runtime).
Keeping the erroring operand then reaches `undistribute_and_or`, which
recombines operands across AND/OR's short-circuit boundary. That only
preserves error semantics for operands common to every disjunct, so skip
undistribution otherwise. A shared temporal predicate (`mz_now() < t`,
whose cast can error) is common to every disjunct, so it is still factored
out and stays extractable, unlike the reverted MaterializeInc#37049's blunt
`could_error()` skip.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants