When a scalar function does not define its own validity (validity_opt returns None, currently the case for Kleene and/or), ScalarFn::validity uses the generic fallback is_not_null(self). This fallback is self-referential: the validity of the ScalarFn array X = and(c0, c1) is defined as is_not_null(X), but
evaluating is_not_null(X) needs X's validity, which is again is_not_null(X). Represented as a lazy array DAG, this is a genuine call cycle (X.validity() → is_not_null(X) → per-row eval → X.is_invalid() → X.validity() → …) that overflows the stack. The None branch in
vortex-array/src/arrays/scalar_fn/vtable/validity.rs avoids the cycle only by materializing the result with execute_expr: once X is a concrete array, its validity is a real buffer, so is_not_null reads it directly instead of re-entering the validity() vtable.
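
The cycle and the eager workaround can be illustrated with a toy model. This is only a sketch: the Array enum, validity_lazy, validity_eager and is_not_null_lazy are hypothetical names invented for illustration and are not Vortex's types or API. The depth counter is an assumption added so the demo reports the cycle instead of overflowing the stack.

```rust
// Toy model only: hypothetical types and functions, not Vortex's API.
enum Array {
    /// Concrete data: nullable booleans, `None` meaning null.
    Concrete(Vec<Option<bool>>),
    /// Lazy Kleene AND of two children.
    And(Box<Array>, Box<Array>),
}

/// Kleene AND on one row: a definite `false` dominates null.
fn kleene_and(a: Option<bool>, b: Option<bool>) -> Option<bool> {
    match (a, b) {
        (Some(false), _) | (_, Some(false)) => Some(false),
        (Some(true), Some(true)) => Some(true),
        _ => None,
    }
}

impl Array {
    /// Eagerly evaluate into concrete values (stand-in for `execute_expr`).
    fn materialize(&self) -> Vec<Option<bool>> {
        match self {
            Array::Concrete(v) => v.clone(),
            Array::And(l, r) => {
                let (l, r) = (l.materialize(), r.materialize());
                l.iter().zip(&r).map(|(a, b)| kleene_and(*a, *b)).collect()
            }
        }
    }

    /// Generic fallback: validity(X) = is_not_null(X), kept lazy.
    /// `depth` exists only so the demo can report the cycle.
    fn validity_lazy(&self, depth: usize) -> Result<Vec<bool>, String> {
        match self {
            Array::Concrete(v) => Ok(v.iter().map(Option::is_some).collect()),
            Array::And(..) => is_not_null_lazy(self, depth + 1),
        }
    }

    /// Workaround: materialize first, then read the concrete validity.
    fn validity_eager(&self) -> Vec<bool> {
        self.materialize().iter().map(Option::is_some).collect()
    }
}

/// `is_not_null` that asks the array for its validity and so re-enters
/// `validity_lazy` for any lazy node.
fn is_not_null_lazy(x: &Array, depth: usize) -> Result<Vec<bool>, String> {
    if depth > 8 {
        return Err("validity() -> is_not_null() -> validity() cycle".to_string());
    }
    x.validity_lazy(depth)
}
```

In this model, calling validity_lazy(0) on an And node never reaches a concrete buffer and ends in the cycle error, while validity_eager on the same node returns a mask because the concrete result carries its own nulls.
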
The cleaner fix is to break the cycle at its source by implementing is_not_null (and the related null-predicate functions) as array kernels. A kernel that computes the not-null mask directly from an input array, canonicalizing/reading its concrete validity rather than routing back through that array's
validity() vtable, means is_not_null(X) no longer depends on X.validity(). The fallback can then be represented lazily without recursion, and the special-cased eager execute_expr branch can be removed entirely. (As an interim measure, and/or were given explicit Kleene validity expressions over their
operands' masks and null-filled values, which sidesteps the cycle per operator; the is_not_null kernel would generalize this to every function that relies on the fallback.)
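
A minimal sketch of the kernel idea, again as a toy model rather than Vortex code: the Array enum, execute, is_not_null_kernel and validity below are hypothetical names. It assumes boolean arrays stored as null-filled values plus a validity mask, and it assumes both operands of And have the same length. The And arm also shows one way to write a Kleene validity expression over the operands' masks and null-filled values.

```rust
// Toy model only: hypothetical types and functions, not Vortex's API.
#[derive(Clone)]
enum Array {
    /// Concrete booleans: null-filled `values` plus a `validity` mask.
    Concrete { values: Vec<bool>, validity: Vec<bool> },
    /// Lazy Kleene AND of two children (assumed to have equal length).
    And(Box<Array>, Box<Array>),
    /// Lazy `is_not_null(child)`; its result is never null.
    IsNotNull(Box<Array>),
}

/// Evaluate a lazy tree into (values, validity) without calling `validity()`.
fn execute(x: &Array) -> (Vec<bool>, Vec<bool>) {
    match x {
        Array::Concrete { values, validity } => (values.clone(), validity.clone()),
        Array::And(l, r) => {
            let (lv, lm) = execute(l);
            let (rv, rm) = execute(r);
            let n = lv.len();
            let mut values = Vec::with_capacity(n);
            let mut validity = Vec::with_capacity(n);
            for i in 0..n {
                // Fill nulls with `true`, the identity for AND.
                let (a, b) = (lv[i] || !lm[i], rv[i] || !rm[i]);
                values.push(a && b);
                // Kleene validity: both valid, or either side is a valid `false`.
                validity.push((lm[i] && rm[i]) || (lm[i] && !a) || (rm[i] && !b));
            }
            (values, validity)
        }
        Array::IsNotNull(child) => {
            let mask = is_not_null_kernel(child);
            let n = mask.len();
            (mask, vec![true; n])
        }
    }
}

/// Kernel: canonicalize the input and read its concrete validity mask.
/// It never asks the input for `validity()`, so it cannot re-enter the fallback.
fn is_not_null_kernel(x: &Array) -> Vec<bool> {
    execute(x).1
}

/// Fallback validity as a lazy node; building it evaluates nothing.
fn validity(x: &Array) -> Array {
    Array::IsNotNull(Box::new(x.clone()))
}
```

In this model validity(&x) only wraps x in an IsNotNull node, and the mask is computed when that node is executed. Because the kernel goes through execute rather than validity, the dependency runs in one direction only.
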