Skip to content

feat: Add arrays_overlap function to functions_list.yaml / functions_set.yaml #986

@benbellick

Description

@benbellick

The arrays_overlap function (returns whether two arrays share any common elements) is a common primitive across engines and would be a good addition to functions_list.yaml or functions_set.yaml (I am not particular about which).

Cross-Engine Function Names

Engine Function
PostgreSQL && operator
DuckDB list_has_any
Spark arrays_overlap
Trino arrays_overlap
DataFusion array_has_any

Cross-Engine Semantic Comparison

The following table was generated by running equivalent queries across all five engines via Docker.

Test Case PostgreSQL DuckDB Spark Trino DataFusion
[1,2,3] && [3,4,5] true true true true true
[1,2,3] && [4,5,6] false false false false false
[1,2,3] && [] false false false false false
[] && [] false false false false false
[1,NULL,3] && [3,4] true true true true true
⚠️ [1,NULL,3] && [4,5] false false NULL NULL false
⚠️ [1,NULL] && [NULL,4] false false NULL NULL true
NULL && [1,2] NULL NULL NULL NULL NULL
[1,1,2] && [1,3] true true true true true
['a','b','c'] && ['c','d'] true true true true true

All five engines agree on the straightforward cases. The divergence is entirely about how NULL elements are handled when there is no definitive non-null overlap (cases 6 & 7).

Proposal

Lets add an arrays_overlap function to functions_{list/set}.yaml. We can handle the nullability stuff with a function option.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions