Commit e2272f0

More rationale for type union behavior
1 parent 1cd2602 commit e2272f0

1 file changed

Lines changed: 9 additions & 5 deletions

datafusion/expr-common/src/type_coercion/binary.rs

@@ -844,12 +844,16 @@ pub fn try_type_union_resolution_with_struct(
 
 /// Coerce `lhs_type` and `rhs_type` to a common type for type unification
 /// contexts — where two values must be brought to a common type but are not
-/// being compared. Examples: UNION, CASE THEN/ELSE branches, NVL2.
+/// being compared. Examples: UNION, CASE THEN/ELSE branches, NVL2. For other
+/// contexts, [`comparison_coercion`] should typically be used instead.
 ///
-/// Prefers strings over numeric types (e.g., `SELECT 1 UNION SELECT '2'`
-/// coerces both sides to `Utf8`).
-///
-/// For comparisons (`=`, `<`, `>`), use [`comparison_coercion`] instead.
+/// The intuition is that we try to find the "widest" type that can represent
+/// all values from both sides. When one side is a string and the other is
+/// numeric, this prefers strings because every number has a textual
+/// representation but not every string can be parsed as a number (e.g., `SELECT
+/// 1 UNION SELECT 'a'` coerces both sides to a string). This is in contrast to
+/// [`comparison_coercion`], which prefers numeric types so that ordering and
+/// equality follow numeric semantics.
 pub fn type_union_coercion(lhs_type: &DataType, rhs_type: &DataType) -> Option<DataType> {
     if lhs_type.equals_datatype(rhs_type) {
         return Some(lhs_type.clone());
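
The contrast drawn in the doc comment above (unification prefers strings, comparison prefers numeric types) can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToyType`, `toy_union_coercion`, and `toy_comparison_coercion` are hypothetical names invented for illustration, the type set is reduced to three cases, and none of this is DataFusion's actual implementation.

```rust
/// Hypothetical, heavily reduced stand-in for Arrow's `DataType`.
#[derive(Debug, Clone, PartialEq)]
enum ToyType {
    Int64,
    Float64,
    Utf8,
}

/// Toy unification rule (UNION, CASE branches): pick a type that can
/// represent every value from both sides.
fn toy_union_coercion(lhs: &ToyType, rhs: &ToyType) -> Option<ToyType> {
    use ToyType::*;
    if lhs == rhs {
        return Some(lhs.clone());
    }
    match (lhs, rhs) {
        // Every number has a textual representation, so the string type wins.
        (Utf8, _) | (_, Utf8) => Some(Utf8),
        // Assumption of this toy model: mixed numeric types widen to Float64.
        (Int64, Float64) | (Float64, Int64) => Some(Float64),
        _ => None,
    }
}

/// Toy comparison rule (`=`, `<`, `>`): prefer the numeric side so that
/// ordering and equality follow numeric semantics.
fn toy_comparison_coercion(lhs: &ToyType, rhs: &ToyType) -> Option<ToyType> {
    use ToyType::*;
    if lhs == rhs {
        return Some(lhs.clone());
    }
    match (lhs, rhs) {
        // String vs. numeric: the numeric type wins, whichever side it is on.
        (Utf8, numeric) | (numeric, Utf8) => Some(numeric.clone()),
        // Assumption of this toy model: mixed numeric types widen to Float64.
        (Int64, Float64) | (Float64, Int64) => Some(Float64),
        _ => None,
    }
}
```

In this toy model, an `Int64`/`Utf8` pair resolves to `Utf8` under the unification rule and to `Int64` under the comparison rule, which mirrors the string-versus-numeric preference described in the new doc comment.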
