[spark] Support Nan check in SparkFilterConverter #7590

Closed
xuzifu666 wants to merge 3 commits into apache:master from xuzifu666:nan_fix

Conversation

@xuzifu666
Member

Purpose

NaN (Not a Number) is a special value defined in the IEEE 754 floating-point standard. NaN exists only for floating-point types (Double/Float), and this also holds in Spark SQL. This PR aims to resolve the unfinished issue.

Tests

@JingsongLi
Contributor

Spark SQL does NOT follow IEEE 754 for NaN equality. In Spark, NaN = NaN evaluates to true:

  spark-sql> SELECT float('NaN') = float('NaN');
  true

So WHERE float_col = float('NaN') should return rows where float_col IS NaN. Returning alwaysFalse() silently drops those rows — this is a data correctness bug.
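
To make the difference concrete, here is a minimal Java sketch contrasting the IEEE 754 comparison with the Spark SQL behaviour shown above. The class `NanEqualityDemo` and its helpers `ieeeEquals`, `sparkStyleEquals`, and `filterEquals` are hypothetical names used only for illustration; they are not part of SparkFilterConverter or of any Spark API.

```java
import java.util.ArrayList;
import java.util.List;

class NanEqualityDemo {

    // IEEE 754 comparison: NaN is never equal to anything, including itself.
    static boolean ieeeEquals(double a, double b) {
        return a == b;
    }

    // Hypothetical helper mirroring the Spark SQL result shown above:
    // NaN = NaN is true; every other pair compares as usual.
    static boolean sparkStyleEquals(double a, double b) {
        return a == b || (Double.isNaN(a) && Double.isNaN(b));
    }

    // Keeps the values that an equality filter `col = literal` would select
    // under the Spark-style comparison.
    static List<Double> filterEquals(double[] column, double literal) {
        List<Double> out = new ArrayList<>();
        for (double v : column) {
            if (sparkStyleEquals(v, literal)) {
                out.add(v);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[] column = {1.0, Double.NaN, 2.5, Double.NaN};
        System.out.println(ieeeEquals(Double.NaN, Double.NaN));       // false
        System.out.println(sparkStyleEquals(Double.NaN, Double.NaN)); // true
        System.out.println(filterEquals(column, Double.NaN).size());  // 2
    }
}
```

A filter that always evaluates to false for a NaN literal would behave like `ieeeEquals` here and select none of the rows, whereas the Spark-style comparison selects both NaN rows.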

@xuzifu666
Member Author

Thanks for the reminder! This change does indeed have a data correctness issue, and it seems unnecessary. I'll close this PR. @JingsongLi

@xuzifu666 xuzifu666 closed this Apr 13, 2026