
[datafusion-spark] Add Spark-compatible isnan function#20595

Open
shivbhatia10 wants to merge 9 commits into apache:main from shivbhatia10:sb/datafusion-math-isnan

@shivbhatia10 shivbhatia10 commented Feb 27, 2026

Which issue does this PR close?

Part of #15914 and apache/datafusion-comet#1704

Rationale for this change

This PR continues the effort to add Spark-compatible expressions to datafusion-spark.

What changes are included in this PR?

Adds a new Spark-compatible isnan function.

Are these changes tested?

Yes, unit tests.

Are there any user-facing changes?

No.

@github-actions github-actions bot added the spark label Feb 27, 2026
@shivbhatia10 shivbhatia10 changed the title from "Add isnan" to "[datafusion-spark] Add Spark-compatible isnan function" Feb 27, 2026
@shivbhatia10 shivbhatia10 marked this pull request as ready for review February 27, 2026 13:57
@github-actions github-actions bot added the sqllogictest SQL Logic Tests (.slt) label Feb 28, 2026

@kosiew kosiew left a comment


@shivbhatia10
Thanks for working on this.

fn spark_isnan(args: &[ColumnarValue]) -> Result<ColumnarValue> {
    let [value] = take_function_args("isnan", args)?;

    match value {

The scalar and array paths each repeat the same "float32 vs float64, then map is_nan, then patch nulls" structure.

A small helper here would make the implementation easier to scan and keep future changes aligned across both types.
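As a rough illustration of the kind of helper meant here, the sketch below models nullable floats with plain `Option<T>` values instead of Arrow arrays or `ScalarValue`, so it is not the PR's code. The `FloatLike` trait and the `isnan_scalar` and `isnan_array` names are hypothetical. The point is only that a single generic function can hold the "map is_nan, then treat null as false" logic for both Float32 and Float64, with the array path reusing the scalar one.

```rust
/// Hypothetical stand-in for the two float types the function accepts.
trait FloatLike: Copy {
    fn is_nan_value(self) -> bool;
}

impl FloatLike for f32 {
    fn is_nan_value(self) -> bool {
        self.is_nan()
    }
}

impl FloatLike for f64 {
    fn is_nan_value(self) -> bool {
        self.is_nan()
    }
}

/// Scalar path: a null input (None) maps to false, otherwise check for NaN.
fn isnan_scalar<T: FloatLike>(value: Option<T>) -> bool {
    value.map_or(false, |v| v.is_nan_value())
}

/// Array path: applies the scalar helper element by element, so the
/// null handling lives in exactly one place.
fn isnan_array<T: FloatLike>(values: &[Option<T>]) -> Vec<bool> {
    values.iter().map(|v| isnan_scalar(*v)).collect()
}
```

In the real implementation the same shape would presumably be expressed over the Arrow array and scalar types rather than `Option`, which is an assumption on my part and not something this sketch demonstrates.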

Comment on lines +51 to +54
signature: Signature::one_of(
    vec![
        TypeSignature::Exact(vec![DataType::Float32]),
        TypeSignature::Exact(vec![DataType::Float64]),

Does this mean SELECT isnan(NULL) will be rejected during planning because the literal has Null type and there is no coercion path?

That is a semantic gap from Spark, where isnan(NULL) returns false.


# Scalar input: float64
query BBBBB
SELECT isnan(1.0::DOUBLE), isnan('NaN'::DOUBLE), isnan('inf'::DOUBLE), isnan(0.0::DOUBLE), isnan(-1.0::DOUBLE);

The happy-path coverage is solid.

Can you also add one Spark-specific planner/error test, such as SELECT isnan(1), to document that this Spark variant intentionally rejects non-floating numerics?

That would make the behavioral difference from the built-in DataFusion isnan obvious to future readers and protect against accidental widening of the signature later.
