Add DataFrame fill_nan #14770

@kosiew

Is your feature request related to a problem or challenge?

Libraries such as PySpark offer a common operation that fills NaN values across an entire DataFrame, or across a chosen subset of its columns. It would be useful to have a similar feature in DataFusion and datafusion-python.

Describe the solution you'd like

If I have a DataFrame with a bunch of NaN values in different columns, I would want to replace all NaNs in those columns with the provided value if it can be cast to the column's type. Otherwise, the column should be left unchanged (a no-op). The user should also be able to limit which columns this applies to.
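To make the requested semantics concrete, here is a minimal pure-Python sketch. It is not DataFusion code: the `Table` alias, the `fill_nan` name, and the `subset` parameter are all hypothetical, and "can be cast to the column's type" is approximated by attempting `float(value)` for columns that hold floats.

```python
import math
from typing import Any, Dict, List, Optional

# Hypothetical stand-in for a DataFrame: column name -> list of values.
Table = Dict[str, List[Any]]


def fill_nan(table: Table, value: Any, subset: Optional[List[str]] = None) -> Table:
    """Return a copy of `table` with NaN replaced by `value` where possible.

    Assumed semantics (taken from the request above, not from any real API):
    - only columns named in `subset` are touched (all columns if None);
    - a column is left unchanged if `value` cannot be cast to float.
    """
    targets = set(table) if subset is None else set(subset)
    result: Table = {}
    for name, column in table.items():
        # NaN only exists for floating-point data, so other columns are skipped.
        is_float_column = any(isinstance(v, float) for v in column)
        if name not in targets or not is_float_column:
            result[name] = list(column)
            continue
        try:
            fill = float(value)
        except (TypeError, ValueError):
            # The value cannot be cast to the column's type: no-op for this column.
            result[name] = list(column)
            continue
        result[name] = [
            fill if isinstance(v, float) and math.isnan(v) else v for v in column
        ]
    return result


table = {"a": [1.0, float("nan"), 3.0], "b": [float("nan"), 2.0], "c": ["x", "y"]}
only_a = fill_nan(table, 0, subset=["a"])   # fills column "a", leaves "b" alone
unchanged = fill_nan(table, "oops")         # value is not castable, so nothing changes
```

An actual implementation would presumably build a per-column expression against the schema rather than iterate over values, but the column selection and the cast-or-skip rule would be the same.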

Describe alternatives you've considered

Additional context

This is a repost from apache/datafusion-python#922, prompted by this PR comment

Metadata

Labels: enhancement