Bug Report: Kurtosis at constant column values #1323

@pedro-tofani

Description

Current Behaviour

I am trying to generate a report, but it throws an error.

    187 descriptive_statistics = Table(
    188     [
    189         {
    190             "name": "Standard deviation",
    191             "value": fmt_numeric(summary["std"], precision=config.report.precision),
    192         },
    193         {
    194             "name": "Coefficient of variation (CV)",
    195             "value": fmt_numeric(summary["cv"], precision=config.report.precision),
    196         },
    197         {
    198             "name": "Kurtosis",
--> 199             "value": fmt_numeric(
    200                 summary["kurtosis"], precision=config.report.precision
    201             ),
    202         },

File /opt/conda/lib/python3.10/site-packages/ydata_profiling/report/formatters.py:232, in fmt_numeric(value, precision)
    221 @list_args
    222 def fmt_numeric(value: float, precision: int = 10) -> str:
    223     """Format any numeric value.
    224 
    225     Args:
   (...)
    230         The numeric value with the given precision.
    231     """
--> 232     fmtted = f"{{:.{precision}g}}".format(value)
    233     for v in ["e+", "e-"]:
    234         if v in fmtted:

TypeError: unsupported format string passed to NoneType.__format__
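
The failure in the traceback can be reproduced without Spark: applying a numeric format spec to `None` raises the same `TypeError`. The following is a minimal sketch of a guard, not the library's actual fix; `fmt_numeric_safe` and its `missing` placeholder are hypothetical names introduced only for illustration.

```python
def fmt_numeric_safe(value, precision=10, missing="N/A"):
    """Hypothetical None-tolerant variant of the format call in the traceback."""
    if value is None:
        # The statistic was null (e.g. undefined for a constant column),
        # so return a placeholder instead of raising.
        return missing
    return f"{{:.{precision}g}}".format(value)


# The unguarded call fails the same way as in the traceback.
try:
    "{:.10g}".format(None)
    error_message = ""
except TypeError as exc:
    error_message = str(exc)

print(error_message)
print(fmt_numeric_safe(None))
print(fmt_numeric_safe(3.14159, precision=3))
```

Whether a placeholder string, `NaN`, or omitting the row is the right behaviour for the report is a design decision for the maintainers; the sketch only shows that the crash comes from formatting `None`.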

I think this is because the pyspark.sql.functions.kurtosis function returns None for constant columns:

df.select(kurtosis(df.column_name)).show()
+---------------------+
|kurtosis(column_name)|
+---------------------+
|                 null|
+---------------------+
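
A null result is plausible for a constant column because kurtosis divides by the squared variance, which is zero when every value is identical. The sketch below is a plain-Python illustration of the population excess kurtosis definition, not Spark's implementation; the function name `excess_kurtosis` and the choice to return `None` for zero variance are assumptions made to mirror the output above.

```python
def excess_kurtosis(values):
    """Population excess kurtosis; returns None when the variance is zero."""
    n = len(values)
    mean = sum(values) / n
    # Second and fourth central moments.
    m2 = sum((x - mean) ** 2 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    if m2 == 0:
        # Constant column: m4 / m2**2 would be 0 / 0, so the statistic is undefined.
        return None
    return m4 / m2**2 - 3.0


constant_result = excess_kurtosis([5.0, 5.0, 5.0, 5.0])
varying_result = excess_kurtosis([1.0, 2.0, 3.0, 4.0])

print(constant_result)
print(varying_result)
```

Any code that consumes this statistic therefore needs to handle the undefined case explicitly before formatting it.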

Expected Behaviour

The report should be generated without raising an error.

Data Description

My data has two columns in which every value is constant.

Code that reproduces the bug

from ydata_profiling import ProfileReport
report_df = ProfileReport(df)  # df is a Spark DataFrame with constant columns

pandas-profiling version

4.1.2

Dependencies

pyspark==3.3.2

OS

Linux

Checklist

  • There is not yet another bug report for this issue in the issue tracker
  • The problem is reproducible from this bug report. This guide can help to craft a minimal bug report.
  • The issue has not been resolved by the entries listed under Common Issues.

Labels

  • bug 🐛 (Something isn't working)
  • spark ⚡ (PySpark features!)
