_metadata properties do not work with pyjanitor #1473

@raffaelemancuso

Brief Description

_metadata original properties are not passed on to the results of pyjanitor methods.

System Information

  • Operating system: Windows
  • OS details (optional): 11
  • Python version (required): 3.13

Minimally Reproducible Code

import pandas as pd
import janitor # noqa: F401
import pandas_flavor as pf

# See: https://pandas.pydata.org/pandas-docs/stable/development/extending.html#define-original-properties
class MyDataFrame(pd.DataFrame):

    # normal properties
    _metadata = ["myvar"]

    @property
    def _constructor(self):
        return MyDataFrame

@pf.register_dataframe_method
def regvar(self):
    obj = MyDataFrame(self)
    obj.myvar = 2
    return obj

@pf.register_dataframe_method
def printvar(self):
    print(self.myvar)
    return self

df = pd.DataFrame(
    {
        "Year": [1999, 2000, 2004, 1999, 2004],
        "Taxon": [
            "Saccharina",
            "Saccharina",
            "Saccharina",
            "Agarum",
            "Agarum",
        ],
        "Abundance": [4, 5, 2, 1, 8],
    }
)

df2 = df.regvar().query("Taxon=='Saccharina'").printvar()

index = pd.Index(range(1999,2005),name='Year')
df2 = df.regvar().complete(index, "Taxon", sort=True).printvar()

Error Messages

The first call, which uses the built-in pandas method query, correctly prints 2.

The second call, which uses the pyjanitor method complete, raises:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_4412\627945022.py in ?()
     39 
     40 df2 = df.regvar().query("Taxon=='Saccharina'").printvar()
     41 
     42 index = pd.Index(range(1999,2005),name='Year')
---> 43 df2 = df.regvar().complete(index, "Taxon", sort=True).printvar()

c:\Users\raffaele\venvs\base\Lib\site-packages\pandas_flavor\register.py in ?(self, *args, **kwargs)
    160                     object: The result of calling of the method.
    161                 """
    162                 global method_call_ctx_factory
    163                 if method_call_ctx_factory is None:
--> 164                     return method(self._obj, *args, **kwargs)
    165 
    166                 return handle_pandas_extension_call(
    167                     method, method_signature, self._obj, args, kwargs

~\AppData\Local\Temp\ipykernel_4412\627945022.py in ?(self)
     21 @pf.register_dataframe_method
     22 def printvar(self):
---> 23     print(self.myvar)
     24     return self

c:\Users\raffaele\venvs\base\Lib\site-packages\pandas\core\generic.py in ?(self, name)
   6295             and name not in self._accessors
   6296             and self._info_axis._can_hold_identifiers_and_holds_name(name)
   6297         ):
   6298             return self[name]
-> 6299         return object.__getattribute__(self, name)

AttributeError: 'DataFrame' object has no attribute 'myvar'
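
The error message shows that the object reaching printvar is a plain DataFrame, not a MyDataFrame. A possible workaround, sketched below in pandas only, is to re-wrap the result and copy the _metadata attributes back with __finalize__. This is a sketch under assumptions: restore_metadata is a hypothetical helper and not part of pandas or pyjanitor, and pd.DataFrame(obj) only stands in for any method that returns a plain DataFrame; it does not reproduce what complete does internally.

```python
import pandas as pd


class MyDataFrame(pd.DataFrame):
    # Names listed here are copied to results by pandas' __finalize__.
    _metadata = ["myvar"]

    @property
    def _constructor(self):
        return MyDataFrame


def restore_metadata(result, source):
    """Hypothetical helper: re-wrap `result` in the subclass of `source`
    and copy the `_metadata` attributes over from `source`."""
    return type(source)(result).__finalize__(source)


obj = MyDataFrame({"Year": [1999, 2000], "Taxon": ["Saccharina", "Agarum"]})
obj.myvar = 2

# A built-in pandas method goes through _constructor and __finalize__,
# so the subclass and myvar survive.
kept = obj.query("Taxon == 'Saccharina'")

# Stand-in (assumption) for a method that hands back a plain DataFrame.
plain = pd.DataFrame(obj)

# Re-wrap the plain result and copy myvar back from the original object.
restored = restore_metadata(plain, obj)

print(type(kept).__name__, kept.myvar)
print(type(plain).__name__, hasattr(plain, "myvar"))
print(type(restored).__name__, restored.myvar)
```

With pyjanitor, the same helper could be applied as restore_metadata(obj.complete(...), obj), at the cost of having to wrap every call that drops the subclass.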
