+            if col not in discrete_vars and col not in [id_col_name, target_column_name]:  # skip columns already marked discrete (strings), plus the ID and target columns
+                val_counts = df[col].nunique()
+                if val_counts > 1 and val_counts <= 10:  # the column contains at most 10 distinct values
+                    discrete_vars.append(col)
+
+        continuous_vars = list(set(df.columns)
+                               - set(discrete_vars)
+                               - set([id_col_name, target_column_name]))
+        log.warning(
+            f"""Cobra automatically assumes that the following variables are
+            discrete: {discrete_vars}
+            continuous: {continuous_vars}
+            If you want to change this behaviour, you can specify the discrete/continuous variables yourself with the continuous_vars and discrete_vars keywords. \nIt assumes that numerical columns with 10 or fewer distinct values are categorical"""
+        )
+        return continuous_vars, discrete_vars
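
The heuristic added in this hunk can be tried out in isolation. What follows is a minimal, pandas-free sketch and not Cobra's implementation: the function name `infer_variable_types`, the `max_discrete_values` parameter, and the use of a plain dict of column lists are assumptions made for illustration. It also assumes that non-numeric columns are treated as discrete, as the inline comment in the diff suggests. Here `len(set(values))` stands in for `df[col].nunique()`.

```python
from numbers import Number


def infer_variable_types(columns, id_col_name, target_column_name,
                         max_discrete_values=10):
    """Split column names into (continuous_vars, discrete_vars).

    `columns` maps a column name to a list of values (a hypothetical
    stand-in for a DataFrame). The ID and target columns are skipped.
    """
    excluded = {id_col_name, target_column_name}
    continuous_vars, discrete_vars = [], []
    for name, values in columns.items():
        if name in excluded:
            continue
        # Assumption: any column with a non-numeric value is discrete.
        is_numeric = all(isinstance(v, Number) for v in values)
        n_unique = len(set(values))
        # Numeric columns with 2..max_discrete_values distinct values
        # are treated as categorical, mirroring the check in the diff.
        if not is_numeric or 1 < n_unique <= max_discrete_values:
            discrete_vars.append(name)
        else:
            continuous_vars.append(name)
    return continuous_vars, discrete_vars


data = {
    "id": list(range(12)),
    "age": list(range(20, 32)),        # 12 distinct values
    "n_children": [0, 1, 2] * 4,       # 3 distinct values
    "city": ["Ghent", "Leuven"] * 6,   # strings
    "constant": [1] * 12,              # a single distinct value
    "target": [0, 1] * 6,
}
continuous, discrete = infer_variable_types(data, "id", "target")
print(continuous)
print(discrete)
```

As in the diff, a numeric column with a single distinct value fails the `> 1` check and therefore ends up in the continuous list. Unlike the set arithmetic in the diff, this sketch preserves column order.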


     def fit(
         self,
         train_data: pd.DataFrame,
         continuous_vars: list,
         discrete_vars: list,
         target_column_name: str,
+        id_col_name: str = None
     ):
         """Fit the data to the preprocessing pipeline.
+        If you set continuous_vars and discrete_vars to `None` and provide id_col_name, Cobra will guess which variables are continuous and which are discrete.