Describe the bug
In the evaluate_model_CV method of class GenericTask, the fallback branch calls the splitter's split method without the 'y' parameter. With a splitter that needs the labels, for example sklearn's StratifiedKFold, cross validation therefore doesn't work.
Steps to reproduce
from flaml import AutoML
from sklearn.model_selection import StratifiedKFold

# X_train_transformed and y_train are the training data (not shown here)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
automl_settings = {
    "task": "classification",
    "time_budget": 180,  # total time in seconds
    "metric": "roc_auc",
    "estimator_list": ["xgboost", "lgbm", "catboost"],
    "eval_method": "cv",
    "split_type": cv,
    "ensemble": True,
}
flaml_automl = AutoML()
flaml_automl.fit(
    X_train=X_train_transformed,
    y_train=y_train,
    **automl_settings
)
Model Used
No particular model. The error occurs when cross validating the optimal model.
Expected Behavior
The splits are generated and cross validation runs.
Screenshots and logs
This is the current code (check last line):
X_train_split, y_train_split = X_train_all, y_train_all
shuffle = getattr(kf, "shuffle", not self.is_ts_forecast())
if isinstance(kf, RepeatedStratifiedKFold):
    kf = kf.split(X_train_split, y_train_split)
elif isinstance(kf, (GroupKFold, StratifiedGroupKFold)):
    groups = kf.groups
    kf = kf.split(X_train_split, y_train_split, groups)
    shuffle = False
elif isinstance(kf, TimeSeriesSplit):
    kf = kf.split(X_train_split, y_train_split)
else:
    kf = kf.split(X_train_split)
The last line should pass the "y" parameter as well:
kf = kf.split(X_train_split, y_train_split)
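The splitter behaviour behind this can be checked without FLAML. The following is a minimal sketch, assuming scikit-learn and NumPy are installed. The toy arrays X and y are made up for illustration and are not the data from this report. It shows StratifiedKFold rejecting a split call that has no labels and accepting one that does:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Toy data (hypothetical): 20 samples, two balanced classes.
X = np.arange(40).reshape(20, 2)
y = np.array([0, 1] * 10)

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

# Without y the splitter cannot stratify, so the call fails.
try:
    list(cv.split(X))
    split_without_y_failed = False
except (TypeError, ValueError) as exc:
    split_without_y_failed = True
    print(f"split(X) failed: {type(exc).__name__}")

# With y the splitter yields one (train, test) index pair per fold.
folds = list(cv.split(X, y))
print(f"split(X, y) produced {len(folds)} folds")
```

Splitters that ignore the labels, such as KFold, also accept y as an optional argument, so passing it in the fallback branch should not break them.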
Additional Information
flaml 2.3.5
ubuntu 20.04.6 LTS
python 3.12