
fix: XGBoostSklearnEstimator honors random_seed config #1549

Open
immu4989 wants to merge 1 commit into microsoft:main from immu4989:flaml-fix-xgboost-sklearn-reproducibility

@immu4989
Contributor

Why are these changes needed?

Seed random_state in XGBoostSklearnEstimator.__init__ so that xgb.XGBClassifier, xgb.XGBRegressor, and xgb.XGBRanker produce deterministic results across runs and honor the FLAML-internal random_seed config. This uses the same defensive pattern as #1541 and #1546: pop the random_seed key from self.params, and set random_state only when the caller has not already provided one.
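For reviewers, the pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `resolve_random_state` is a hypothetical helper name, it works on a plain dict rather than on self.params, and dropping a missing or None seed silently is an assumption.

```python
from typing import Any, Dict


def resolve_random_state(params: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: map an internal `random_seed` key onto `random_state`."""
    # Work on a copy so the caller's dict is left untouched.
    resolved = dict(params)
    # Always remove the internal key so it is never forwarded to the model.
    seed = resolved.pop("random_seed", None)
    # Only fill in random_state when the caller has not already set one.
    if seed is not None and "random_state" not in resolved:
        resolved["random_state"] = seed
    return resolved


# random_seed is translated into random_state.
print(resolve_random_state({"n_estimators": 4, "random_seed": 42}))
# An explicit random_state from the caller is preserved.
print(resolve_random_state({"random_seed": 42, "random_state": 7}))
```

Popping the key unconditionally keeps random_seed out of the keyword arguments passed to the underlying model, while the membership check leaves a caller-supplied random_state in control.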

XGBoostLimitDepthEstimator inherits from XGBoostSklearnEstimator, so this PR closes two of the remaining XGBoost items in tracking issue #1540. The non-sklearn XGBoostEstimator path will be addressed in a follow-up PR.

Before this change, the existing xgboost / xgb_limitdepth reproducibility tests passed only because the search space init values (subsample=1.0, colsample_bytree=1.0, colsample_bylevel=1.0) are themselves deterministic. With a different dataset, or a max_iter that led the search to explore stochastic-subsample configs, the results would not have been reproducible.

Related issue number

Tracking issue: #1540
Pattern reference: #1547 (RandomForest), #1541 (SGD), #1546 (LRL1)

Checks
