fix: LRL1 results now reproducible #1546

Merged
thinkall merged 1 commit into microsoft:main from immu4989:flaml-fix-lrl1-reproducibility on May 12, 2026

immu4989 commented on May 11, 2026

Why are these changes needed?

Summary

  • Seed random_state on LRL1Classifier so LogisticRegression(solver="saga", penalty="l1") produces deterministic results across runs.
  • Uses the same defensive pattern as #1541 ("fix: SGD results now reproducible"): pop the FLAML-internal random_seed key from self.params, and only set random_state when the caller has not already provided one.
  • Uncomment "lrl1" in both classification reproducibility test parametrize lists (it was previously disabled).
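The seeding pattern described in the summary can be sketched roughly as follows. This is a minimal illustration, not the code in flaml/automl/model.py: the class name `SeededEstimatorSketch` and the constant `DEFAULT_SEED` are hypothetical, and the default value is the one quoted in the review overview of this PR.

```python
DEFAULT_SEED = 10242048  # default quoted in the review overview; name is hypothetical


class SeededEstimatorSketch:
    """Hypothetical stand-in for an estimator wrapper, not FLAML's actual class."""

    def __init__(self, **config):
        self.params = dict(config)
        # Always remove the internal key so it is never forwarded to the
        # scikit-learn constructor, which does not accept it.
        seed = self.params.pop("random_seed", DEFAULT_SEED)
        # Keep an explicit random_state from the caller; otherwise seed it.
        self.params.setdefault("random_state", seed)


print(SeededEstimatorSketch().params)
print(SeededEstimatorSketch(random_seed=7).params)
print(SeededEstimatorSketch(random_seed=7, random_state=3).params)
```

Popping the key unconditionally, and only then deciding whether to set random_state, keeps the two concerns separate: the internal key never leaks, and a caller-supplied value always wins.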

Why

LRL1Classifier defaults to solver="saga", a stochastic-gradient solver that shuffles samples each pass. Without a fixed random_state, repeated fits on identical data can produce different results, the same root cause as SGD (#1541). LRL2Classifier is unaffected since it uses the deterministic lbfgs solver.
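The effect of seeding can be checked directly against scikit-learn, independently of FLAML. The snippet below is a small sketch under assumed settings: the synthetic dataset, `max_iter=50`, and the helper name `fit_coef` are illustrative choices, not taken from the PR or its tests.

```python
import warnings

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Small synthetic problem; sizes are arbitrary illustration values.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)


def fit_coef(random_state):
    """Fit an L1-penalized saga model and return a copy of its coefficients."""
    with warnings.catch_warnings():
        # Silence convergence or deprecation warnings; they do not affect the comparison.
        warnings.simplefilter("ignore")
        model = LogisticRegression(
            penalty="l1", solver="saga", max_iter=50, random_state=random_state
        )
        model.fit(X, y)
    return model.coef_.copy()


seeded_a = fit_coef(random_state=0)
seeded_b = fit_coef(random_state=0)

# With the same seed, the two fits should match exactly.
print(np.array_equal(seeded_a, seeded_b))
```

Calling `fit_coef(None)` twice instead may give coefficients that differ slightly between runs, which is the behavior this PR removes for lrl1.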

Test plan

  • pytest test/automl/test_classification.py -k "reproducibility and lrl1" — both wrapper and underlying-model tests pass
  • pre-commit run --files flaml/automl/model.py test/automl/test_classification.py — all hooks pass
  • CI green on the PR

Related issue number

Follows the same pattern as LGBM (#1369), CatBoost (#1364), ElasticNet (#1374), LinearSVC (#1376), and SGD (#1541).

Checks

Copilot AI left a comment

Pull request overview

This PR improves determinism for FLAML’s lrl1 estimator by ensuring LRL1Classifier seeds scikit-learn’s stochastic LogisticRegression(solver="saga", penalty="l1"), and re-enables reproducibility coverage for lrl1 in the classification test suite.

Changes:

  • Set LRL1Classifier.params["random_state"] from FLAML’s internal random_seed (defaulting to 10242048) when the caller hasn’t explicitly provided random_state.
  • Remove (pop) random_seed from LRL1Classifier.params to avoid passing an unsupported parameter into scikit-learn constructors.
  • Re-enable lrl1 in both classification reproducibility parametrized test lists.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| flaml/automl/model.py | Seeds LRL1Classifier’s underlying LogisticRegression via random_state for deterministic saga behavior. |
| test/automl/test_classification.py | Re-enables lrl1 in reproducibility test parametrizations to prevent regressions. |

thinkall merged commit 4dc2de8 into microsoft:main on May 12, 2026
17 checks passed