Modernize for scikit-learn 1.6+, NumPy 2.0+, scipy 1.17+, matplotlib 3.7+ by jg585Username · Pull Request #1336 · DistrictDataLabs/yellowbrick

jg585Username · 2026-05-12T08:06:46Z

Summary

Fixes 337 test failures caused by breaking API changes introduced in recent versions of scikit-learn, NumPy, scipy, and matplotlib. The project now supports:

scikit-learn ≥ 1.6 (tested against 1.8.0)
NumPy ≥ 1.24 (tested against 2.4.4)
scipy ≥ 1.7 (tested against 1.17.1)
matplotlib ≥ 3.5 (tested against 3.10.9)
Python 3.9 – 3.13

Result: 1127 passed, 0 failed (was 337 failures before this PR).

scikit-learn compatibility

`_estimator_type` removed (sklearn 1.6+)

yellowbrick/utils/types.py — The most widespread breakage. sklearn 1.6 removed the _estimator_type class attribute from many estimators. Added _get_estimator_type() which falls back through: legacy attribute → __sklearn_tags__() API → Mixin subclass inspection. Rewrote is_classifier(), is_regressor(), is_clusterer() on top of it.
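
The fallback chain described above might look roughly like the following. This is a minimal sketch, not the code from the PR: the name `get_estimator_type_sketch` and the `_MIXIN_TYPES` mapping are assumptions made for illustration. Only the `__sklearn_tags__().estimator_type` lookup with a broad `except` is visible in the diff excerpt further down.

```python
from sklearn.base import ClassifierMixin, ClusterMixin, RegressorMixin
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression
from sklearn.naive_bayes import GaussianNB

# Hypothetical mapping used only by this sketch.
_MIXIN_TYPES = (
    (ClassifierMixin, "classifier"),
    (RegressorMixin, "regressor"),
    (ClusterMixin, "clusterer"),
)


def get_estimator_type_sketch(estimator):
    """Return "classifier", "regressor", "clusterer", or None."""
    # 1. Legacy attribute, still present on older scikit-learn releases.
    etype = getattr(estimator, "_estimator_type", None)
    if etype is not None:
        return etype

    # 2. Tags API. Any failure (missing method, class passed instead of
    #    an instance, incomplete MRO) falls through to the next step.
    try:
        etype = estimator.__sklearn_tags__().estimator_type
        if etype is not None:
            return etype
    except Exception:
        pass

    # 3. Mixin inspection, which works for classes and instances alike.
    cls = estimator if isinstance(estimator, type) else type(estimator)
    for mixin, name in _MIXIN_TYPES:
        if issubclass(cls, mixin):
            return name
    return None


print(get_estimator_type_sketch(GaussianNB()))
print(get_estimator_type_sketch(LinearRegression()))
print(get_estimator_type_sketch(KMeans))
```

Checking the legacy attribute first keeps behavior unchanged on older installs, while the mixin check is the last resort for objects that expose neither the attribute nor working tags.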

`Pipeline.__sklearn_is_fitted__` (sklearn 1.8+)

yellowbrick/base.py — sklearn 1.8's Pipeline.score() calls check_is_fitted(last_step) on the final pipeline step. ModelVisualizer has no trailing-underscore attributes of its own, so this raised NotFittedError even after a successful fit(). Added __sklearn_is_fitted__() delegating to check_is_fitted(self.estimator).

`__sklearn_tags__()` for ContribEstimator (sklearn 1.6+)

yellowbrick/contrib/wrapper.py — Added __sklearn_tags__() so wrapped contrib estimators correctly report their type through the new Tags API.

`_check_targets()` returns 4-tuple (sklearn 1.8)

yellowbrick/classifier/class_prediction_error.py — Was unpacked as 3-tuple; switched to indexed access.
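
Indexed access tolerates both return shapes, which a fixed-length unpack does not. The sketch below uses two hypothetical stand-in functions instead of the private sklearn helper, and the fourth element is a placeholder because its meaning is not stated here.

```python
def check_targets_three(y_true, y_pred):
    """Stand-in for the older 3-tuple return shape (hypothetical)."""
    return "multiclass", y_true, y_pred


def check_targets_four(y_true, y_pred):
    """Stand-in for the newer 4-tuple return shape (hypothetical)."""
    return "multiclass", y_true, y_pred, None


def split_targets(result):
    # Indexing reads only the first three items and ignores any extras,
    # whereas `a, b, c = result` raises ValueError on a 4-tuple.
    return result[0], result[1], result[2]


old_style = split_targets(check_targets_three([0, 1], [0, 1]))
new_style = split_targets(check_targets_four([0, 1], [0, 1]))
```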

`multi_class="auto"` removed (sklearn 1.7)

yellowbrick/classifier/threshold.py — Removed the parameter.

`store_cv_values` / `cv_values_` renamed (sklearn 1.7)

yellowbrick/regressor/alphas.py — Added a dual check that supports the new store_cv_results / cv_results_ names and falls back to the old store_cv_values / cv_values_ names for backward compatibility.
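
One way to write such a dual check is sketched below. The helper names `enable_cv_storage` and `get_cv_results` are assumptions for illustration and may not match what alphas.py actually does.

```python
import numpy as np
from sklearn.linear_model import RidgeCV


def enable_cv_storage(estimator):
    """Turn on per-alpha CV storage under whichever name exists."""
    params = estimator.get_params()
    if "store_cv_results" in params:
        estimator.set_params(store_cv_results=True)
    elif "store_cv_values" in params:
        estimator.set_params(store_cv_values=True)
    return estimator


def get_cv_results(estimator):
    """Read the stored values, preferring the new attribute name."""
    if hasattr(estimator, "cv_results_"):
        return estimator.cv_results_
    return estimator.cv_values_


rng = np.random.RandomState(0)
X = rng.rand(20, 2)
y = X @ np.array([1.0, 2.0]) + 0.1 * rng.rand(20)

model = enable_cv_storage(RidgeCV(alphas=[0.1, 1.0, 10.0])).fit(X, y)
cv_results = get_cv_results(model)
```

Probing `get_params()` avoids hard-coding a version comparison, so the same code path works on installs that expose either name.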

`np.matrix` rejected (sklearn 1.7)

yellowbrick/cluster/elbow.py — Sparse matrix .mean() returns np.matrix; wrapped with np.asarray().
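
A small illustration of the conversion, assuming a SciPy `csr_matrix` input. The `dense_center` helper and the extra `ravel()` are assumptions of this sketch; the PR only mentions `np.asarray()`.

```python
import numpy as np
from scipy import sparse


def dense_center(X):
    """Column-wise mean of X as a plain 1-D ndarray."""
    center = X.mean(axis=0)  # np.matrix when X is a scipy sparse matrix
    return np.asarray(center).ravel()


X_sparse = sparse.csr_matrix(np.array([[1.0, 0.0], [3.0, 4.0]]))
raw = X_sparse.mean(axis=0)
center = dense_center(X_sparse)
print(type(raw).__name__, type(center).__name__)
```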

TSNE perplexity validation (sklearn 1.7+)

yellowbrick/cluster/icdm.py — perplexity must be < n_samples; added dynamic cap before fit_transform.
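
The cap can be as simple as the sketch below. The helper name `cap_perplexity` and the choice of `n_samples - 1` as the upper bound are assumptions; the exact value used in icdm.py is not shown here.

```python
def cap_perplexity(perplexity, n_samples):
    """Return a perplexity strictly below n_samples (assumes n_samples >= 2)."""
    return min(perplexity, n_samples - 1)


# With only 10 cluster centers, a requested perplexity of 30 is reduced.
small = cap_perplexity(30.0, 10)
# With plenty of samples, the requested value is left untouched.
large = cap_perplexity(30.0, 100)
```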

NumPy compatibility

| File | Change | Reason |
| --- | --- | --- |
| `yellowbrick/utils/helpers.py` | `np.in1d` → `np.isin` | Deprecated in NumPy 2.0 |
| `yellowbrick/contrib/missing/dispersion.py` | `np.string_` → `np.bytes_`, `np.unicode_` → `np.str_` | Removed in NumPy 2.0 |
| `yellowbrick/text/dispersion.py` | Wrapped generators in `list()` for `np.stack()` | NumPy 2.0 requires sequences |
| `yellowbrick/cluster/icdm.py` | `interpolation=` → `method=` in `np.percentile` | Renamed in NumPy 1.22 |
| `yellowbrick/classifier/base.py` | Extract KeyError key via `e.args[0]` | NumPy 2.0 changed `np.str_` repr |
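
For reference, the replacement spellings from the table can be exercised together in a short sketch. The variable names and sample data are illustrative only, and `method=` assumes NumPy 1.22 or newer.

```python
import numpy as np

values = np.array([1, 2, 3, 4])

# np.isin replaces np.in1d for membership tests.
mask = np.isin(values, [2, 4])

# np.stack needs a real sequence, so materialize the generator first.
rows = (np.array([i, i + 1]) for i in range(3))
stacked = np.stack(list(rows))

# method= replaces the older interpolation= keyword.
median = np.percentile(values, 50, method="linear")

# np.bytes_ and np.str_ are the surviving scalar type names.
is_text = isinstance(np.str_("abc"), str)
is_bytes = isinstance(np.bytes_(b"abc"), bytes)

# Read the missing key from the exception arguments instead of parsing
# its string form, which depends on how np.str_ is printed.
try:
    {}[np.str_("missing")]
except KeyError as e:
    missing_key = e.args[0]
```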

matplotlib compatibility

yellowbrick/regressor/influence.py — Removed use_line_collection=True from ax.stem() (the argument was deprecated in matplotlib 3.6 and removed in 3.8).
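
A minimal sketch of the call without the removed keyword, using made-up data and the headless Agg backend so it runs without a display. On recent matplotlib releases the stems come back as a single `LineCollection` by default, so dropping the argument should not change the plot.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend; no window is opened
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

# No use_line_collection argument: the default behavior is used.
container = ax.stem([0, 1, 2, 3], [0.1, 0.4, 0.2, 0.8])

n_stems = len(container.stemlines.get_segments())
plt.close(fig)
```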

Test suite updates

- 13 test files updated for deprecated APIs, wrong MRO order, removed parameters, and updated numerical expected values (sklearn 1.8 / scipy 1.17 produce slightly different scores)
- ~96 baseline PNG images regenerated to match matplotlib 3.10 rendering
- tests/test_compat.py — 25 new targeted regression tests, one per fix, so future package upgrades immediately surface regressions:
  - TestEstimatorTypeDetection — verifies is_classifier/regressor/clusterer via Mixin-only classes (no _estimator_type)
  - TestModelVisualizerFittedState — verifies __sklearn_is_fitted__ and Pipeline.score() don't raise after fit
  - TestNumpyCompat — documents and verifies all NumPy 2.0 API removals
  - TestMatplotlibCompat — verifies ax.stem() without use_line_collection
- requirements.txt / setup.py — bumped minimums and added Python 3.9–3.13 classifiers

… matplotlib 3.7+

Fixes 337 test failures caused by breaking API changes in upstream packages.

## scikit-learn compatibility
- types.py: replace all _estimator_type checks with _get_estimator_type() helper that falls back to __sklearn_tags__() and Mixin subclass inspection (sklearn 1.6+)
- base.py: add ModelVisualizer.__sklearn_is_fitted__() so Pipeline.score() correctly detects fitted state of wrapped visualizers (sklearn 1.8+)
- contrib/wrapper.py: add __sklearn_tags__() to ContribEstimator (sklearn 1.6+)
- classifier/class_prediction_error.py: unpack _check_targets() by index (4-tuple in 1.8)
- classifier/base.py: extract KeyError key via e.args[0] for np.str_ repr change
- classifier/threshold.py: remove multi_class="auto" (removed in 1.7)
- regressor/alphas.py: support store_cv_results/cv_results_ with fallback to old names
- cluster/elbow.py: wrap sparse matrix center with np.asarray() (np.matrix rejected in 1.7)
- cluster/icdm.py: cap TSNE perplexity dynamically (must be < n_samples)

## NumPy compatibility
- helpers.py: np.in1d → np.isin (removed in NumPy 2.0)
- contrib/missing/dispersion.py: np.string_ → np.bytes_, np.unicode_ → np.str_ (removed in 1.24)
- text/dispersion.py: wrap generators in list() for np.stack() (NumPy 2.0)
- cluster/icdm.py: interpolation= → method= in np.percentile (renamed in 1.22)

## matplotlib compatibility
- regressor/influence.py: remove use_line_collection=True from ax.stem() (removed in 3.7)

## Test suite updates
- Updated 13 test files for deprecated APIs and new sklearn/scipy numerical values
- Regenerated ~96 baseline PNG images for matplotlib 3.10 rendering
- Added tests/test_compat.py with 25 targeted regression tests for each fix
- Updated requirements.txt and setup.py: sklearn>=1.6, numpy>=1.24, scipy>=1.7, matplotlib>=3.5, Python 3.9–3.13

Result: 1127 passed, 0 failed (vs 337 failures before)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

jg585Username · 2026-05-12T08:31:25Z

Relationship to Other Open Modernization PRs

While working on this PR, I noticed several other open attempts at the same problem. Here's how they relate:

#1322 and #1325 (lwgray) — Both are subsets of this PR. #1322 only removes the use_line_collection parameter from ax.stem(). #1325 covers that plus a handful of other matplotlib/NumPy fixes, but stops well short of full compatibility.

#1332 and #1333 (PythonCharmers) — These fix types.py and contrib/wrapper.py only, targeting the sklearn 1.6 tags API. Their approach for types.py (delegating directly to sklearn's own is_classifier/regressor/clusterer) is arguably simpler than ours, but neither PR addresses the sklearn 1.8 Pipeline.__sklearn_is_fitted__ breakage or any of the NumPy/matplotlib issues.

#1329 (lwgray) — The most comprehensive existing attempt, targeting sklearn 1.7, NumPy 2.0, and matplotlib 3.10. Covers roughly 70% of the same ground as this PR. The key differences are:

They skip the pipeline validation tests rather than fixing the root cause; this PR adds __sklearn_is_fitted__() to ModelVisualizer so Pipeline.score() works correctly on sklearn 1.8
They don't address _check_targets() returning a 4-tuple (sklearn 1.8), TSNE perplexity validation, np.matrix rejection, or multi_class="auto" removal
This PR targets sklearn 1.8 specifically and achieves 1127 passed, 0 failed
This PR adds tests/test_compat.py with 25 targeted regression tests so future package upgrades surface breakages immediately

In short: this PR is a strict superset of all the above. If the maintainers prefer a smaller, more incremental approach, merging #1329 first and then layering the remaining sklearn 1.8 fixes on top would also be a viable path.

lwgray · 2026-05-22T15:07:50Z

@jg585Username I will take a look at this this weekend.

+            etype = estimator.__sklearn_tags__().estimator_type
+            if etype is not None:
+                return etype
+        except Exception:


+from sklearn.naive_bayes import GaussianNB
+from sklearn.impute import SimpleImputer
+from sklearn.pipeline import Pipeline
+from sklearn.datasets import make_classification, make_regression


- Remove unused make_classification import in test_compat.py
- Hoist sklearn imports in base.py to module top (called by Pipeline)
- Simplify ContribEstimator.__sklearn_tags__: drop dead _HAS_SKLEARN_TAGS flag since sklearn>=1.6 is now a hard requirement
- Add explanatory comment to bare except in _get_estimator_type so the defensive fallback is clear to readers and CodeQL

- Format four files modified in this PR with project's pinned black 22.6.0
- Shorten docstrings on __sklearn_is_fitted__ and test_percentile_method_kwarg
- Rewrap error message in ContribEstimator.__getattr__
- Replace long sklearn source URLs with short function references in is_classifier / is_regressor docstrings

jg585Username closed this May 12, 2026

jg585Username reopened this May 12, 2026

lwgray self-assigned this May 22, 2026

github-advanced-security AI found potential problems May 22, 2026

jg585Username added 2 commits May 22, 2026 10:25

jg585Username wants to merge 3 commits into DistrictDataLabs:develop from jg585Username:feature/modernize-sklearn-numpy-compat

3 participants
