Add sklearn estimators feature_importances_ attribute by popescu-v · Pull Request #486 · KhiopsML/khiops-python

popescu-v · 2025-10-01T17:12:19Z

Also:

- drop `feature_evaluated_importances_` and the associated attributes;
- only put importances in `feature_used_importances_`.

TODO Before Asking for a Review

- Rebase your branch to the latest version of dev (or main for release PRs)
- Make sure all CI workflows are green
- When adding a public feature/fix: Update the Unreleased section of CHANGELOG.md (no date)
- Self-Review: Review "Files Changed" tab and fix any problems you find
- API Docs (only if there are changes in docstrings, rst files or samples):
  - Check that the docs build without warnings: see the log of the API Docs workflow
  - Check that your changes render well in HTML: download the API Docs artifact and open index.html
  - If there are any problems, it is faster to iterate by building the API Docs locally

folmos-at-orange · 2025-10-08T07:39:11Z

    n_features : int, default 100
-        *Multi-table only* : Maximum number of multi-table aggregate features to
-        construct. See :doc:`/multi_table_primer` for more details.
+        Maximum number of features to construct. See :doc:`/multi_table_primer`


I'd put "AutoML features"

I'd avoid having "AutoML" and rather use "Maximum number of features to construct automatically".

folmos-at-orange · 2025-10-08T08:10:11Z

+        # feature_used_names_ is not set if no variable is selected in the model
+        feature_used_names = getattr(self, "feature_used_names_", [])


This should be the else branch above.

IMHO, there is no need for an `else` here: either the condition in `if modeling_report.selected_variables is not None` is true, in which case `feature_used_names_` is created and set, or it is false, in which case the attribute is not created. The `getattr` that follows the `if` block guarantees that `feature_used_names` is a list: either the value retrieved from `feature_used_names_` (set in the `if` block above), or the empty list.

On second thought, I see your point: if there are no selected variables, we set the three attributes to empty / zero structures. We can then read the attributes directly afterwards, with no `getattr` needed. Hence, I'll do this (if we keep the importances).
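A minimal sketch of the `else`-branch variant discussed in this thread, assuming a report object that exposes `selected_variables`. The class `SketchEstimator`, the method `set_used_features`, the per-variable fields `name` and `importance`, and the choice of `n_features_used_` as the third attribute are assumptions made for illustration; they are not taken from the khiops-python code.

```python
from types import SimpleNamespace


class SketchEstimator:
    """Toy stand-in for an estimator; not the actual khiops-python class."""

    def set_used_features(self, modeling_report):
        selected = modeling_report.selected_variables
        if selected is not None:
            self.feature_used_names_ = [var.name for var in selected]
            self.feature_used_importances_ = [var.importance for var in selected]
        else:
            # No variable selected: still create the attributes, as empty values
            self.feature_used_names_ = []
            self.feature_used_importances_ = []
        self.n_features_used_ = len(self.feature_used_names_)
        return self


# Report without selected variables (hypothetical shape of the report object)
empty = SketchEstimator().set_used_features(SimpleNamespace(selected_variables=None))
# The attributes can be read directly, with no getattr fallback
print(empty.feature_used_names_, empty.n_features_used_)
```

Because both branches create the attributes, later code does not need a `getattr(self, "feature_used_names_", [])` fallback.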

This attribute quantifies the importance of each of the _input_ features, in their order of occurrence in the input dataset:
- if the feature is used, then its importance is retrieved from the report (as the average of its exact Shapley values across the training dataset)
- else, the importance is set to 0.0.
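The rule above can be sketched as follows, assuming the importances of the used features are available as a name-to-value mapping. The function name `input_feature_importances`, the feature names and the numeric values are hypothetical and chosen only for illustration; the actual computation in the PR may differ.

```python
def input_feature_importances(input_names, used_importances):
    """Return one importance per input feature, in input order.

    Features missing from `used_importances` (that is, not used by the
    model) get an importance of 0.0.
    """
    return [used_importances.get(name, 0.0) for name in input_names]


# Hypothetical input columns and report-derived importances
input_names = ["age", "workclass", "education"]
used_importances = {"age": 0.12, "education": 0.05}

print(input_feature_importances(input_names, used_importances))
# → [0.12, 0.0, 0.05]
```

The output keeps the column order of the input dataset, so it lines up with the columns passed to `fit`.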

folmos-at-orange · 2025-10-09T16:14:11Z

  comments, and dictionary and variable block internal comments.
 - (`core`) Dictionary `Rule` class and supporting API for serializing `Rule` instances.
 - (`core`) New way to add a variable to a dictionary using a complete specification.
+- (`core`) New API constants for rule names:


for rule names -> for rules used in automatic variable construction.

- separate construction rules into:
  - rules applied by default (`DEFAULT_CONSTRUCTION_RULES`);
  - calendar-related rules (`CALENDRICAL_CONSTRUCTION_RULES`);
- document the construction rules;
- fix the `construction_rules` parameter documentation in the Core API;
- fix the `n_features` parameter documentation in the Sklearn estimator API;
- update the `feature_importances_` attribute documentation in the Sklearn estimator API accordingly.

popescu-v · 2025-10-14T13:34:49Z

Not merging it as per the latest decisions (see #480 (comment)).

popescu-v linked an issue Oct 1, 2025 that may be closed by this pull request

Improve Feature Importance Support in Sklearn Khiops Estimators #480

Closed

popescu-v marked this pull request as draft October 1, 2025 17:13

popescu-v force-pushed the 480-improve-feature-importance-support-in-sklearn-khiops-estimators branch 3 times, most recently from 262381e to e5dbf4c Compare October 2, 2025 16:25

popescu-v requested review from folmos-at-orange and tramora October 2, 2025 16:25

popescu-v self-assigned this Oct 2, 2025

popescu-v force-pushed the 480-improve-feature-importance-support-in-sklearn-khiops-estimators branch 2 times, most recently from 06c2784 to 230dd96 Compare October 3, 2025 13:33

popescu-v marked this pull request as ready for review October 3, 2025 13:35

popescu-v force-pushed the 480-improve-feature-importance-support-in-sklearn-khiops-estimators branch 3 times, most recently from 0052b91 to 7501724 Compare October 3, 2025 14:02

folmos-at-orange requested changes Oct 3, 2025

popescu-v force-pushed the 480-improve-feature-importance-support-in-sklearn-khiops-estimators branch 3 times, most recently from 3b1b81a to 254c962 Compare October 7, 2025 17:46

folmos-at-orange reviewed Oct 8, 2025

popescu-v added 7 commits October 8, 2025 13:04

Silence Pandas deprecation warning

ffbd76e

Drop feature_evaluated_* and n_features_evaluated_ from sklearn estim…

4ef9b56

…ators Indeed, these characterize the analysis process, not the resulting model itself.

Drop level and weight from the .feature_used_importances_ estimator a…

9512cb3

…ttribute The level and weight of the features characterize the analysis process, not the resulting model itself.

Simplify feature_importances_ computation

6a59b45

Show the first 5 used feature importances in the sklearn classifier s…

2e47c38

…ample This enables a more even parallel with the (input) feature importances.

Fix docstring reference to metadata in dictionary getters

b14fa82

popescu-v force-pushed the 480-improve-feature-importance-support-in-sklearn-khiops-estimators branch from 254c962 to b7389c1 Compare October 9, 2025 15:59

folmos-at-orange approved these changes Oct 9, 2025

popescu-v force-pushed the 480-improve-feature-importance-support-in-sklearn-khiops-estimators branch from b7389c1 to 1cfb7a6 Compare October 10, 2025 12:11

Update CHANGELOG

9d19f9a

popescu-v force-pushed the 480-improve-feature-importance-support-in-sklearn-khiops-estimators branch from 1cfb7a6 to 9d19f9a Compare October 10, 2025 12:35

popescu-v closed this Oct 14, 2025

popescu-v mentioned this pull request Oct 14, 2025

Drop importances from khiops sklearn estimator attributes #494

Merged

6 tasks
