Add sklearn estimators `feature_importances_` attribute #486
n_features : int, default 100
*Multi-table only* : Maximum number of multi-table aggregate features to
construct. See :doc:`/multi_table_primer` for more details.
Maximum number of features to construct. See :doc:`/multi_table_primer`
I'd put "AutoML features"
I'd avoid having "AutoML" and rather use "Maximum number of features to construct automatically".
# feature_used_names_ is not set if no variable is selected in the model
feature_used_names = getattr(self, "feature_used_names_", [])
This should be the else branch above.
IMHO, there is no need for an `else` here: either the condition in `if modeling_report.selected_variables is not None` is true, in which case `feature_used_names_` is created and set, or it is false, in which case it is not. The `getattr` that follows the `if` block guarantees that `feature_used_names` is a list: either the one retrieved from `feature_used_names_` (set in the `if` block above) or the empty list.
But on second thought I see your point: we set the three attributes to empty / zero structures if there are no selected variables. Thus, we can read the attributes directly afterwards, without any `getattr`. Hence, I'll do this (if we keep the importances).
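To make the two options discussed here concrete, below is a minimal sketch of the `else`-branch variant. The class `SketchEstimator`, the method `_read_report`, and the shape of `selected_variables` (a plain list of names) are assumptions made for illustration only; the other two attributes mentioned in the comment are not named in this thread, so only `feature_used_names_` is shown.

```python
from types import SimpleNamespace


class SketchEstimator:
    """Hypothetical stand-in for the estimator; not the library's class."""

    def _read_report(self, modeling_report):
        # Assumption: selected_variables is either None or a list of names
        if modeling_report.selected_variables is not None:
            self.feature_used_names_ = list(modeling_report.selected_variables)
        else:
            # No variable selected: still define the attribute, as an empty list
            self.feature_used_names_ = []
        # The attribute is always defined at this point, so it can be read
        # directly, without a getattr(..., []) fallback
        return self.feature_used_names_


estimator = SketchEstimator()
empty_report = SimpleNamespace(selected_variables=None)
full_report = SimpleNamespace(selected_variables=["age", "income"])
names_when_empty = estimator._read_report(empty_report)
names_when_full = estimator._read_report(full_report)
```

With this layout every fitted estimator exposes the attribute, whether or not any variable was selected, which is the point raised in the review.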
This attribute quantifies the importance of each of the _input_ features, in their order of occurrence in the input dataset:
- if the feature is used, then its importance is retrieved from the report (as the average of its exact Shapley values across the training dataset);
- else, the importance is set to 0.0.
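The rule described in this commit message can be sketched as follows. The function name `input_feature_importances` and the representation of the report as a name-to-importance dictionary are assumptions for illustration, not the PR's actual implementation.

```python
def input_feature_importances(input_feature_names, used_importances):
    """Return one importance per input feature, in input order.

    input_feature_names: feature names in their order of occurrence
        in the input dataset.
    used_importances: hypothetical mapping from the name of each feature
        used by the model to its importance as read from the report.
    """
    # Features absent from the mapping are unused, so they get 0.0
    return [
        float(used_importances.get(name, 0.0))
        for name in input_feature_names
    ]


importances = input_feature_importances(
    ["age", "city", "income"],
    {"income": 0.7, "age": 0.3},
)
```

The result follows the order of the input features rather than the order in which the report lists the used ones.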
…ators Indeed, these characterize the analysis process, not the resulting model itself.
…ttribute The level and weight of the features characterize the analysis process, not the resulting model itself.
…ample This enables a more even parallel with the (input) feature importances.
comments, and dictionary and variable block internal comments.
- (`core`) Dictionary `Rule` class and supporting API for serializing `Rule` instances.
- (`core`) New way to add a variable to a dictionary using a complete specification.
- (`core`) New API constants for rule names:
for rule names -> for rules used in automatic variable construction.
- separate construction rules into:
  - rules applied by default (`DEFAULT_CONSTRUCTION_RULES`);
  - calendar-related rules (`CALENDRICAL_CONSTRUCTION_RULES`);
- document the construction rules;
- fix the `construction_rules` parameter documentation in the Core API;
- fix the `n_features` parameter documentation in the Sklearn estimator API;
- update the `feature_importances_` attribute documentation in the Sklearn estimator API accordingly.
Not merging it as per the latest decisions (see #480 (comment)).