refactored decision tree into reusable components #316
Mec-iS merged 4 commits into smartcorelib:development from
Conversation
#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
#[derive(Debug, Clone)]
struct Node {
this should probably be Node<TX>; why should it be limited to f64? In general, every time you write a fixed numerical type you should ask yourself if it is generalisable.
All this code is copied and pasted from the DecisionTree file with the difference being that I use the name BaseTree instead of DecisionTree. This implementation is already in your codebase: https://github.com/smartcorelib/smartcore/blob/development/src/tree/decision_tree_regressor.rs#L129.
Plus, Node is an implementation detail (no pub keyword), so it doesn't matter what numeric type it contains.
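For illustration, a minimal sketch of the generic Node the reviewer is asking for; the field names here are hypothetical, not the actual smartcore layout:

```rust
// Hypothetical sketch: Node made generic over its output type TX,
// instead of hard-coding f64. Field names are illustrative only.
#[derive(Debug, Clone)]
struct Node<TX> {
    output: TX,
    true_child: Option<usize>,
    false_child: Option<usize>,
}

fn main() {
    // Once generic, the same struct works for f64 and f32 alike.
    let a: Node<f64> = Node { output: 1.5, true_child: None, false_child: None };
    let b: Node<f32> = Node { output: 1.5, true_child: None, false_child: None };
    assert_eq!(a.output, 1.5_f64);
    assert_eq!(b.output, 1.5_f32);
}
```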
impl PartialEq for Node {
    fn eq(&self, other: &Self) -> bool {
        (self.output - other.output).abs() < f64::EPSILON
again, this will work only for f64?
Node is an implementation detail
#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
#[derive(Debug, Clone, Default)]
pub enum Splitter {
#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
#[derive(Debug, Clone)]
/// Parameters of Regression base_tree
this should explain a little more, like:
/// Parameters for configuring the behavior of a decision tree regressor.
///
/// This struct controls various aspects of tree construction including
/// depth limits, sample requirements, and splitting strategies.
let mut order: Vec<Vec<usize>> = Vec::new();
for i in 0..num_attributes {
    let mut col_i: Vec<TX> = x.get_col(i).iterator(0).copied().collect();
maybe this one could just be:
let mut col_view: Vec<f64> = x.get_col(i).iterator(0).copied().collect();
let indices = col_view.argsort_mut();
I copied and pasted this code from the decision tree file. That would probably be more readable. I'll be honest, it was pretty hard to read the decision tree implementation, but it is indeed optimal.
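Note that `argsort_mut` in the suggestion above is not a std Vec method; a minimal sketch of the idea (a hypothetical helper returning the indices that would sort a column) could look like:

```rust
// Hypothetical helper sketching the argsort idea: return the indices
// that would sort `values` ascending, without moving the values.
fn argsort<T: PartialOrd>(values: &[T]) -> Vec<usize> {
    let mut indices: Vec<usize> = (0..values.len()).collect();
    // partial_cmp + unwrap is fine as long as the column contains no NaN.
    indices.sort_by(|&a, &b| values[a].partial_cmp(&values[b]).unwrap());
    indices
}

fn main() {
    let col = vec![3.0, 1.0, 2.0];
    // Smallest value is at index 1, then index 2, then index 0.
    assert_eq!(argsort(&col), vec![1, 2, 0]);
}
```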
fn parameters(&self) -> &BaseTreeRegressorParameters {
    self.parameters.as_ref().unwrap()
}
/// Get estimate of intercept, return value
I don't think this comment is correct, please double-check
That's already written in the codebase.
Thank you! This is a nice starting point to make it possible to implement:
This opens up to implementations like:

trait TreeRegressor<TX, TY, X, Y> {
    fn fit_tree(&self, x: &X, y: &Y, samples: Vec<usize>) -> Result<Self, Failed>;
    fn predict_tree(&self, x: &X) -> Result<Y, Failed>;
}

trait EnsembleRegressor<TX, TY, X, Y> {
    fn fit_ensemble(&self, x: &X, y: &Y, n_estimators: usize) -> Result<Self, Failed>;
    fn predict_ensemble(&self, x: &X) -> Result<Y, Failed>;
}

is that correct?

Recommended Implementation Priority
I think we should build Base models that other, more complex models can build on top of.
I don't know if traits are necessary. I think all we need is, for example, a complex model that stores the base model as an attribute; then we can add extra logic on top of the base model.
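A minimal sketch of that composition approach, with `BaseTree` as a toy stand-in for the refactored base tree type (all names and fields here are hypothetical, not the actual smartcore API):

```rust
// Toy stand-in for the refactored base tree: each "tree" predicts a constant.
struct BaseTree {
    constant: f64,
}

impl BaseTree {
    fn predict(&self, _x: &[f64]) -> f64 {
        self.constant
    }
}

// The complex model stores base models as an attribute and adds its own
// logic on top (here, averaging the base predictions) -- no traits needed.
struct Ensemble {
    trees: Vec<BaseTree>,
}

impl Ensemble {
    fn predict(&self, x: &[f64]) -> f64 {
        let sum: f64 = self.trees.iter().map(|t| t.predict(x)).sum();
        sum / self.trees.len() as f64
    }
}

fn main() {
    let ens = Ensemble {
        trees: vec![BaseTree { constant: 1.0 }, BaseTree { constant: 3.0 }],
    };
    assert_eq!(ens.predict(&[0.0]), 2.0);
}
```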
No description provided.