Skip to content

implemented xgdboost_regression#314

Merged
Mec-iS merged 3 commits intosmartcorelib:developmentfrom
DanielLacina:xgboost_regression
Jul 9, 2025
Merged

implemented xgdboost_regression#314
Mec-iS merged 3 commits intosmartcorelib:developmentfrom
DanielLacina:xgboost_regression

Conversation

@DanielLacina
Copy link
Copy Markdown
Contributor

No description provided.

@DanielLacina DanielLacina requested a review from Mec-iS as a code owner July 8, 2025 23:22
Comment thread src/xgboost/xgb_regressor.rs Outdated
/// * `lambda` - L2 regularization term on weights.
/// * `gamma` - Minimum loss reduction required to make a further partition.
pub fn fit(
data: &Vec<Vec<f64>>,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are we using allocation here?

all the fit(...) methods should have a similar signature and use the Array types. Like you did in:

impl<TX: Number, TY: Number, X: Array2<TX>, Y: Array1<TY>> AgglomerativeClustering<TX, TY, X, Y> {
    /// Fits the agglomerative clustering model to the data.
    ///
    /// # Arguments
    /// * `data` - A reference to the input data matrix.
    /// * `parameters` - The parameters for the clustering algorithm, including `n_clusters`.
    ///
    /// # Returns
    /// A `Result` containing the fitted model with cluster labels, or an error if
    pub fn fit(data: &X, parameters: AgglomerativeClusteringParameters) -> Result<Self, Failed> {
        let (num_samples, _) = data.shape();
        let n_clusters = parameters.n_clusters;
        if n_clusters > num_samples {
            return Err(Failed::because(
                FailedError::ParametersError,
                &format!(
                    "n_clusters: {n_clusters} cannot be greater than n_samples: {num_samples}"
                ),
            ));
        }

Same in other parts, whenever we use Vec we should ask if it is not the case to use Array. Vec can be used for labels and sequence that we are sure are relatively small order of magnitude (usually 100-300 elements). For everything that is 2D or require large arrays, the Array interface should be implemented using DenseMatrix or the Array traits.

Comment thread src/xgboost/xgb_regressor.rs Outdated

/// Predicts the output values for a dataset.
pub fn predict(&self, data: &[Vec<f64>]) -> Vec<f64> {
data.iter().map(|row| self.predict_for_row(row)).collect()
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as above this iteration should use the .iterator(0).copied() implementation. There is a custom iterator for Array, but if you don't use Array you cannot access it. This is also important for compatibility with other libraries like ndarray

Comment thread src/xgboost/xgb_regressor.rs
@Mec-iS
Copy link
Copy Markdown
Collaborator

Mec-iS commented Jul 9, 2025

Thanks 👍🏼

@DanielLacina
Copy link
Copy Markdown
Contributor Author

resolved the problems

@DanielLacina
Copy link
Copy Markdown
Contributor Author

I think the lints are resolved

@Mec-iS Mec-iS self-requested a review July 9, 2025 14:25
Copy link
Copy Markdown
Collaborator

@Mec-iS Mec-iS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondeful!

@Mec-iS Mec-iS merged commit 5cc5528 into smartcorelib:development Jul 9, 2025
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants