Added the material for XGBoost optimization #30

Open
bbhattar wants to merge 8 commits into intel:main from bbhattar:xgboost

@bbhattar

Added the materials for XGBoost optimization. Please review and give me your feedback.

Comment thread software/xgboost/README.md Outdated
rsiyer-intel requested a review from adgubrud on May 18, 2026, 19:29
@rsiyer-intel

Collaborator

Since the latest changes still have perf data, it cannot be approved till we get perf claim pre-requisites fulfilled.

@david-cortes-intel david-cortes-intel left a comment

Contributor

General comment: this guide says 'xgboost', but it is limited to predictions/inference, while a similar guide could also be done for training, covering details like threading, hyperparameters to try, and similar.

@razdoburdin razdoburdin left a comment

please update installation instructions and consider switching to the actual versions of the software.

…-learn, removing memory allocator section, and clarifying the scope to include all 3 methods
@bbhattar

Author

> Since the latest changes still have perf data, it cannot be approved till we get perf claim pre-requisites fulfilled.

@rsiyer-intel Updated the doc with PDT approved data

@bbhattar

Author

> please update installation instructions and consider switching to the actual versions of the software.

Done and done

Comment thread software/xgboost/README.md Outdated
- For multiclass classification, default XGBoost, LightGBM, and daal4py all use one tree per class. CatBoost, on the other hand, uses vectorized trees. This means all other approaches end up processing `num_classes x` more trees compared to CatBoost, e.g., 7,000 vs 1,000 for Covtype. For smaller `num_estimators` like `100`, `daal4py` outperforms CatBoost, but as `num_estimators` gets larger, CatBoost provides better inference latency.
+ For multiclass classification, XGBoost, LightGBM, and daal4py (with default settings as of the tested versions) use one tree per class, while CatBoost uses symmetric (oblivious) trees that handle all classes in a single tree. This means daal4py ends up processing `num_classes × num_estimators` trees compared to CatBoost's `num_estimators` trees (e.g., 7,000 vs 1,000 for Covtype with 7 classes). As a result, CatBoost can provide better inference latency for multiclass tasks with many classes and large ensembles.
+
+ > **Note:** XGBoost is moving towards multi-output trees (via `multi_strategy="multi_output_tree"`) which would reduce this gap by handling all classes in a single tree, similar to CatBoost. Check the [XGBoost documentation](https://xgboost.readthedocs.io/en/latest/tutorials/multioutput.html) for the latest defaults.

Contributor

This tutorial doesn't show what the defaults are.
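
For reference, the tree-count arithmetic in the README excerpt above (7,000 vs 1,000 for Covtype) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `trees_evaluated` is a hypothetical helper, not part of XGBoost, LightGBM, daal4py, or CatBoost, and it counts trees per prediction only. It says nothing about actual latency, which also depends on tree depth, threading, and implementation.

```python
def trees_evaluated(num_estimators: int, num_classes: int,
                    one_tree_per_class: bool) -> int:
    """Count the trees traversed to score one sample (hypothetical helper).

    one_tree_per_class=True models ensembles that grow a separate tree for
    every class in each boosting round; False models ensembles where a single
    tree per round produces outputs for all classes.
    """
    if num_estimators < 0 or num_classes < 1:
        raise ValueError("num_estimators must be >= 0 and num_classes >= 1")
    if one_tree_per_class:
        return num_estimators * num_classes
    return num_estimators


# Covtype has 7 classes; 1,000 boosting rounds as in the README example.
COVTYPE_CLASSES = 7
per_class = trees_evaluated(1000, COVTYPE_CLASSES, one_tree_per_class=True)
single_tree = trees_evaluated(1000, COVTYPE_CLASSES, one_tree_per_class=False)
print(per_class, single_tree)
```

With these inputs the two counts differ by a factor of `num_classes`, which is the gap the README paragraph describes.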

…e mention of undefined default, removed unnecessary symmetric tree mention

6 participants