Add basic xgboost.interpret.shap_values API #12208

Draft
RAMitchell wants to merge 2 commits into dmlc:master from RAMitchell:codex/interpret-basic

@RAMitchell
Member

Summary

Adds an initial xgboost.interpret module with a basic shap_values function.

This is a small first step toward #11947. The implementation wraps the existing Booster.predict(pred_contribs=True) path and keeps the behavior intentionally narrow while establishing the public module/function entry point.

Changes

  • Add xgboost.interpret.shap_values
  • Accept either a Booster or sklearn-style XGBoost model
  • Accept DMatrix or array-like inputs
  • Return feature SHAP values without the bias column by default
  • Support return_bias=True to return (values, bias)
  • Support optional temporary device= override, restoring the booster config after prediction
  • Reject X_background for now, since interventional SHAP is not implemented yet
  • Add focused Python tests
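
The bias handling described above can be sketched as follows. This is only an illustration under stated assumptions, not the code in this PR: `shap_values_sketch` and `split_contribs` are hypothetical names, and the exception type raised for `X_background` is a guess. It relies on the fact that `Booster.predict(pred_contribs=True)` returns one extra trailing column holding the bias term.

```python
import numpy as np


def split_contribs(contribs):
    """Split pred_contribs-style output into (feature values, bias)."""
    contribs = np.asarray(contribs)
    # The last entry along the final axis is the bias term; everything before
    # it is a per-feature contribution. Indexing with `...` also covers
    # multi-class output of shape (n_samples, n_classes, n_features + 1).
    return contribs[..., :-1], contribs[..., -1]


def shap_values_sketch(booster, data, *, return_bias=False, X_background=None):
    """Hypothetical stand-in for the wrapper, not the PR's implementation."""
    if X_background is not None:
        # Assumption: the exception type used by the PR is not shown here.
        raise NotImplementedError("Interventional SHAP is not implemented yet.")
    # `booster` is anything exposing predict(data, pred_contribs=True).
    contribs = booster.predict(data, pred_contribs=True)
    values, bias = split_contribs(contribs)
    return (values, bias) if return_bias else values
```

With a real model, `booster` would be an `xgboost.Booster` and `data` a `DMatrix`. For each row, the feature values plus the bias sum to the raw (margin) prediction.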

Notes

This does not add generated docs yet. The function has a docstring, but a dedicated Sphinx page can follow once the initial API shape is agreed.

Testing

PYTHONPATH=/home/nfs/rorym/xgboost-wt/interpret-basic/python-package \
  conda run -n xgboost python -m pytest tests/python/test_interpret.py

Result: 5 passed

Pre-commit passed during commit, including ruff, ruff format, and pylint.

Contributor

Copilot AI left a comment

Pull request overview

Adds a new public Python entry point for interpretability by introducing xgboost.interpret.shap_values, initially implemented as a thin wrapper around Booster.predict(pred_contribs=True) and exposing a stable module/function surface for future interpretability features (per #11947).

Changes:

  • Introduce python-package/xgboost/interpret.py with a first-pass shap_values() API (bias-column handling, optional device override, background data explicitly unsupported for now).
  • Export the new interpret module from xgboost.__init__ so that `from xgboost import interpret` works.
  • Add focused pytest coverage for API behavior and device override config restoration.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| tests/python/test_interpret.py | Adds unit tests covering shap_values parity with pred_contribs, sklearn-model acceptance, background-data rejection, and config restoration. |
| python-package/xgboost/interpret.py | Implements the new xgboost.interpret.shap_values wrapper and related helpers. |
| python-package/xgboost/__init__.py | Exposes the new interpret module at the package top level and in __all__. |

Comment on lines +28 to +31
    if iteration_range is None:
        get_iteration_range = getattr(model, "_get_iteration_range", None)
        if get_iteration_range is not None:
            return get_iteration_range(iteration_range)
Comment on lines +39 to +42
    if isinstance(X, DMatrix):
        if feature_names is not None:
            X.feature_names = feature_names
        return X
Comment on lines +64 to +69
    config = booster.save_config()
    try:
        booster.set_param({"device": device})
        return booster.predict(data, **kwargs)
    finally:
        booster.load_config(config)
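
The save/restore pattern in the last snippet could also be written as a context manager. The sketch below is one possible shape, not what the PR does: `temporary_device` is a hypothetical helper name. It uses only the `save_config`, `set_param`, and `load_config` calls that appear in the snippet.

```python
from contextlib import contextmanager


@contextmanager
def temporary_device(booster, device):
    """Temporarily override the booster's device, then restore its config."""
    # Snapshot the full configuration before touching anything.
    config = booster.save_config()
    try:
        booster.set_param({"device": device})
        yield booster
    finally:
        # Runs whether the body succeeds or raises, so the caller's
        # booster is never left on the overridden device.
        booster.load_config(config)
```

A caller would wrap the prediction in `with temporary_device(booster, device): ...`. Restoring in `finally` is what keeps the override temporary even when prediction fails.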