
Commit 3127826

spec: add shap-summary specification (#2934)
## New Specification: `shap-summary`

Related to #2923

---

### specification.md

# shap-summary: SHAP Summary Plot

## Description

A SHAP (SHapley Additive exPlanations) summary plot displaying the distribution of SHAP values for each feature, ordered by mean absolute SHAP value (importance). Each dot represents a sample, positioned horizontally by its SHAP value and colored by the feature's value (typically low=blue to high=red). This visualization is essential for machine learning interpretability, showing both feature importance and the direction and magnitude of feature effects on model predictions.

## Applications

- Explaining gradient boosting or random forest model predictions to stakeholders by showing which features drive predictions up or down
- Identifying non-linear relationships where high and low feature values both push predictions in the same direction
- Detecting feature interactions by observing clustered or bimodal SHAP value distributions

## Data

- `shap_values` (numeric matrix) - SHAP values for each sample and feature, shape (n_samples, n_features)
- `feature_values` (numeric matrix) - Original feature values for coloring, same shape as shap_values
- `feature_names` (categorical) - Names of features for y-axis labels
- Size: 100-1000 samples recommended for clear distributions
- Example: SHAP values from shap.TreeExplainer on XGBoost/LightGBM model

## Notes

- Sort features by mean absolute SHAP value (most important at top)
- Use diverging color scale (blue-red) to indicate low-to-high feature values
- Add vertical line at x=0 to clearly separate positive and negative impacts
- Consider jittering points vertically to reduce overlap
- For many features, show only top 10-20 most important

---

**Next:** Add `approved` label to the issue to merge this PR.

---

:robot: *[spec-create workflow](https://github.com/MarkusNeusinger/pyplots/actions/runs/20612621261)*

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
1 parent 8dc645a commit 3127826

2 files changed

Lines changed: 58 additions & 0 deletions


Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
+# shap-summary: SHAP Summary Plot
+
+## Description
+
+A SHAP (SHapley Additive exPlanations) summary plot displaying the distribution of SHAP values for each feature, ordered by mean absolute SHAP value (importance). Each dot represents a sample, positioned horizontally by its SHAP value and colored by the feature's value (typically low=blue to high=red). This visualization is essential for machine learning interpretability, showing both feature importance and the direction and magnitude of feature effects on model predictions.
+
+## Applications
+
+- Explaining gradient boosting or random forest model predictions to stakeholders by showing which features drive predictions up or down
+- Identifying non-linear relationships where high and low feature values both push predictions in the same direction
+- Detecting feature interactions by observing clustered or bimodal SHAP value distributions
+
+## Data
+
+- `shap_values` (numeric matrix) - SHAP values for each sample and feature, shape (n_samples, n_features)
+- `feature_values` (numeric matrix) - Original feature values for coloring, same shape as shap_values
+- `feature_names` (categorical) - Names of features for y-axis labels
+- Size: 100-1000 samples recommended for clear distributions
+- Example: SHAP values from shap.TreeExplainer on XGBoost/LightGBM model
+
+## Notes
+
+- Sort features by mean absolute SHAP value (most important at top)
+- Use diverging color scale (blue-red) to indicate low-to-high feature values
+- Add vertical line at x=0 to clearly separate positive and negative impacts
+- Consider jittering points vertically to reduce overlap
+- For many features, show only top 10-20 most important
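
Review note: the ordering, top-k, jitter, and color-mapping rules in the Notes can be sketched without committing to any plotting library. The following is a minimal, standard-library-only illustration, not the implementation this repository will use. The function names `rank_features` and `summary_points`, the `jitter` and `seed` parameters, and the toy data are all hypothetical. Colors are returned as fractions in [0, 1] that a library's diverging colormap could consume.

```python
import random


def rank_features(shap_values, feature_names, top_k=None):
    """Return feature indices sorted by mean absolute SHAP value, largest first."""
    n_samples = len(shap_values)
    mean_abs = [
        sum(abs(row[j]) for row in shap_values) / n_samples
        for j in range(len(feature_names))
    ]
    order = sorted(range(len(feature_names)), key=lambda j: mean_abs[j], reverse=True)
    return order if top_k is None else order[:top_k]


def summary_points(shap_values, feature_values, feature_names,
                   top_k=None, jitter=0.2, seed=0):
    """Build (feature_name, x, y, color) tuples for a summary plot.

    x is the SHAP value, y is the feature's rank plus vertical jitter,
    and color is the feature value min-max scaled to [0, 1] per feature.
    """
    rng = random.Random(seed)  # seeded so the jitter is reproducible
    order = rank_features(shap_values, feature_names, top_k)
    points = []
    for rank, j in enumerate(order):
        column = [row[j] for row in feature_values]
        lo, hi = min(column), max(column)
        span = hi - lo
        for i, row in enumerate(shap_values):
            # A constant feature has no low/high contrast; use the midpoint.
            color = (column[i] - lo) / span if span else 0.5
            y = rank + rng.uniform(-jitter, jitter)
            points.append((feature_names[j], row[j], y, color))
    return order, points


# Toy data (hypothetical): 3 samples x 3 features.
feature_names = ["age", "income", "tenure"]
shap_values = [[0.1, -2.0, 0.5], [-0.3, 1.0, 0.5], [0.2, -1.5, -0.7]]
feature_values = [[1.0, 10.0, 5.0], [2.0, 30.0, 5.0], [3.0, 20.0, 5.0]]

order, points = summary_points(shap_values, feature_values, feature_names)
```

Rank 0 is the most important feature, so a renderer whose y-axis grows upward would need to invert the axis to place it at the top, and would still draw the vertical reference line at x=0 itself.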
Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
+# Specification-level metadata for shap-summary
+# Auto-synced to PostgreSQL on push to main
+
+spec_id: shap-summary
+title: SHAP Summary Plot
+
+# Specification tracking
+created: 2025-12-31T05:23:41Z
+updated: null
+issue: 2923
+suggested: MarkusNeusinger
+
+# Classification tags (applies to all library implementations)
+# See docs/concepts/tagging-system.md for detailed guidelines
+tags:
+  plot_type:
+    - shap
+    - strip
+    - swarm
+  data_type:
+    - numeric
+    - categorical
+  domain:
+    - machine-learning
+    - statistics
+    - general
+  features:
+    - interpretability
+    - color-mapped
+    - distribution
+    - ranking
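
Review note: a lightweight consistency check on this metadata could look like the sketch below. It is an assumption-laden illustration only. The required keys and the four tag categories are inferred from the fields visible in this file, not from `docs/concepts/tagging-system.md`, whose contents are not shown here. The `check_metadata` helper is hypothetical, and the metadata is written as a Python dict mirroring the YAML so the example needs no third-party parser.

```python
from datetime import datetime

# Dict mirroring the YAML fields added in this commit.
metadata = {
    "spec_id": "shap-summary",
    "title": "SHAP Summary Plot",
    "created": "2025-12-31T05:23:41Z",
    "updated": None,
    "issue": 2923,
    "suggested": "MarkusNeusinger",
    "tags": {
        "plot_type": ["shap", "strip", "swarm"],
        "data_type": ["numeric", "categorical"],
        "domain": ["machine-learning", "statistics", "general"],
        "features": ["interpretability", "color-mapped", "distribution", "ranking"],
    },
}

# Assumed, not confirmed: the categories seen in this file are the full set.
TAG_CATEGORIES = ("plot_type", "data_type", "domain", "features")


def check_metadata(meta):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for key in ("spec_id", "title", "created", "issue", "tags"):
        if meta.get(key) is None:
            problems.append(f"missing required field: {key}")
    created = meta.get("created")
    if created is not None:
        try:
            # Matches the UTC timestamp layout used in this file.
            datetime.strptime(created, "%Y-%m-%dT%H:%M:%SZ")
        except (TypeError, ValueError):
            problems.append("created is not a UTC timestamp like 2025-12-31T05:23:41Z")
    tags = meta.get("tags") or {}
    for category in TAG_CATEGORIES:
        values = tags.get(category)
        if not isinstance(values, list) or not values:
            problems.append(f"tags.{category} must be a non-empty list")
    return problems


problems = check_metadata(metadata)
```

`updated: null` is deliberately allowed, since a newly created specification has no update timestamp yet.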
