Commit 1c351c9 (2 parents: c3f8fd1 + a37489a)
Merge pull request #182 from codeharborhub/dev-1
Update Docs: ml-docs

2 files changed: +206 -0 lines changed

---
title: "LIME & SHAP: Interpreting the Black Box"
sidebar_label: LIME & SHAP
description: "A deep dive into local explanations using LIME and game-theory-based global/local explanations with SHAP."
tags: [machine-learning, xai, lime, shap, interpretability]
---

When using complex models like XGBoost, Random Forests, or Neural Networks, we often need to know: *"Why did the model deny this specific loan?"* or *"What features are driving the predictions for all users?"* **LIME** and **SHAP** are the industry standards for answering these questions.

## 1. LIME (Local Interpretable Model-Agnostic Explanations)

LIME works on the principle that while a model may be incredibly complex globally, it can be approximated by a simple, linear model **locally** around a specific data point.

### How LIME Works:

1. **Select a point:** Choose the specific instance you want to explain.
2. **Perturb the data:** Create new "fake" samples by slightly varying the features of that point.
3. **Get predictions:** Run these fake samples through the "black box" model.
4. **Weight the samples:** Give more weight to samples that are closer to the original point.
5. **Train a surrogate:** Train a simple Linear Regression model on this weighted, perturbed dataset.
6. **Explain:** The coefficients of the linear model act as the explanation.
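The six steps above can be sketched by hand in a few lines. This is a toy illustration, not the `lime` library itself: the black-box model, perturbation scale, and kernel width are all made-up choices for demonstration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

# A "black box": a random forest trained on a synthetic regression task
X = rng.normal(size=(500, 4))
y = X[:, 0] * 3.0 + np.sin(X[:, 1]) + rng.normal(scale=0.1, size=500)
black_box = RandomForestRegressor(random_state=0).fit(X, y)

def lime_explain(model, x, n_samples=1000, kernel_width=0.75):
    # Steps 1-2: perturb the chosen point into many fake samples
    Z = x + rng.normal(scale=0.5, size=(n_samples, x.size))
    # Step 3: query the black box on the fake samples
    preds = model.predict(Z)
    # Step 4: weight samples by proximity to x (RBF kernel on distance)
    dists = np.linalg.norm(Z - x, axis=1)
    weights = np.exp(-(dists ** 2) / kernel_width ** 2)
    # Step 5: fit a weighted linear surrogate
    surrogate = Ridge(alpha=1.0).fit(Z, preds, sample_weight=weights)
    # Step 6: the coefficients are the local explanation
    return surrogate.coef_

coefs = lime_explain(black_box, X[0])
print(coefs)  # feature 0 should dominate, since y is driven mostly by x0
```

Because `y` depends mainly on the first feature, the surrogate's first coefficient should be the largest in magnitude.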

## 2. SHAP (SHapley Additive exPlanations)

SHAP is based on **Shapley Values** from cooperative game theory. It treats every feature of a model as a "player" in a game where the "payout" is the model's prediction.

### The Core Concept:

SHAP calculates the contribution of each feature by comparing what the model predicts **with** the feature versus **without** it, averaged over all possible coalitions (subsets) of the other features.

**The result is "Additive":**

$$
f(x) = \text{base\_value} + \sum_i \text{shap\_value}_i
$$

* **Base Value:** The average prediction of the model across the training set.
* **SHAP Value:** The amount a specific feature pushed the prediction higher or lower than that average.
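The "with vs. without" averaging can be made concrete on a tiny model where exact Shapley values are computable by brute force. This is a hand-rolled sketch with made-up numbers; the `shap` library computes the same quantities far more efficiently.

```python
import itertools
import math
import numpy as np

# Toy model: a known linear function, so exact Shapley values are checkable
def model(x):  # x: array of 3 features
    return 2.0 * x[0] + 1.0 * x[1] - 3.0 * x[2]

X_background = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]])  # "training" data
x = np.array([1.0, 2.0, 0.5])  # the instance to explain

def value(coalition):
    # Expected model output when only the features in `coalition` are known;
    # the missing features are averaged over the background data.
    total = 0.0
    for bg in X_background:
        z = bg.copy()
        z[list(coalition)] = x[list(coalition)]
        total += model(z)
    return total / len(X_background)

n = len(x)
phi = np.zeros(n)
for i in range(n):
    for size in range(n):
        for S in itertools.combinations([j for j in range(n) if j != i], size):
            # Shapley weight: |S|! * (n - |S| - 1)! / n!
            w = math.factorial(len(S)) * math.factorial(n - len(S) - 1) / math.factorial(n)
            phi[i] += w * (value(S + (i,)) - value(S))

base = value(())  # average prediction = base value
# Additivity: base + sum of SHAP values reconstructs the prediction exactly
print(base + phi.sum(), model(x))
```

For a linear model, each Shapley value reduces to the coefficient times the feature's deviation from its background mean, which the brute-force loop reproduces exactly.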

## 3. LIME vs. SHAP: The Comparison

| Feature | LIME | SHAP |
| :--- | :--- | :--- |
| **Foundation** | Local Linear Surrogates | Game Theory (Shapley Values) |
| **Consistency** | Can be unstable (results vary with perturbation) | Mathematically consistent and fair |
| **Speed** | Very Fast | Can be slow (computationally expensive) |
| **Scope** | Strictly Local | Both Local and Global |

## 4. Visualization Logic

The following diagram illustrates how SHAP builds the final prediction from the base value, one feature contribution at a time.

```mermaid
graph TD
    Base[Base Value / Mean: 0.50] --> F1

    subgraph Feature_Contributions [Shapley Attribution]
        F1[Income: +0.20]
        F2[Credit Score: +0.10]
        F3[Age: +0.10]
        F4[Debt: -0.05]
    end

    F1 --> F2
    F2 --> F3
    F3 --> F4
    F4 --> Final[Final Prediction: 0.85]

    style F1 fill:#e8f5e9,stroke:#2e7d32,color:#333
    style F4 fill:#ffebee,stroke:#c62828,color:#333
    style Final fill:#e1f5fe,stroke:#01579b,color:#333
```

## 5. Global Interpretation with SHAP

One of SHAP's greatest strengths is the **Summary Plot**. By aggregating the SHAP values of all points in a dataset, we can see:

1. Which features are the most important.
2. How the *value* of a feature (high vs. low) impacts the output.
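Under the hood, the global ranking in a summary plot is just an aggregation of local values. A minimal sketch with made-up SHAP values (the feature names are hypothetical):

```python
import numpy as np

# Synthetic SHAP values: rows = samples, columns = features (made-up numbers)
shap_values = np.array([
    [ 0.20, -0.05,  0.01],
    [-0.30,  0.10,  0.02],
    [ 0.25, -0.08, -0.01],
])
feature_names = ["income", "debt", "age"]  # hypothetical features

# Global importance = mean absolute SHAP value per feature
importance = np.abs(shap_values).mean(axis=0)
ranking = [feature_names[i] for i in np.argsort(importance)[::-1]]
print(ranking)  # → ['income', 'debt', 'age']
```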

## 6. Implementation Sketch (Python)

Both LIME and SHAP have robust Python libraries. The sketch below assumes a fitted tree-based classifier `my_model` and pandas DataFrames `X_train` / `X_test`.

```python
import shap
import lime.lime_tabular

# --- SHAP Example ---
# TreeExplainer is the fast path for tree ensembles (XGBoost, Random Forest, ...)
explainer = shap.TreeExplainer(my_model)
shap_values = explainer.shap_values(X_test)

# Visualize the first prediction's explanation
shap.initjs()
shap.force_plot(explainer.expected_value, shap_values[0, :], X_test.iloc[0, :])

# --- LIME Example ---
explainer_lime = lime.lime_tabular.LimeTabularExplainer(
    X_train.values,
    feature_names=list(X_train.columns),
    class_names=["Negative", "Positive"],
)
exp = explainer_lime.explain_instance(X_test.values[0], my_model.predict_proba)
exp.show_in_notebook()
```

## References

* **SHAP GitHub:** [Official Repository and Documentation](https://github.com/slundberg/shap)
* **LIME Paper:** ["Why Should I Trust You?": Explaining the Predictions of Any Classifier](https://arxiv.org/abs/1602.04938)
* **Interpretable ML Book:** [SHAP Chapter](https://christophm.github.io/interpretable-ml-book/shap.html)

---

**LIME and SHAP help us understand tabular and text data. But what if we are using CNNs for images? How do we know which pixels the model is looking at?**

---
title: "XAI Basics: Beyond the Black Box"
sidebar_label: XAI Basics
description: "An introduction to Explainable AI, its importance in ethics and regulation, and the trade-off between performance and interpretability."
tags: [machine-learning, xai, ethics, transparency, interpretability]
---

As Machine Learning models become more complex (like Deep Neural Networks and Transformers), they often become **"Black Boxes."** We can see the input and the output, but we don't truly understand *why* the model made a specific decision.

**Explainable AI (XAI)** is a set of processes and methods that allows human users to comprehend and trust the results and output created by machine learning algorithms.

## 1. Why do we need XAI?

In many industries, a simple "prediction" isn't enough. We need justification for the following reasons:

* **Trust and Accountability:** If a medical AI diagnoses a patient, the doctor needs to know which features (symptoms) led to that conclusion.
* **Bias Detection:** XAI helps uncover whether a model is making decisions based on protected attributes like race, gender, or age.
* **Regulatory Compliance:** Laws like the **GDPR** include a "right to explanation," meaning users can demand to know how an automated decision was made about them.
* **Model Debugging:** Understanding why a model failed is the first step toward fixing it.

## 2. The Interpretability vs. Accuracy Trade-off

There is generally an inverse relationship between how well a model performs and how easy it is to explain.

| Model Type | Interpretability | Accuracy (Complex Data) |
| :--- | :--- | :--- |
| **Linear Regression** | High (Coefficients) | Low |
| **Decision Trees** | High (Visual paths) | Medium |
| **Random Forests** | Medium | High |
| **Deep Learning** | Low (Black Box) | Very High |
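The high-interpretability end of this table can be seen directly: a linear model's learned coefficients *are* its explanation. A sketch on scikit-learn's built-in diabetes dataset:

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression

# Fit a fully transparent model: one coefficient per feature
X, y = load_diabetes(return_X_y=True)
model = LinearRegression().fit(X, y)

# Each coefficient is a direct, global statement about the model's behavior
for name, coef in zip(load_diabetes().feature_names, model.coef_):
    print(f"{name:>4}: {coef:+.1f}")
```

The trade-off is visible in the other direction too: this transparency comes at the cost of only capturing linear effects.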

## 3. Key Concepts in XAI

To navigate the world of explainability, we must distinguish between different scopes and methods:

### A. Intrinsic vs. Post-hoc

* **Intrinsic (Ante-hoc):** Models that are simple enough to be self-explanatory (e.g., a small Decision Tree).
* **Post-hoc:** Methods applied *after* a complex model is trained to extract explanations (e.g., SHAP, LIME).

### B. Global vs. Local Explanations

* **Global Explainability:** Understanding the *entire* logic of the model. "What features are most important for all predictions?"
* **Local Explainability:** Understanding a *single* specific prediction. "Why was *this* specific loan application rejected?"
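The intrinsic case can be demonstrated in a few lines: a small decision tree prints its own decision rules, no post-hoc method needed. A sketch using scikit-learn's iris dataset:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

# An intrinsically interpretable model: a shallow decision tree
X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# The model IS its own explanation: the split rules can be printed directly
rules = export_text(tree, feature_names=load_iris().feature_names)
print(rules)
```

A deep neural network has no equivalent of `export_text`, which is exactly why post-hoc methods like SHAP and LIME exist.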

## 4. Logical Framework of XAI (Mermaid)

The following diagram illustrates how XAI bridges the gap between the Machine Learning model and the Human User.

```mermaid
graph LR
    Data[Training Data] --> ML_Model[Complex ML Model]
    ML_Model --> Prediction[Prediction/Result]

    subgraph XAI_Layer [Explainability Layer]
        ML_Model --> Explain_Method[XAI Method: SHAP/LIME/Grad-CAM]
        Explain_Method --> Explanation[Human-Interpretable Explanation]
    end

    Explanation --> User((Human User))
    Prediction --> User

    style XAI_Layer fill:#e8f5e9,stroke:#2e7d32,stroke-width:2px,color:#333
    style ML_Model fill:#f5f5f5,stroke:#333,color:#333
    style Explanation fill:#fff3e0,stroke:#ef6c00,color:#333
```

## 5. Standard XAI Techniques

We will cover these in detail in the following chapters:

1. **Feature Importance:** Ranking which variables had the biggest impact on the model.
2. **Partial Dependence Plots (PDP):** Showing how a feature affects the outcome while holding others constant.
3. **LIME:** Approximating a complex model locally with a simpler, interpretable one.
4. **SHAP:** Using game theory to fairly attribute the "payout" (prediction) to each "player" (feature).
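Partial dependence (item 2 above) is simple enough to compute by hand: fix a feature to each value on a grid, overwrite it across the whole dataset, and average the predictions. A sketch on synthetic data where feature 0 has a known quadratic effect:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# Synthetic data: the target depends quadratically on feature 0
X = rng.uniform(-2, 2, size=(400, 3))
y = X[:, 0] ** 2 + 0.5 * X[:, 1] + rng.normal(scale=0.05, size=400)
model = GradientBoostingRegressor(random_state=0).fit(X, y)

def partial_dependence(model, X, feature, grid):
    # For each grid value, overwrite the feature in every row and average
    # the predictions: the marginal effect of that feature on the output.
    pd = []
    for v in grid:
        Xv = X.copy()
        Xv[:, feature] = v
        pd.append(model.predict(Xv).mean())
    return np.array(pd)

grid = np.linspace(-2, 2, 5)
pd0 = partial_dependence(model, X, 0, grid)
print(np.round(pd0, 2))  # roughly U-shaped, mirroring x0**2
```

In practice `sklearn.inspection.partial_dependence` does this for you; the loop above just makes the "hold others constant, average over the data" idea explicit.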

## 6. Evaluation Criteria for Explanations

What makes an explanation "good"?

* **Fidelity:** How accurately does the explanation represent what the model actually did?
* **Understandability:** Is the explanation simple enough for a non-technical user?
* **Robustness:** Does the explanation stay consistent for similar inputs?

## References

* **DARPA:** [Explainable Artificial Intelligence (XAI) Program](https://www.darpa.mil/program/explainable-artificial-intelligence)
* **Book:** [Interpretable Machine Learning by Christoph Molnar](https://christophm.github.io/interpretable-ml-book/)

---

**Now that we understand the "why," let's look at one of the most popular methods for explaining individual predictions.**
