Commit dc16b71
feat(letsplot): implement calibration-curve (#2358)
## Implementation: `calibration-curve` - letsplot

Implements the **letsplot** version of `calibration-curve`.

**File:** `plots/calibration-curve/implementations/letsplot.py`

:robot: *[impl-generate workflow](https://github.com/MarkusNeusinger/pyplots/actions/runs/20528205405)*

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
1 parent 7766286 commit dc16b71

2 files changed: 152 additions & 0 deletions

File 1: plots/calibration-curve/implementations/letsplot.py (123 additions)
```python
""" pyplots.ai
calibration-curve: Calibration Curve
Library: letsplot 4.8.2 | Python 3.13.11
Quality: 91/100 | Created: 2025-12-26
"""

import numpy as np
import pandas as pd
from lets_plot import *


LetsPlot.setup_html()

# Data - Generate realistic binary classification predictions
np.random.seed(42)
n_samples = 1000

# Create true labels with imbalanced classes (60/40 split)
y_true = np.concatenate([np.zeros(600), np.ones(400)])

# Generate predicted probabilities with realistic calibration issues
# Model tends to be slightly overconfident (probabilities pushed toward extremes)
y_prob = np.zeros(n_samples)

# For true negatives: mostly low probabilities with some mid-range
y_prob[:600] = np.clip(np.random.beta(2, 5, 600) * 0.6 + np.random.normal(0, 0.05, 600), 0, 1)
# For true positives: mostly high probabilities but with spread
y_prob[600:] = np.clip(np.random.beta(5, 2, 400) * 0.6 + 0.35 + np.random.normal(0, 0.08, 400), 0, 1)

# Shuffle the data
shuffle_idx = np.random.permutation(n_samples)
y_true = y_true[shuffle_idx]
y_prob = y_prob[shuffle_idx]

# Calculate calibration curve with 10 bins
n_bins = 10
bin_edges = np.linspace(0, 1, n_bins + 1)
bin_centers = (bin_edges[:-1] + bin_edges[1:]) / 2

mean_predicted = []
fraction_positive = []
bin_counts = []

for i in range(n_bins):
    mask = (y_prob >= bin_edges[i]) & (y_prob < bin_edges[i + 1])
    if i == n_bins - 1:  # Include right edge for last bin
        mask = (y_prob >= bin_edges[i]) & (y_prob <= bin_edges[i + 1])

    if mask.sum() > 0:
        mean_predicted.append(y_prob[mask].mean())
        fraction_positive.append(y_true[mask].mean())
        bin_counts.append(mask.sum())
    else:
        mean_predicted.append(bin_centers[i])
        fraction_positive.append(np.nan)
        bin_counts.append(0)

# Calculate Brier Score
brier_score = np.mean((y_prob - y_true) ** 2)

# Calculate Expected Calibration Error (ECE)
ece = 0
total_samples = sum(bin_counts)
for i in range(n_bins):
    if bin_counts[i] > 0:
        ece += (bin_counts[i] / total_samples) * abs(fraction_positive[i] - mean_predicted[i])

# Create dataframe for calibration curve
df_calibration = pd.DataFrame(
    {"mean_predicted": mean_predicted, "fraction_positive": fraction_positive, "bin_count": bin_counts}
)
df_calibration = df_calibration.dropna()

# Create dataframe for diagonal (perfect calibration)
df_diagonal = pd.DataFrame({"x": [0, 1], "y": [0, 1]})

# Create dataframe for histogram of predictions
hist_bins = 20
hist_counts, hist_edges = np.histogram(y_prob, bins=hist_bins, range=(0, 1))
hist_centers = (hist_edges[:-1] + hist_edges[1:]) / 2
df_histogram = pd.DataFrame(
    {
        "prob_center": hist_centers,
        "count": hist_counts / hist_counts.max(),  # Normalize for subplot
    }
)

# Plot
plot = (
    ggplot()
    # Perfect calibration diagonal line
    + geom_line(aes(x="x", y="y"), data=df_diagonal, color="#888888", size=1.5, linetype="dashed")
    # Calibration curve
    + geom_line(aes(x="mean_predicted", y="fraction_positive"), data=df_calibration, color="#306998", size=2)
    + geom_point(
        aes(x="mean_predicted", y="fraction_positive"), data=df_calibration, color="#306998", size=5, alpha=0.9
    )
    # Histogram bars at bottom showing prediction distribution
    + geom_bar(
        aes(x="prob_center", y="count"), data=df_histogram, stat="identity", fill="#FFD43B", alpha=0.6, width=0.045
    )
    # Labels and styling
    + labs(
        x="Mean Predicted Probability",
        y="Fraction of Positives",
        title=f"calibration-curve · letsplot · pyplots.ai\nBrier Score: {brier_score:.4f} | ECE: {ece:.4f}",
    )
    + scale_x_continuous(limits=[0, 1], breaks=[0, 0.2, 0.4, 0.6, 0.8, 1.0])
    + scale_y_continuous(limits=[0, 1], breaks=[0, 0.2, 0.4, 0.6, 0.8, 1.0])
    + theme_minimal()
    + theme(
        plot_title=element_text(size=22),
        axis_title=element_text(size=18),
        axis_text=element_text(size=14),
        panel_grid_major=element_line(color="#CCCCCC", size=0.5),
        panel_grid_minor=element_blank(),
    )
    + ggsize(1600, 900)
)

# Save outputs
ggsave(plot, "plot.png", path=".", scale=3)
ggsave(plot, "plot.html", path=".")
```
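The two metrics the script prints in the title can be sanity-checked on a tiny hand-worked example. This is a sketch with made-up numbers (not part of the committed file): the Brier score is just the mean squared error between predicted probabilities and outcomes, and ECE is a bin-count-weighted sum of per-bin calibration gaps, here with only 2 bins so the arithmetic is easy to follow.

```python
import numpy as np

# Hypothetical 4-sample example (not from the commit)
y_true = np.array([0, 0, 1, 1])
y_prob = np.array([0.1, 0.4, 0.6, 0.9])

# Brier score: mean of squared (probability - outcome)
# = (0.01 + 0.16 + 0.16 + 0.01) / 4 = 0.085
brier = np.mean((y_prob - y_true) ** 2)

# ECE with 2 bins, [0, 0.5) and [0.5, 1]:
#   bin 1 holds {0.1, 0.4}: mean prob 0.25, fraction positive 0
#   bin 2 holds {0.6, 0.9}: mean prob 0.75, fraction positive 1
#   ECE = 0.5*|0 - 0.25| + 0.5*|1 - 0.75| = 0.25
edges = np.array([0.0, 0.5, 1.0])
ece = 0.0
for lo, hi in zip(edges[:-1], edges[1:]):
    # include the right edge only for the last bin, as in the script above
    mask = (y_prob >= lo) & ((y_prob <= hi) if hi == edges[-1] else (y_prob < hi))
    if mask.any():
        ece += mask.mean() * abs(y_true[mask].mean() - y_prob[mask].mean())

print(brier, ece)
```

Here the model is directionally right (all positives get higher scores than all negatives) but poorly calibrated within each bin, which is exactly the gap ECE measures and the Brier score only partially reflects.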
File 2: implementation metadata (29 additions)

```yaml
library: letsplot
specification_id: calibration-curve
created: '2025-12-26T19:37:18Z'
updated: '2025-12-26T19:43:58Z'
generated_by: claude-opus-4-5-20251101
workflow_run: 20528205405
issue: 0
python_version: 3.13.11
library_version: 4.8.2
preview_url: https://storage.googleapis.com/pyplots-images/plots/calibration-curve/letsplot/plot.png
preview_thumb: https://storage.googleapis.com/pyplots-images/plots/calibration-curve/letsplot/plot_thumb.png
preview_html: https://storage.googleapis.com/pyplots-images/plots/calibration-curve/letsplot/plot.html
quality_score: 91
review:
  strengths:
  - Excellent implementation of all spec requirements including diagonal reference
    line, 10 bins, both Brier score and ECE metrics
  - Clean integration of histogram showing prediction distribution at the bottom
    without requiring a separate subplot
  - Good color scheme with blue calibration curve, gray reference diagonal, and
    yellow histogram bars
  - Proper title format following pyplots.ai conventions
  - Well-structured code following KISS principles with good reproducibility
  weaknesses:
  - The calibration curve is quite extreme - fraction of positives jumps from near
    0 to near 1 very sharply around the 0.3-0.6 probability range, making the
    miscalibration pattern less gradual than typical real-world examples
  - Histogram bars extend quite high (normalized to max) which visually competes
    with the calibration curve; could use smaller scale factor
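The reviewer's second point (histogram bars normalized to the full y-axis) could be addressed by multiplying the normalized counts by a small scale factor. A minimal sketch of that tweak, where `hist_scale` is an assumed knob that is not in the committed code and `y_prob` is a stand-in for the model's probabilities:

```python
import numpy as np

# Assumed scale factor: bars top out at 20% of the shared [0, 1] axis
# instead of 100%, so they no longer compete with the calibration curve.
hist_scale = 0.2

rng = np.random.default_rng(42)
y_prob = rng.uniform(0, 1, 1000)  # placeholder probabilities for illustration

hist_counts, hist_edges = np.histogram(y_prob, bins=20, range=(0, 1))
prob_center = (hist_edges[:-1] + hist_edges[1:]) / 2
count_scaled = hist_counts / hist_counts.max() * hist_scale

# The tallest bar now reaches hist_scale rather than 1.0
print(count_scaled.max())
```

The resulting `prob_center`/`count_scaled` columns would slot into the existing `df_histogram` construction unchanged; only the normalization line differs.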
