Commit 4b0ebc5

feat(altair): implement calibration-curve (#2347)
## Implementation: `calibration-curve` - altair

Implements the **altair** version of `calibration-curve`.

**File:** `plots/calibration-curve/implementations/altair.py`

---

:robot: *[impl-generate workflow](https://github.com/MarkusNeusinger/pyplots/actions/runs/20528203391)*

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

1 parent 89bef3e commit 4b0ebc5

2 files changed

Lines changed: 135 additions & 0 deletions

Lines changed: 108 additions & 0 deletions
@@ -0,0 +1,108 @@
""" pyplots.ai
calibration-curve: Calibration Curve
Library: altair 6.0.0 | Python 3.13.11
Quality: 92/100 | Created: 2025-12-26
"""

import altair as alt
import numpy as np
import pandas as pd


# Data - Generate synthetic classification predictions
np.random.seed(42)
n_samples = 2000

# Simulate predictions from a slightly overconfident classifier
y_true = np.random.binomial(1, 0.4, n_samples)
# Create predictions correlated with true labels but with some noise
base_prob = y_true * 0.6 + (1 - y_true) * 0.3
noise = np.random.normal(0, 0.15, n_samples)
y_prob = np.clip(base_prob + noise, 0.01, 0.99)

# Calculate calibration curve manually (10 bins)
n_bins = 10
bin_edges = np.linspace(0, 1, n_bins + 1)
prob_true = []
prob_pred = []

for i in range(n_bins):
    mask = (y_prob >= bin_edges[i]) & (y_prob < bin_edges[i + 1])
    if mask.sum() > 0:
        prob_pred.append(y_prob[mask].mean())
        prob_true.append(y_true[mask].mean())

# Create calibration data
calibration_df = pd.DataFrame({"Mean Predicted Probability": prob_pred, "Fraction of Positives": prob_true})

# Calculate Brier score
brier_score = np.mean((y_prob - y_true) ** 2)

# Create histogram data for predicted probabilities
hist, bin_edges_hist = np.histogram(y_prob, bins=20)
hist_df = pd.DataFrame({"Probability": (bin_edges_hist[:-1] + bin_edges_hist[1:]) / 2, "Count": hist})

# Perfect calibration line
perfect_df = pd.DataFrame({"x": [0, 1], "y": [0, 1]})

# Calibration curve chart
calibration_line = (
    alt.Chart(calibration_df)
    .mark_line(color="#306998", strokeWidth=4)
    .encode(
        x=alt.X("Mean Predicted Probability:Q", scale=alt.Scale(domain=[0, 1]), title="Mean Predicted Probability"),
        y=alt.Y("Fraction of Positives:Q", scale=alt.Scale(domain=[0, 1]), title="Fraction of Positives"),
    )
)

calibration_points = (
    alt.Chart(calibration_df)
    .mark_point(color="#306998", size=300, filled=True)
    .encode(
        x=alt.X("Mean Predicted Probability:Q"),
        y=alt.Y("Fraction of Positives:Q"),
        tooltip=["Mean Predicted Probability:Q", "Fraction of Positives:Q"],
    )
)

# Perfect calibration diagonal line
perfect_line = (
    alt.Chart(perfect_df)
    .mark_line(color="#FFD43B", strokeWidth=3, strokeDash=[8, 4])
    .encode(x=alt.X("x:Q"), y=alt.Y("y:Q"))
)

# Main calibration chart
calibration_chart = alt.layer(perfect_line, calibration_line, calibration_points).properties(
    width=1400,
    height=600,
    title=alt.Title(
        "calibration-curve · altair · pyplots.ai",
        subtitle=f"Brier Score: {brier_score:.4f}",
        fontSize=28,
        subtitleFontSize=20,
    ),
)

# Histogram chart (below)
histogram_chart = (
    alt.Chart(hist_df)
    .mark_bar(color="#306998", opacity=0.7)
    .encode(
        x=alt.X("Probability:Q", scale=alt.Scale(domain=[0, 1]), title="Predicted Probability"),
        y=alt.Y("Count:Q", title="Count"),
    )
    .properties(width=1400, height=200, title=alt.Title("Distribution of Predicted Probabilities", fontSize=20))
)

# Combine charts vertically
combined_chart = (
    alt.vconcat(calibration_chart, histogram_chart)
    .configure_axis(labelFontSize=16, titleFontSize=18)
    .configure_title(anchor="middle")
    .configure_view(strokeWidth=0)
)

# Save as PNG and HTML
combined_chart.save("plot.png", scale_factor=3.0)
combined_chart.save("plot.html")
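
Note on the binning loop: it uses half-open intervals `[edge_i, edge_{i+1})`, so a prediction of exactly 1.0 would fall outside every bin. That cannot happen in this script because `y_prob` is clipped to 0.99, but it could matter if the loop were reused on unclipped model output. A minimal, dependency-free sketch of the same equal-width binning with a closed last bin might look like the following. The names `calibration_bins` and `brier` are hypothetical helpers for illustration, not part of this commit, and the index arithmetic is an assumption that may place values sitting exactly on an interior bin edge differently from the `np.linspace` comparison above.

```python
def calibration_bins(y_true, y_prob, n_bins=10):
    """Return (mean_predicted, fraction_positive) pairs for non-empty bins.

    Hypothetical helper, not part of the commit. Bins are equal-width on
    [0, 1]; the last bin is closed on the right so that 1.0 is counted.
    """
    sum_prob = [0.0] * n_bins
    sum_true = [0.0] * n_bins
    counts = [0] * n_bins

    for label, prob in zip(y_true, y_prob):
        # Equal-width bin index; min() folds prob == 1.0 into the last bin.
        idx = min(int(prob * n_bins), n_bins - 1)
        sum_prob[idx] += prob
        sum_true[idx] += label
        counts[idx] += 1

    # Skip empty bins, mirroring the `if mask.sum() > 0` check in the script.
    return [
        (sum_prob[i] / counts[i], sum_true[i] / counts[i])
        for i in range(n_bins)
        if counts[i] > 0
    ]


def brier(y_true, y_prob):
    """Mean squared difference between predicted probability and label."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


if __name__ == "__main__":
    labels = [0, 0, 1, 1, 1]
    probs = [0.05, 0.15, 0.15, 0.95, 1.0]
    for mean_pred, frac_pos in calibration_bins(labels, probs):
        print(f"{mean_pred:.3f} {frac_pos:.3f}")
    print(f"Brier: {brier(labels, probs):.4f}")
```

In this toy input the prediction of 1.0 lands in the last bin together with 0.95, whereas the half-open comparison in the committed loop would leave it uncounted.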
Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
library: altair
specification_id: calibration-curve
created: '2025-12-26T19:36:31Z'
updated: '2025-12-26T19:42:30Z'
generated_by: claude-opus-4-5-20251101
workflow_run: 20528203391
issue: 0
python_version: 3.13.11
library_version: 6.0.0
preview_url: https://storage.googleapis.com/pyplots-images/plots/calibration-curve/altair/plot.png
preview_thumb: https://storage.googleapis.com/pyplots-images/plots/calibration-curve/altair/plot_thumb.png
preview_html: https://storage.googleapis.com/pyplots-images/plots/calibration-curve/altair/plot.html
quality_score: 92
review:
  strengths:
  - Excellent implementation of calibration curve with all required spec features
    (diagonal reference, binning, Brier score, histogram)
  - Clean, readable code following KISS principles with proper seed for reproducibility
  - Good use of Altair layering and vconcat for combining charts
  - Appropriate color scheme with good contrast between calibration line (blue) and
    reference line (yellow dashed)
  - Subtitle elegantly displays the Brier score metric
  - Tooltips included on data points for interactivity
  weaknesses:
  - Missing subtle grid lines which would aid in reading exact values from the calibration
    curve
  - Histogram bars could use slight spacing/gap for better visual separation
