
Commit 1fe9e2c

spec: add calibration-curve specification (#2337)
## New Specification: `calibration-curve`

Related to #2331

---

### specification.md

# calibration-curve: Calibration Curve

## Description

A calibration curve (reliability diagram) visualizes how well the predicted probabilities of a binary classifier match actual outcomes. By plotting the fraction of positives against the mean predicted probability within binned intervals, it reveals whether a model is well-calibrated, overconfident, or underconfident. A perfectly calibrated model follows the diagonal line where predicted probability equals observed frequency.

## Applications

- Evaluating probability predictions in credit scoring models, where accurate risk estimates determine loan pricing
- Assessing diagnostic confidence in medical screening systems, where probability calibration affects treatment decisions
- Comparing calibration across multiple classifiers to select models that produce reliable probability estimates
- Validating risk assessment models in insurance and fraud detection, where miscalibrated probabilities lead to financial losses

## Data

- `y_true` (binary array) - Ground truth binary labels (0 or 1)
- `y_prob` (numeric array) - Predicted probabilities from the classifier (0 to 1)
- Size: 500-10000 samples recommended for reliable binning
- Example: Binary classification predictions from a sklearn classifier's `predict_proba()` output

## Notes

- Include a diagonal reference line representing perfect calibration (predicted probability = observed frequency)
- Use 10 bins by default, with bin edges at equal probability intervals (0.0-0.1, 0.1-0.2, etc.)
- Display the Brier score or Expected Calibration Error (ECE) as a summary metric
- Optional: include a histogram of predicted probabilities as a secondary subplot to show the prediction distribution
- When comparing multiple models, use distinct colors with a clear legend

---

**Next:** Add `approved` label to the issue to merge this PR.

---

:robot: *[spec-create workflow](https://github.com/MarkusNeusinger/pyplots/actions/runs/20528130950)*

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
1 parent 2dc4a06 commit 1fe9e2c

4 files changed

Lines changed: 59 additions & 0 deletions

File tree

plots/calibration-curve/implementations/.gitkeep

Whitespace-only changes.

plots/calibration-curve/metadata/.gitkeep

Whitespace-only changes.
specification.md

Lines changed: 27 additions & 0 deletions

@@ -0,0 +1,27 @@
# calibration-curve: Calibration Curve

## Description

A calibration curve (reliability diagram) visualizes how well the predicted probabilities of a binary classifier match actual outcomes. By plotting the fraction of positives against the mean predicted probability within binned intervals, it reveals whether a model is well-calibrated, overconfident, or underconfident. A perfectly calibrated model follows the diagonal line where predicted probability equals observed frequency.
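As a minimal sketch of this construction (assuming scikit-learn and Matplotlib are available; the synthetic `y_true`/`y_prob` arrays are illustrative only, not part of the spec):

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.calibration import calibration_curve

# Illustrative data: a classifier whose probabilities are perfectly calibrated,
# so the resulting curve should hug the diagonal.
rng = np.random.default_rng(0)
y_prob = rng.uniform(0.0, 1.0, size=2000)
y_true = (rng.uniform(0.0, 1.0, size=2000) < y_prob).astype(int)

# prob_true[i] = fraction of positives in bin i; prob_pred[i] = mean predicted
# probability in bin i (10 equal-width bins over [0, 1]).
prob_true, prob_pred = calibration_curve(y_true, y_prob, n_bins=10)

fig, ax = plt.subplots()
ax.plot([0, 1], [0, 1], "k--", label="Perfect calibration")  # diagonal reference
ax.plot(prob_pred, prob_true, "o-", label="Model")
ax.set_xlabel("Mean predicted probability")
ax.set_ylabel("Fraction of positives")
ax.legend()
plt.show()
```

Points below the diagonal indicate overconfidence (predicted probabilities exceed observed frequencies); points above indicate underconfidence.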
## Applications

- Evaluating probability predictions in credit scoring models, where accurate risk estimates determine loan pricing
- Assessing diagnostic confidence in medical screening systems, where probability calibration affects treatment decisions
- Comparing calibration across multiple classifiers to select models that produce reliable probability estimates
- Validating risk assessment models in insurance and fraud detection, where miscalibrated probabilities lead to financial losses
## Data

- `y_true` (binary array) - Ground truth binary labels (0 or 1)
- `y_prob` (numeric array) - Predicted probabilities from the classifier (0 to 1)
- Size: 500-10000 samples recommended for reliable binning
- Example: Binary classification predictions from a sklearn classifier's `predict_proba()` output (see the sketch after this list)
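A sketch of one way to produce these two arrays with scikit-learn; the dataset generator, classifier, and sizes below are arbitrary illustrative choices, not requirements of the spec:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic binary classification problem; 5000 samples sits inside the
# 500-10000 range the spec recommends for reliable binning.
X, y = make_classification(n_samples=5000, random_state=0)
X_train, X_test, y_train, y_true = train_test_split(
    X, y, test_size=0.5, random_state=0
)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_prob = clf.predict_proba(X_test)[:, 1]  # probability of the positive class
```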
## Notes

- Include a diagonal reference line representing perfect calibration (predicted probability = observed frequency)
- Use 10 bins by default, with bin edges at equal probability intervals (0.0-0.1, 0.1-0.2, etc.)
- Display the Brier score or Expected Calibration Error (ECE) as a summary metric (both are sketched after this list)
- Optional: include a histogram of predicted probabilities as a secondary subplot to show the prediction distribution
- When comparing multiple models, use distinct colors with a clear legend
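One way to compute the two summary metrics, reusing the `y_true`/`y_prob` arrays from the sketches above. The Brier score comes straight from scikit-learn; `expected_calibration_error` is a hand-rolled helper implementing the usual equal-width-bin definition, not an established library API:

```python
import numpy as np
from sklearn.metrics import brier_score_loss

def expected_calibration_error(y_true, y_prob, n_bins=10):
    """ECE: bin-size-weighted average of |fraction of positives -
    mean predicted probability| over equal-width probability bins."""
    y_true = np.asarray(y_true)
    y_prob = np.asarray(y_prob)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    bin_idx = np.digitize(y_prob, edges[1:-1])  # bin index 0..n_bins-1
    ece = 0.0
    for b in range(n_bins):
        mask = bin_idx == b
        if mask.any():
            ece += mask.mean() * abs(y_true[mask].mean() - y_prob[mask].mean())
    return ece

brier = brier_score_loss(y_true, y_prob)  # mean squared error of probabilities
ece = expected_calibration_error(y_true, y_prob)
print(f"Brier = {brier:.4f}, ECE = {ece:.4f}")
```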
Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
# Specification-level metadata for calibration-curve
# Auto-synced to PostgreSQL on push to main

spec_id: calibration-curve
title: Calibration Curve

# Specification tracking
created: 2025-12-26T19:28:20Z
updated: null
issue: 2331
suggested: MarkusNeusinger

# Classification tags (applies to all library implementations)
# See docs/concepts/tagging-system.md for detailed guidelines
tags:
  plot_type:
    - calibration
    - line
    - reliability-diagram
  data_type:
    - numeric
    - probability
    - binary
  domain:
    - statistics
    - machine-learning
    - finance
    - healthcare
  features:
    - basic
    - model-evaluation
    - reference-line
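Since this file is auto-synced on push, a minimal sketch of reading and sanity-checking it before any sync step (assumes PyYAML; the file name `metadata.yml` is a hypothetical placeholder, as the actual path is not shown in this diff):

```python
import yaml  # PyYAML

with open("metadata.yml") as fh:  # hypothetical file name
    meta = yaml.safe_load(fh)

# Check the keys a sync step would presumably rely on.
for key in ("spec_id", "title", "created", "tags"):
    if key not in meta:
        raise ValueError(f"missing required key: {key}")

print(meta["spec_id"], sorted(meta["tags"]))
```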
