
Commit 95e3f29

spec: add precision-recall specification (#2279)
## New Specification: `precision-recall`

Related to #2274

---

### specification.md

# precision-recall: Precision-Recall Curve

## Description

A Precision-Recall curve plots precision (positive predictive value) against recall (sensitivity) at various classification thresholds. This visualization is essential for evaluating binary classifiers on imbalanced datasets, where accuracy alone is misleading. The area under the curve (Average Precision) summarizes classifier performance, with higher values indicating better performance.

## Applications

- Evaluating fraud detection models where fraudulent transactions are rare compared to legitimate ones
- Assessing medical diagnostic systems where correctly identifying positive cases (high recall) is critical
- Comparing information retrieval systems for document search relevance ranking
- Optimizing spam filters to balance catching spam (recall) with minimizing false positives (precision)

## Data

- `y_true` (binary array) - Ground truth binary labels (0 or 1)
- `y_scores` (numeric array) - Predicted probabilities or decision function scores from the classifier
- Size: 100-10000 samples typical for evaluation
- Example: Binary classification predictions from an sklearn classifier's `predict_proba()` output

## Notes

- Display the Average Precision (AP) score in the legend or an annotation
- Include a baseline reference line showing random-classifier performance (a horizontal line at the positive class ratio)
- Use a stepped line style to accurately represent the threshold-based curve
- Consider showing iso-F1 curves as contour lines for F1 score reference
- For comparing multiple classifiers, use distinct colors with a clear legend

---

**Next:** Add `approved` label to the issue to merge this PR.

---

:robot: *[spec-create workflow](https://github.com/MarkusNeusinger/pyplots/actions/runs/20526505250)*

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
1 parent 022c7e7 commit 95e3f29

2 files changed

Lines changed: 60 additions & 0 deletions


Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
# precision-recall: Precision-Recall Curve

## Description

A Precision-Recall curve plots precision (positive predictive value) against recall (sensitivity) at various classification thresholds. This visualization is essential for evaluating binary classifiers on imbalanced datasets, where accuracy alone is misleading. The area under the curve (Average Precision) summarizes classifier performance, with higher values indicating better performance.
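As a quick sanity check of the definitions above (not part of the spec itself), precision and recall at a single threshold can be computed directly; the toy arrays below are illustrative assumptions:

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score

# Toy imbalanced data: 8 negatives, 2 positives
y_true = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1])
y_scores = np.array([0.1, 0.2, 0.15, 0.3, 0.25, 0.4, 0.35, 0.6, 0.7, 0.9])

# Thresholding the scores yields the predictions for one point on the curve
y_pred = (y_scores >= 0.5).astype(int)
precision = precision_score(y_true, y_pred)  # TP / (TP + FP) -> 2/3 here
recall = recall_score(y_true, y_pred)        # TP / (TP + FN) -> 1.0 here
```

Sweeping the threshold from high to low traces out the full curve, one (recall, precision) point per threshold.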
## Applications

- Evaluating fraud detection models where fraudulent transactions are rare compared to legitimate ones
- Assessing medical diagnostic systems where correctly identifying positive cases (high recall) is critical
- Comparing information retrieval systems for document search relevance ranking
- Optimizing spam filters to balance catching spam (recall) with minimizing false positives (precision)

## Data

- `y_true` (binary array) - Ground truth binary labels (0 or 1)
- `y_scores` (numeric array) - Predicted probabilities or decision function scores from the classifier
- Size: 100-10000 samples typical for evaluation
- Example: Binary classification predictions from an sklearn classifier's `predict_proba()` output
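A minimal sketch of producing the inputs described above with sklearn; the choice of `LogisticRegression` and the `make_classification` parameters are illustrative assumptions, not part of the spec:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, precision_recall_curve
from sklearn.model_selection import train_test_split

# Imbalanced binary problem in the 100-10000 sample range suggested by the spec
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_true = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_scores = clf.predict_proba(X_test)[:, 1]  # probability of the positive class

# One (recall, precision) point per threshold, plus the summary AP score
precision, recall, thresholds = precision_recall_curve(y_true, y_scores)
ap = average_precision_score(y_true, y_scores)
```

Note that `precision_recall_curve` appends a final (precision=1, recall=0) point, so `precision` and `recall` have one more entry than `thresholds`.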
## Notes

- Display the Average Precision (AP) score in the legend or an annotation
- Include a baseline reference line showing random-classifier performance (a horizontal line at the positive class ratio)
- Use a stepped line style to accurately represent the threshold-based curve
- Consider showing iso-F1 curves as contour lines for F1 score reference
- For comparing multiple classifiers, use distinct colors with a clear legend
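The first three notes can be sketched with matplotlib; this is one possible rendering, not a reference implementation, and the synthetic `y_true`/`y_scores` arrays are assumptions for demonstration:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for scripted rendering
import matplotlib.pyplot as plt
import numpy as np
from sklearn.metrics import average_precision_score, precision_recall_curve

# Synthetic imbalanced labels (~10% positive) with mildly informative scores
rng = np.random.default_rng(0)
y_true = (rng.random(500) < 0.1).astype(int)
y_scores = np.clip(y_true * 0.4 + rng.random(500) * 0.6, 0.0, 1.0)

precision, recall, _ = precision_recall_curve(y_true, y_scores)
ap = average_precision_score(y_true, y_scores)

fig, ax = plt.subplots()
# Stepped line style, with the AP score shown in the legend
ax.step(recall, precision, where="post", label=f"Classifier (AP = {ap:.2f})")
# Random-classifier baseline: precision equals the positive class ratio
baseline = y_true.mean()
ax.axhline(baseline, linestyle="--", color="gray",
           label=f"Baseline ({baseline:.2f})")
ax.set_xlabel("Recall")
ax.set_ylabel("Precision")
ax.legend()
fig.savefig("precision_recall.png")
```

`where="post"` holds each precision value until the next recall point, matching how the curve actually changes only at thresholds.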
Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
# Specification-level metadata for precision-recall
# Auto-synced to PostgreSQL on push to main

spec_id: precision-recall
title: Precision-Recall Curve

# Specification tracking
created: 2025-12-26T17:28:06Z
updated: null
issue: 2274
suggested: MarkusNeusinger

# Classification tags (applies to all library implementations)
# See docs/concepts/tagging-system.md for detailed guidelines
tags:
  plot_type:
    - line
    - curve
    - evaluation
  data_type:
    - numeric
    - binary
    - probability
  domain:
    - machine-learning
    - statistics
    - healthcare
    - information-retrieval
  features:
    - basic
    - classification
    - model-evaluation
    - imbalanced-data
