Commit 72363f4

feat(matplotlib): implement gain-curve (#2455)
## Implementation: `gain-curve` - matplotlib

Implements the **matplotlib** version of `gain-curve`.

**File:** `plots/gain-curve/implementations/matplotlib.py`

---

:robot: *[impl-generate workflow](https://github.com/MarkusNeusinger/pyplots/actions/runs/20584327624)*

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
1 parent 763edde commit 72363f4

2 files changed

Lines changed: 119 additions & 0 deletions

Lines changed: 92 additions & 0 deletions
@@ -0,0 +1,92 @@
""" pyplots.ai
gain-curve: Cumulative Gains Chart
Library: matplotlib 3.10.8 | Python 3.13.11
Quality: 93/100 | Created: 2025-12-29
"""

import matplotlib.pyplot as plt
import numpy as np


# Generate synthetic classification data (customer response model)
np.random.seed(42)
n_samples = 1000

# Create customer features that influence response
customer_value = np.random.randn(n_samples)
customer_engagement = np.random.randn(n_samples)

# True underlying probability (strong signal)
latent_score = 1.5 * customer_value + 1.0 * customer_engagement
true_prob = 1 / (1 + np.exp(-latent_score))
y_true = (np.random.rand(n_samples) < true_prob).astype(int)

# Model predicted probabilities (captures signal well with some noise)
# A good model that shows clear lift over random
y_score = 1 / (1 + np.exp(-(latent_score + np.random.randn(n_samples) * 0.5)))

# Calculate cumulative gains curve
sorted_indices = np.argsort(y_score)[::-1]
y_true_sorted = y_true[sorted_indices]

# Cumulative gains: percentage of population vs percentage of positives captured
total_positives = np.sum(y_true)
cumulative_positives = np.cumsum(y_true_sorted)
gains = cumulative_positives / total_positives * 100

# Percentage of population targeted
n_samples = len(y_true)
population_percentage = np.arange(1, n_samples + 1) / n_samples * 100

# Add origin point (0, 0) for proper plotting
population_percentage = np.insert(population_percentage, 0, 0)
gains = np.insert(gains, 0, 0)

# Create perfect model curve (captures all positives immediately)
positive_rate = total_positives / n_samples * 100
perfect_x = np.array([0, positive_rate, 100])
perfect_y = np.array([0, 100, 100])

# Create plot
fig, ax = plt.subplots(figsize=(16, 9))

# Plot model gains curve
ax.plot(population_percentage, gains, color="#306998", linewidth=3, label="Model", zorder=3)

# Plot random baseline (diagonal)
ax.plot([0, 100], [0, 100], color="#888888", linewidth=2, linestyle="--", label="Random (Baseline)", zorder=2)

# Plot perfect model
ax.plot(perfect_x, perfect_y, color="#FFD43B", linewidth=2, linestyle=":", label="Perfect Model", zorder=2)

# Fill area between model and random baseline
ax.fill_between(population_percentage, gains, population_percentage, alpha=0.2, color="#306998", zorder=1)

# Styling
ax.set_xlabel("Population Targeted (%)", fontsize=20)
ax.set_ylabel("Positive Cases Captured (%)", fontsize=20)
ax.set_title("gain-curve · matplotlib · pyplots.ai", fontsize=24)

ax.set_xlim(0, 100)
ax.set_ylim(0, 100)
ax.set_aspect("equal")

ax.tick_params(axis="both", labelsize=16)
ax.grid(True, alpha=0.3, linestyle="--")
ax.legend(fontsize=16, loc="lower right")

# Add annotation showing key insight
# Find where 20% of population is targeted
idx_20 = np.searchsorted(population_percentage, 20)
gain_at_20 = gains[idx_20]
ax.annotate(
    f"Top 20% captures {gain_at_20:.0f}%\nof positive cases",
    xy=(20, gain_at_20),
    xytext=(35, gain_at_20 - 15),
    fontsize=14,
    arrowprops={"arrowstyle": "->", "color": "#306998", "lw": 2},
    bbox={"boxstyle": "round,pad=0.3", "facecolor": "white", "edgecolor": "#306998", "alpha": 0.9},
)

plt.tight_layout()
plt.savefig("plot.png", dpi=300, bbox_inches="tight")
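
Note: the gains calculation in the script above (sort by score, accumulate positives, prepend the origin) can be factored into a small helper. The following is a minimal sketch and not part of this commit; the function names `cumulative_gains` and `gain_at` are hypothetical, and both the zero-positives check and the interpolation lookup are added assumptions rather than behavior of the committed file.

```python
import numpy as np


def cumulative_gains(y_true, y_score):
    """Return (population %, positives captured %) arrays, each starting at 0."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    total_positives = y_true.sum()
    if total_positives == 0:
        # Assumption: fail loudly instead of dividing by zero.
        raise ValueError("y_true contains no positive cases")
    # Rank samples from highest to lowest predicted score.
    order = np.argsort(y_score)[::-1]
    gains = np.cumsum(y_true[order]) / total_positives * 100
    n = len(y_true)
    population = np.arange(1, n + 1) / n * 100
    # Prepend the origin so the curve starts at (0, 0).
    return np.insert(population, 0, 0.0), np.insert(gains, 0, 0.0)


def gain_at(population, gains, pct):
    """Linearly interpolate the gain at a given population percentage."""
    return float(np.interp(pct, population, gains))


pop, gains = cumulative_gains([1, 0, 1, 0], [0.9, 0.8, 0.3, 0.1])
print(gain_at(pop, gains, 50))  # 50.0
```

In this four-sample example the top half of the ranking contains one of the two positives, so targeting 50% of the population captures 50% of the positive cases.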
Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
library: matplotlib
specification_id: gain-curve
created: '2025-12-29T22:47:36Z'
updated: '2025-12-29T22:51:48Z'
generated_by: claude-opus-4-5-20251101
workflow_run: 20584327624
issue: 0
python_version: 3.13.11
library_version: 3.10.8
preview_url: https://storage.googleapis.com/pyplots-images/plots/gain-curve/matplotlib/plot.png
preview_thumb: https://storage.googleapis.com/pyplots-images/plots/gain-curve/matplotlib/plot_thumb.png
preview_html: null
quality_score: 93
review:
  strengths:
  - Excellent implementation of the gains curve with all spec-required elements (model,
    baseline, perfect model)
  - Clear visual hierarchy with distinct line styles (solid, dashed, dotted) and colors
  - Informative annotation highlighting the key insight at 20% targeting threshold
  - Shaded area effectively communicates the model lift over random selection
  - Clean, well-structured code following KISS principles
  - Proper use of matplotlib Axes methods throughout
  weaknesses:
  - Legend positioned in lower right overlaps with the data region where curves pass
    through
  - Square aspect ratio (1:1) while appropriate for the 0-100% axes creates some unused
    corner space compared to 16:9
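
Note: the two weaknesses listed in the review could be addressed with a small layout change. The sketch below is one possible approach and not a fix taken from this repository: it leaves the axes at the default aspect so they fill the 16:9 figure, and it anchors the legend outside the plot area. The function name `draw_gain_layout` is hypothetical, and the model curve is a placeholder rather than the data from the committed script.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np


def draw_gain_layout():
    """Draw a placeholder gains chart with the legend outside the axes."""
    x = np.linspace(0, 100, 101)
    y = 100 * (1 - (1 - x / 100) ** 3)  # placeholder concave curve, not real model output

    fig, ax = plt.subplots(figsize=(16, 9))
    ax.plot(x, y, color="#306998", linewidth=3, label="Model")
    ax.plot([0, 100], [0, 100], color="#888888", linewidth=2, linestyle="--", label="Random (Baseline)")
    ax.set_xlim(0, 100)
    ax.set_ylim(0, 100)
    # No set_aspect("equal") here: the default "auto" aspect lets the axes use the full width.
    # Anchor the legend just right of the axes so it cannot cover the curves.
    ax.legend(fontsize=16, loc="center left", bbox_to_anchor=(1.02, 0.5))
    fig.tight_layout()
    return fig, ax


fig, ax = draw_gain_layout()
```

The trade-off is that the diagonal baseline no longer sits at 45 degrees once the equal aspect is dropped, so whether this reads better than the committed layout is a judgment call.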