Commit d67f408

feat(plotnine): implement learning-curve-basic (#2290)
## Implementation: `learning-curve-basic` - plotnine

Implements the **plotnine** version of `learning-curve-basic`.

**File:** `plots/learning-curve-basic/implementations/plotnine.py`

---

:robot: *[impl-generate workflow](https://github.com/MarkusNeusinger/pyplots/actions/runs/20526602205)*

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
1 parent e9d5b72 commit d67f408

2 files changed

Lines changed: 118 additions & 0 deletions

Lines changed: 93 additions & 0 deletions
@@ -0,0 +1,93 @@
+""" pyplots.ai
+learning-curve-basic: Model Learning Curve
+Library: plotnine 0.15.2 | Python 3.13.11
+Quality: 92/100 | Created: 2025-12-26
+"""
+
+import numpy as np
+import pandas as pd
+from plotnine import (
+    aes,
+    element_text,
+    geom_line,
+    geom_ribbon,
+    ggplot,
+    labs,
+    scale_color_manual,
+    scale_fill_manual,
+    theme,
+    theme_minimal,
+)
+
+
+# Data - Simulating learning curve with typical ML model behavior
+np.random.seed(42)
+
+# Training set sizes (10 points from 50 to 800 samples)
+train_sizes = np.linspace(50, 800, 10).astype(int)
+
+# Simulate cross-validation folds (5 folds)
+n_folds = 5
+
+# Training scores: start high, stay high (model learns training data well)
+train_mean = 0.99 - 0.15 * np.exp(-train_sizes / 150)
+train_std = 0.02 * np.exp(-train_sizes / 300) + 0.005
+
+# Validation scores: start lower, improve with more data (learning pattern)
+val_mean = 0.65 + 0.25 * (1 - np.exp(-train_sizes / 250))
+val_std = 0.08 * np.exp(-train_sizes / 400) + 0.01
+
+# Create DataFrame for plotting
+df_train = pd.DataFrame(
+    {
+        "Training Set Size": train_sizes,
+        "Score": train_mean,
+        "Score_low": train_mean - train_std,
+        "Score_high": train_mean + train_std,
+        "Type": "Training Score",
+    }
+)
+
+df_val = pd.DataFrame(
+    {
+        "Training Set Size": train_sizes,
+        "Score": val_mean,
+        "Score_low": val_mean - val_std,
+        "Score_high": val_mean + val_std,
+        "Type": "Validation Score",
+    }
+)
+
+df = pd.concat([df_train, df_val], ignore_index=True)
+
+# Colors: Python Blue for training, Python Yellow for validation
+colors = {"Training Score": "#306998", "Validation Score": "#FFD43B"}
+
+# Create plot
+plot = (
+    ggplot(df, aes(x="Training Set Size", y="Score", color="Type", fill="Type"))
+    + geom_ribbon(aes(ymin="Score_low", ymax="Score_high"), alpha=0.25, color="none")
+    + geom_line(size=2)
+    + scale_color_manual(values=colors)
+    + scale_fill_manual(values=colors)
+    + labs(
+        x="Training Set Size",
+        y="Accuracy Score",
+        title="learning-curve-basic · plotnine · pyplots.ai",
+        color="",
+        fill="",
+    )
+    + theme_minimal()
+    + theme(
+        figure_size=(16, 9),
+        text=element_text(size=14),
+        axis_title=element_text(size=20),
+        axis_text=element_text(size=16),
+        plot_title=element_text(size=24),
+        legend_text=element_text(size=16),
+        legend_position=(0.85, 0.25),
+    )
+)
+
+# Save
+plot.save("plot.png", dpi=300)
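Note on the simulated data: the script declares `n_folds = 5` but never uses it, since the mean and spread are generated directly from closed-form expressions. As a rough sketch of an alternative (not what the commit does), the band could be derived from per-fold scores. The noise level of `0.02` and the names `val_center`, `fold_noise`, and `val_scores` are assumptions made for illustration only.

```python
import numpy as np

# Assumption: a local Generator instead of the global np.random.seed used in the commit
rng = np.random.default_rng(42)

train_sizes = np.linspace(50, 800, 10).astype(int)
n_folds = 5

# Same validation trend as in the committed script
val_center = 0.65 + 0.25 * (1 - np.exp(-train_sizes / 250))

# Hypothetical per-fold scores, shape (n_sizes, n_folds); 0.02 is an illustrative noise level
fold_noise = rng.normal(0.0, 0.02, size=(train_sizes.size, n_folds))
val_scores = np.clip(val_center[:, None] + fold_noise, 0.0, 1.0)

# Aggregate across folds to get the line (mean) and ribbon half-width (std)
val_mean = val_scores.mean(axis=1)
val_std = val_scores.std(axis=1)
```

With this layout, `val_mean - val_std` and `val_mean + val_std` would play the role of `Score_low` and `Score_high` in the DataFrame.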
Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
+library: plotnine
+specification_id: learning-curve-basic
+created: '2025-12-26T17:37:26Z'
+updated: '2025-12-26T17:45:22Z'
+generated_by: claude-opus-4-5-20251101
+workflow_run: 20526602205
+issue: 0
+python_version: 3.13.11
+library_version: 0.15.2
+preview_url: https://storage.googleapis.com/pyplots-images/plots/learning-curve-basic/plotnine/plot.png
+preview_thumb: https://storage.googleapis.com/pyplots-images/plots/learning-curve-basic/plotnine/plot_thumb.png
+preview_html: null
+quality_score: 92
+review:
+  strengths:
+  - Excellent use of plotnine grammar of graphics with geom_ribbon for confidence
+    bands
+  - Clean, well-structured code following KISS principles
+  - Color scheme is visually appealing and colorblind-safe (blue/gold contrast)
+  - Legend positioning works well within the plot area
+  - Realistic ML learning curve behavior accurately depicted
+  - Proper title format following pyplots.ai conventions
+  weaknesses:
+  - Missing grid lines which would help read exact values from the plot
+  - Y-axis label could include units or range indicator