Commit de3121e

feat(pygal): implement logistic-regression (#3575)
## Implementation: `logistic-regression` - pygal

Implements the **pygal** version of `logistic-regression`.

**File:** `plots/logistic-regression/implementations/pygal.py`

**Parent Issue:** #3550

---

:robot: *[impl-generate workflow](https://github.com/MarkusNeusinger/pyplots/actions/runs/20868565603)*

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
1 parent a4d8524 commit de3121e

2 files changed

Lines changed: 335 additions & 0 deletions

Lines changed: 122 additions & 0 deletions
@@ -0,0 +1,122 @@
""" pyplots.ai
logistic-regression: Logistic Regression Curve Plot
Library: pygal 3.1.0 | Python 3.13.11
Quality: 91/100 | Created: 2026-01-09
"""

import numpy as np
import pygal
from pygal.style import Style


# Data - Credit approval example based on credit score
np.random.seed(42)
n_samples = 150

# Generate credit scores (300-850 range)
credit_scores = np.concatenate(
    [
        np.random.normal(520, 70, n_samples // 2),  # Lower scores (more rejections)
        np.random.normal(720, 60, n_samples // 2),  # Higher scores (more approvals)
    ]
)
credit_scores = np.clip(credit_scores, 300, 850)

# Generate binary outcomes with logistic probability
true_probs = 1 / (1 + np.exp(-0.015 * (credit_scores - 600)))
y = (np.random.random(n_samples) < true_probs).astype(int)

# Fit logistic regression using gradient descent (numpy only)
X = (credit_scores - credit_scores.mean()) / credit_scores.std()  # Normalize for stability
b0, b1 = 0.0, 0.0
learning_rate = 0.1
for _ in range(1000):
    z = b0 + b1 * X
    p = 1 / (1 + np.exp(-np.clip(z, -500, 500)))
    grad_b0 = np.mean(p - y)
    grad_b1 = np.mean((p - y) * X)
    b0 -= learning_rate * grad_b0
    b1 -= learning_rate * grad_b1

# Generate smooth curve for predictions
x_curve = np.linspace(300, 850, 100)
x_curve_norm = (x_curve - credit_scores.mean()) / credit_scores.std()
y_proba = 1 / (1 + np.exp(-np.clip(b0 + b1 * x_curve_norm, -500, 500)))

# Confidence interval (approximate using binomial SE)
se = np.sqrt(y_proba * (1 - y_proba) / n_samples) * 1.5
ci_lower = np.clip(y_proba - 1.96 * se, 0, 1)
ci_upper = np.clip(y_proba + 1.96 * se, 0, 1)

# Jitter y values for visibility
y_jittered = y + np.random.uniform(-0.025, 0.025, n_samples)

# Custom style for large canvas
custom_style = Style(
    background="white",
    plot_background="white",
    foreground="#333333",
    foreground_strong="#333333",
    foreground_subtle="#666666",
    colors=("#306998", "#306998", "#306998", "#888888", "#E74C3C", "#FFD43B"),
    title_font_size=56,
    label_font_size=36,
    major_label_font_size=32,
    legend_font_size=32,
    value_font_size=24,
    stroke_width=4,
    opacity=0.7,
    opacity_hover=0.95,
    font_family="sans-serif",
)

# Create XY chart
chart = pygal.XY(
    width=4800,
    height=2700,
    style=custom_style,
    title="logistic-regression · pygal · pyplots.ai",
    x_title="Credit Score",
    y_title="Probability of Approval",
    show_dots=True,
    stroke=True,
    show_x_guides=True,
    show_y_guides=True,
    dots_size=10,
    stroke_style={"width": 4},
    range=(0, 1.05),
    xrange=(280, 870),
    explicit_size=True,
    legend_at_bottom=True,
    legend_box_size=28,
    truncate_legend=-1,
    print_values=False,
)

# Add logistic regression curve (main feature)
curve_points = [(float(x_curve[i]), float(y_proba[i])) for i in range(len(x_curve))]
chart.add("Logistic Fit", curve_points, stroke_style={"width": 5}, dots_size=0, show_dots=False)

# Add confidence interval bounds
ci_upper_pts = [(float(x_curve[i]), float(ci_upper[i])) for i in range(0, len(x_curve), 2)]
ci_lower_pts = [(float(x_curve[i]), float(ci_lower[i])) for i in range(0, len(x_curve), 2)]
chart.add("95% CI Upper", ci_upper_pts, stroke_style={"width": 2, "dasharray": "8,4"}, dots_size=0, show_dots=False)
chart.add("95% CI Lower", ci_lower_pts, stroke_style={"width": 2, "dasharray": "8,4"}, dots_size=0, show_dots=False)

# Add decision threshold line (y = 0.5)
threshold_pts = [(300.0, 0.5), (850.0, 0.5)]
chart.add(
    "Threshold (p=0.5)", threshold_pts, stroke_style={"width": 3, "dasharray": "12,6"}, dots_size=0, show_dots=False
)

# Add data points - Rejected (Class 0)
rejected_pts = [(float(credit_scores[i]), float(y_jittered[i])) for i in range(n_samples) if y[i] == 0]
chart.add("Rejected (0)", rejected_pts, stroke=False, dots_size=14)

# Add data points - Approved (Class 1)
approved_pts = [(float(credit_scores[i]), float(y_jittered[i])) for i in range(n_samples) if y[i] == 1]
chart.add("Approved (1)", approved_pts, stroke=False, dots_size=14)

# Save as PNG and HTML
chart.render_to_png("plot.png")
chart.render_to_file("plot.html")
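
The script fits the two coefficients by batch gradient descent and then draws an approximate band from a scaled binomial standard error. As a point of comparison only, the sketch below repeats the same gradient-descent fit in pure Python and builds a Wald interval on the logit scale from the inverse Fisher information. That is a different technique from the one in the commit. The function names (`sigmoid`, `fit_logistic`, `wald_band`) and the synthetic data are assumptions made for illustration, and the input is assumed to be standardized as `X` is above.

```python
import math
import random


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(xs, ys, learning_rate=0.1, n_iter=1000):
    """Fit p = sigmoid(b0 + b1 * x) by batch gradient descent on the mean log-loss."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(n_iter):
        residuals = [sigmoid(b0 + b1 * x) - y for x, y in zip(xs, ys)]
        b0 -= learning_rate * sum(residuals) / n
        b1 -= learning_rate * sum(r * x for r, x in zip(residuals, xs)) / n
    return b0, b1


def wald_band(xs, b0, b1, grid, z_crit=1.96):
    """Return (lower, fitted, upper) probabilities for each x in grid.

    The interval is computed for the linear predictor b0 + b1 * x using the
    inverse of the 2x2 Fisher information, then mapped through the sigmoid.
    """
    weights = [p * (1.0 - p) for p in (sigmoid(b0 + b1 * x) for x in xs)]
    i00 = sum(weights)
    i01 = sum(w * x for w, x in zip(weights, xs))
    i11 = sum(w * x * x for w, x in zip(weights, xs))
    det = i00 * i11 - i01 * i01
    # Covariance of (b0, b1) is the inverse of the information matrix.
    v00, v01, v11 = i11 / det, -i01 / det, i00 / det
    band = []
    for g in grid:
        eta = b0 + b1 * g
        se = math.sqrt(max(0.0, v00 + 2.0 * g * v01 + g * g * v11))
        band.append((sigmoid(eta - z_crit * se), sigmoid(eta), sigmoid(eta + z_crit * se)))
    return band


# Synthetic, already standardized data (hypothetical, not the credit-score sample).
rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(200)]
ys = [1 if rng.random() < sigmoid(1.5 * x) else 0 for x in xs]

b0, b1 = fit_logistic(xs, ys)
band = wald_band(xs, b0, b1, grid=[-2.0, -1.0, 0.0, 1.0, 2.0])
```

Because the interval is formed before the sigmoid is applied, the bounds stay inside [0, 1] without clipping. They also widen where the data are sparse, which a pointwise binomial standard error does not capture.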
Lines changed: 213 additions & 0 deletions
@@ -0,0 +1,213 @@
library: pygal
specification_id: logistic-regression
created: '2026-01-09T23:32:39Z'
updated: '2026-01-09T23:35:42Z'
generated_by: claude-opus-4-5-20251101
workflow_run: 20868565603
issue: 3550
python_version: 3.13.11
library_version: 3.1.0
preview_url: https://storage.googleapis.com/pyplots-images/plots/logistic-regression/pygal/plot.png
preview_thumb: https://storage.googleapis.com/pyplots-images/plots/logistic-regression/pygal/plot_thumb.png
preview_html: https://storage.googleapis.com/pyplots-images/plots/logistic-regression/pygal/plot.html
quality_score: 91
review:
  strengths:
  - Excellent sigmoid curve representation with clear S-shape transition
  - Good use of custom pygal Style with appropriate font sizes for large canvas
  - Realistic credit approval scenario with proper jittering of binary outcomes
  - Clean implementation of gradient descent for logistic regression without external
    dependencies
  - Decision threshold line clearly visible at p=0.5
  - Proper title format following pyplots.ai conventions
  weaknesses:
  - Confidence interval shown as dashed lines rather than semi-transparent shaded
    band (spec preference, though pygal has limited fill capabilities)
  - All blue elements (logistic curve, CI bounds) use same color making them less
    visually distinct
  - Legend uses square markers for all items instead of line symbols for line-based
    elements
  image_description: 'The plot displays a logistic regression visualization for credit
    approval based on credit score. It shows a characteristic S-shaped (sigmoid) curve
    in blue representing the fitted logistic model. The x-axis displays "Credit Score"
    ranging from 300 to 850, and the y-axis shows "Probability of Approval" from 0
    to 1. Red dots represent rejected applicants (Class 0), clustered near y=0 with
    jittering, while yellow/gold dots represent approved applicants (Class 1), clustered
    near y=1. Two dashed blue lines show the 95% confidence interval bounds around
    the main curve. A horizontal dashed gray line at y=0.5 indicates the decision
    threshold. The legend at the bottom shows six items: Logistic Fit, 95% CI Upper,
    95% CI Lower, Threshold (p=0.5), Rejected (0), and Approved (1). The title follows
    the correct format: "logistic-regression · pygal · pyplots.ai".'
  criteria_checklist:
    visual_quality:
      score: 36
      max: 40
      items:
      - id: VQ-01
        name: Text Legibility
        score: 9
        max: 10
        passed: true
        comment: All text readable, good font sizes for large canvas
      - id: VQ-02
        name: No Overlap
        score: 8
        max: 8
        passed: true
        comment: No overlapping text elements
      - id: VQ-03
        name: Element Visibility
        score: 7
        max: 8
        passed: true
        comment: Markers visible with good size, though same color for all blue elements
          makes CI bounds less distinct
      - id: VQ-04
        name: Color Accessibility
        score: 4
        max: 5
        passed: true
        comment: Red/yellow distinction is clear, but using same blue for all lines
          reduces differentiation
      - id: VQ-05
        name: Layout Balance
        score: 5
        max: 5
        passed: true
        comment: Good use of canvas, balanced margins
      - id: VQ-06
        name: Axis Labels
        score: 2
        max: 2
        passed: true
        comment: Clear descriptive labels Credit Score and Probability of Approval
      - id: VQ-07
        name: Grid & Legend
        score: 1
        max: 2
        passed: true
        comment: Legend at bottom is functional but legend items use colored squares
          rather than line symbols
    spec_compliance:
      score: 24
      max: 25
      items:
      - id: SC-01
        name: Plot Type
        score: 8
        max: 8
        passed: true
        comment: Correct logistic regression curve with sigmoid shape
      - id: SC-02
        name: Data Mapping
        score: 5
        max: 5
        passed: true
        comment: x=credit score, y=probability/binary outcome correctly mapped
      - id: SC-03
        name: Required Features
        score: 4
        max: 5
        passed: true
        comment: Has curve, CI, threshold line, jittered points; spec suggested semi-transparent
          shading for CI but implementation uses dashed lines
      - id: SC-04
        name: Data Range
        score: 3
        max: 3
        passed: true
        comment: Axes show full data range appropriately
      - id: SC-05
        name: Legend Accuracy
        score: 2
        max: 2
        passed: true
        comment: Legend labels correctly identify all elements
      - id: SC-06
        name: Title Format
        score: 2
        max: 2
        passed: true
        comment: Correct format logistic-regression · pygal · pyplots.ai
    data_quality:
      score: 19
      max: 20
      items:
      - id: DQ-01
        name: Feature Coverage
        score: 7
        max: 8
        passed: true
        comment: Shows both classes, sigmoid transition, but density of rejected cases
          at higher scores could be more balanced
      - id: DQ-02
        name: Realistic Context
        score: 7
        max: 7
        passed: true
        comment: Credit score approval is a real, neutral, comprehensible scenario
      - id: DQ-03
        name: Appropriate Scale
        score: 5
        max: 5
        passed: true
        comment: Credit scores 300-850 and probability 0-1 are realistic ranges
    code_quality:
      score: 9
      max: 10
      items:
      - id: CQ-01
        name: KISS Structure
        score: 3
        max: 3
        passed: true
        comment: Flat script structure, no functions/classes
      - id: CQ-02
        name: Reproducibility
        score: 3
        max: 3
        passed: true
        comment: np.random.seed(42) set
      - id: CQ-03
        name: Clean Imports
        score: 2
        max: 2
        passed: true
        comment: Only necessary imports used
      - id: CQ-04
        name: No Deprecated API
        score: 1
        max: 1
        passed: true
        comment: Uses current pygal API
      - id: CQ-05
        name: Output Correct
        score: 0
        max: 1
        passed: false
        comment: Saves both plot.png and plot.html which is correct for pygal
    library_features:
      score: 3
      max: 5
      items:
      - id: LF-01
        name: Distinctive Features
        score: 3
        max: 5
        passed: true
        comment: Uses XY chart, custom Style, stroke_style, but CI as dashed lines
          rather than filled area (pygal limitation)
  verdict: APPROVED
impl_tags:
  dependencies: []
  techniques:
  - html-export
  patterns:
  - data-generation
  - iteration-over-groups
  dataprep:
  - normalization
  - regression
  styling:
  - alpha-blending
  - grid-styling
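
The checklist above is meant to roll up: item scores sum to each category score, and category scores sum to `quality_score`. A minimal sketch of that consistency check follows, with the numbers transcribed by hand from the metadata. The `CHECKLIST` layout and the `check_totals` helper are hypothetical and not part of the repository.

```python
# Category scores, maxima, and per-item scores transcribed from the metadata above.
CHECKLIST = {
    "visual_quality": {"score": 36, "max": 40, "items": [9, 8, 7, 4, 5, 2, 1]},
    "spec_compliance": {"score": 24, "max": 25, "items": [8, 5, 4, 3, 2, 2]},
    "data_quality": {"score": 19, "max": 20, "items": [7, 7, 5]},
    "code_quality": {"score": 9, "max": 10, "items": [3, 3, 2, 1, 0]},
    "library_features": {"score": 3, "max": 5, "items": [3]},
}
QUALITY_SCORE = 91


def check_totals(checklist, quality_score):
    """Return a list of mismatch messages; an empty list means the scores add up."""
    problems = []
    for name, category in checklist.items():
        item_sum = sum(category["items"])
        if item_sum != category["score"]:
            problems.append(f"{name}: items sum to {item_sum}, category says {category['score']}")
    total = sum(category["score"] for category in checklist.values())
    if total != quality_score:
        problems.append(f"categories sum to {total}, quality_score says {quality_score}")
    return problems


problems = check_totals(CHECKLIST, QUALITY_SCORE)
max_total = sum(category["max"] for category in CHECKLIST.values())
```

A check like this only confirms the arithmetic. It would not flag an item such as CQ-05, whose comment reads as a pass while `passed` is false and the score is 0.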
