
Commit d30b48c

feat(bokeh): implement line-retention-cohort (#4929)
## Implementation: `line-retention-cohort` - bokeh

Implements the **bokeh** version of `line-retention-cohort`.

**File:** `plots/line-retention-cohort/implementations/bokeh.py`
**Parent Issue:** #4572

---

:robot: *[impl-generate workflow](https://github.com/MarkusNeusinger/pyplots/actions/runs/23164943124)*

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
1 parent fd724d1 commit d30b48c

2 files changed

Lines changed: 380 additions & 0 deletions


Lines changed: 148 additions & 0 deletions
@@ -0,0 +1,148 @@
""" pyplots.ai
line-retention-cohort: User Retention Curve by Cohort
Library: bokeh 3.9.0 | Python 3.14.3
Quality: 90/100 | Created: 2026-03-16
"""

import numpy as np
from bokeh.io import export_png, output_file, save
from bokeh.models import ColumnDataSource, HoverTool, Label, Legend, Span
from bokeh.plotting import figure


# Data
np.random.seed(42)
weeks = np.arange(0, 13)

cohorts = {
    "Jan 2025": {"size": 1245, "decay": 0.18},
    "Feb 2025": {"size": 1380, "decay": 0.16},
    "Mar 2025": {"size": 1520, "decay": 0.14},
    "Apr 2025": {"size": 1410, "decay": 0.12},
    "May 2025": {"size": 1680, "decay": 0.10},
}

retention_data = {}
for cohort, params in cohorts.items():
    base = 100 * np.exp(-params["decay"] * weeks)
    noise = np.random.normal(0, 1.5, len(weeks))
    retention = np.clip(base + noise, 0, 100)
    retention[0] = 100.0
    retention_data[cohort] = retention

# Plot: diverse hue palette (colorblind-safe)
colors = ["#D4A03C", "#2A9D8F", "#306998", "#7B4F9E", "#1A4D6E"]
line_widths = [3, 3.5, 4, 4.5, 5]
alphas = [0.70, 0.75, 0.82, 0.90, 1.0]

p = figure(
    width=4800,
    height=2700,
    title="line-retention-cohort · bokeh · pyplots.ai",
    x_axis_label="Weeks Since Signup",
    y_axis_label="Retention Rate (%)",
)

legend_items = []
for i, (cohort, params) in enumerate(cohorts.items()):
    source = ColumnDataSource(
        data={
            "week": weeks,
            "retention": retention_data[cohort],
            "cohort": [cohort] * len(weeks),
            "size": [params["size"]] * len(weeks),
            "retention_fmt": [f"{r:.1f}" for r in retention_data[cohort]],
        }
    )
    label = f"{cohort} (n={params['size']:,})"

    line = p.line(
        x="week", y="retention", source=source, line_width=line_widths[i], line_color=colors[i], line_alpha=alphas[i]
    )
    scatter = p.scatter(
        x="week",
        y="retention",
        source=source,
        size=12 + i * 2,
        fill_color=colors[i],
        fill_alpha=alphas[i],
        line_color="white",
        line_width=2,
    )
    legend_items.append((label, [line, scatter]))

# HoverTool for interactive HTML output
hover = HoverTool(
    tooltips=[("Cohort", "@cohort"), ("Week", "@week"), ("Retention", "@retention_fmt%"), ("Cohort Size", "@size{,}")],
    mode="mouse",
)
p.add_tools(hover)

# Reference line at 20% retention threshold
threshold = Span(
    location=20, dimension="width", line_color="#888888", line_dash="dashed", line_width=2.5, line_alpha=0.6
)
p.add_layout(threshold)

# Label for the threshold line
threshold_label = Label(
    x=12,
    y=20,
    text="20% Threshold",
    text_font_size="20pt",
    text_color="#666666",
    x_offset=-10,
    y_offset=8,
    text_align="right",
)
p.add_layout(threshold_label)

# Legend
legend = Legend(items=legend_items, location="top_right")
legend.label_text_font_size = "20pt"
legend.glyph_height = 30
legend.glyph_width = 30
legend.spacing = 12
legend.padding = 20
legend.background_fill_alpha = 0.85
legend.background_fill_color = "white"
legend.border_line_alpha = 0.2
legend.border_line_color = "#cccccc"
p.add_layout(legend)

# Style
p.title.text_font_size = "42pt"
p.title.text_color = "#2c3e50"
p.xaxis.axis_label_text_font_size = "32pt"
p.yaxis.axis_label_text_font_size = "32pt"
p.xaxis.major_label_text_font_size = "24pt"
p.yaxis.major_label_text_font_size = "24pt"
p.xaxis.axis_label_text_color = "#444444"
p.yaxis.axis_label_text_color = "#444444"

p.y_range.start = 0
p.y_range.end = 105
p.x_range.start = -0.3
p.x_range.end = 12.3

p.ygrid.grid_line_alpha = 0.15
p.ygrid.grid_line_dash = "dashed"
p.xgrid.grid_line_alpha = 0

p.background_fill_color = "#f8f9fa"
p.border_fill_color = "white"

p.axis.axis_line_width = 2
p.axis.axis_line_color = "#333333"
p.axis.major_tick_line_width = 2
p.axis.minor_tick_line_width = 0

p.toolbar_location = None

# Save PNG (toolbar hidden)
export_png(p, filename="plot.png")

# Save interactive HTML with toolbar
p.toolbar_location = "above"
output_file("plot.html")
save(p)
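
The data section of the file above builds each cohort's curve as exponential decay plus Gaussian noise, clipped to the 0-100 range and pinned to 100% at week 0. A minimal standard-library sketch of that step follows. The helper name `retention_curve` and the use of `random.Random` in place of NumPy are assumptions made for illustration, so the noise draws will not match the values produced by the committed script.

```python
import math
import random


def retention_curve(decay, rng, n_weeks=13, noise_sd=1.5):
    """Return weekly retention percentages for one cohort (hypothetical helper)."""
    curve = []
    for week in range(n_weeks):
        base = 100 * math.exp(-decay * week)        # exponential decay from 100%
        noisy = base + rng.gauss(0, noise_sd)       # add Gaussian noise
        curve.append(min(100.0, max(0.0, noisy)))   # clip to the valid 0-100 range
    curve[0] = 100.0  # every cohort starts fully retained at week 0
    return curve


# One shared generator, mirroring the single seed used in the committed file.
rng = random.Random(42)

# Decay rates taken from the cohorts dict in the committed file.
decays = {"Jan 2025": 0.18, "May 2025": 0.10}
curves = {name: retention_curve(d, rng) for name, d in decays.items()}
```

Without noise, the week-12 value is about 11.5% for a decay of 0.18 and about 30.1% for a decay of 0.10, which is roughly in line with the range the review metadata reports for the final week.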
Lines changed: 232 additions & 0 deletions
@@ -0,0 +1,232 @@
library: bokeh
specification_id: line-retention-cohort
created: '2026-03-16T20:44:38Z'
updated: '2026-03-16T21:02:56Z'
generated_by: claude-opus-4-5-20251101
workflow_run: 23164943124
issue: 4572
python_version: 3.14.3
library_version: 3.9.0
preview_url: https://storage.googleapis.com/pyplots-images/plots/line-retention-cohort/bokeh/plot.png
preview_thumb: https://storage.googleapis.com/pyplots-images/plots/line-retention-cohort/bokeh/plot_thumb.png
preview_html: https://storage.googleapis.com/pyplots-images/plots/line-retention-cohort/bokeh/plot.html
quality_score: 90
review:
  strengths:
  - Excellent spec compliance; every requirement from the specification is implemented
  - Strong visual hierarchy through progressive line width and opacity that emphasizes
    newer cohorts
  - Idiomatic Bokeh usage with ColumnDataSource, HoverTool, and dual PNG/HTML output
  - Custom colorblind-safe palette with warm-to-cool progression
  - Clean, well-structured code with good reproducibility
  weaknesses:
  - Legend and threshold label text could be slightly larger for better readability
    on the large canvas
  - Axis spines are still fully present; removing top/right spines would improve
    visual refinement
  - Cohort curves converge at later weeks, making individual lines harder to distinguish
  image_description: The plot displays 5 user retention cohort curves (Jan–May 2025)
    on a light gray background. All curves start at 100% retention at week 0 and decay
    over 12 weeks. Colors progress from warm gold/amber (Jan 2025) through teal (Feb),
    dark blue (Mar), purple (Apr), to dark navy (May 2025). Line widths increase and
    opacity grows from older to newer cohorts, visually emphasizing the most recent
    cohort. Each data point has a circular marker with a white edge. A horizontal
    dashed gray line at 20% marks a retention threshold, labeled "20% Threshold" near
    the right edge. The legend in the top-right corner lists all cohorts with sample
    sizes (e.g., "Jan 2025 (n=1,245)"). Y-axis runs from 0 to ~105% with subtle dashed
    gridlines; X-axis shows "Weeks Since Signup" from 0 to 12. Title reads "line-retention-cohort
    · bokeh · pyplots.ai" in dark text at top-left.
  criteria_checklist:
    visual_quality:
      score: 28
      max: 30
      items:
      - id: VQ-01
        name: Text Legibility
        score: 7
        max: 8
        passed: true
        comment: All font sizes explicitly set (title 42pt, labels 32pt, ticks 24pt,
          legend 20pt). Legend and threshold label readable but slightly small for
          canvas size.
      - id: VQ-02
        name: No Overlap
        score: 6
        max: 6
        passed: true
        comment: No text overlaps. Legend positioned clear of data.
      - id: VQ-03
        name: Element Visibility
        score: 5
        max: 6
        passed: true
        comment: Lines and markers clearly visible. Curves converge at later weeks
          making distinction harder.
      - id: VQ-04
        name: Color Accessibility
        score: 4
        max: 4
        passed: true
        comment: Gold, teal, blue, purple, navy palette avoids red-green confusion.
      - id: VQ-05
        name: Layout & Canvas
        score: 4
        max: 4
        passed: true
        comment: Plot fills canvas well with balanced margins.
      - id: VQ-06
        name: Axis Labels & Title
        score: 2
        max: 2
        passed: true
        comment: 'Descriptive labels with units: Weeks Since Signup, Retention Rate
          (%).'
    design_excellence:
      score: 14
      max: 20
      items:
      - id: DE-01
        name: Aesthetic Sophistication
        score: 6
        max: 8
        passed: true
        comment: Custom palette with warm-to-cool progression. Intentional hierarchy
          through line width and opacity. Above defaults but not FiveThirtyEight-level.
      - id: DE-02
        name: Visual Refinement
        score: 4
        max: 6
        passed: true
        comment: Subtle y-grid (dashed, alpha 0.15), x-grid removed, light gray background.
          Axis spines still fully present.
      - id: DE-03
        name: Data Storytelling
        score: 4
        max: 6
        passed: true
        comment: Progressive line width/opacity emphasizes newer cohorts. 20% threshold
          adds business context. Data shows improving retention over time.
    spec_compliance:
      score: 15
      max: 15
      items:
      - id: SC-01
        name: Plot Type
        score: 5
        max: 5
        passed: true
        comment: Correct line chart showing retention curves by cohort.
      - id: SC-02
        name: Required Features
        score: 4
        max: 4
        passed: true
        comment: 'All spec requirements implemented: 100% start, distinct colors,
          legend with sizes, gridlines, reference line.'
      - id: SC-03
        name: Data Mapping
        score: 3
        max: 3
        passed: true
        comment: X=weeks since signup, Y=retention rate percentage. Correct mapping.
      - id: SC-04
        name: Title & Legend
        score: 3
        max: 3
        passed: true
        comment: Title follows spec-id · library · pyplots.ai format. Legend labels
          use cohort (n=size) format.
    data_quality:
      score: 14
      max: 15
      items:
      - id: DQ-01
        name: Feature Coverage
        score: 5
        max: 6
        passed: true
        comment: 5 cohorts with varying decay rates. Good spread but could show more
          variation in curve shapes.
      - id: DQ-02
        name: Realistic Context
        score: 5
        max: 5
        passed: true
        comment: Monthly signup cohorts for a product. Cohort sizes 1,245-1,680 are
          plausible for mid-size SaaS.
      - id: DQ-03
        name: Appropriate Scale
        score: 4
        max: 4
        passed: true
        comment: Retention decays realistically from 100%. Final values 12-32% at
          week 12 are consistent with typical retention curves.
    code_quality:
      score: 10
      max: 10
      items:
      - id: CQ-01
        name: KISS Structure
        score: 3
        max: 3
        passed: true
        comment: Clean Imports → Data → Plot → Save structure. No functions or classes.
      - id: CQ-02
        name: Reproducibility
        score: 2
        max: 2
        passed: true
        comment: np.random.seed(42) set.
      - id: CQ-03
        name: Clean Imports
        score: 2
        max: 2
        passed: true
        comment: All imports used including output_file/save for HTML export.
      - id: CQ-04
        name: Code Elegance
        score: 2
        max: 2
        passed: true
        comment: Clean, well-organized code with appropriate complexity.
      - id: CQ-05
        name: Output & API
        score: 1
        max: 1
        passed: true
        comment: Saves as plot.png via export_png. Also saves interactive HTML.
    library_mastery:
      score: 9
      max: 10
      items:
      - id: LM-01
        name: Idiomatic Usage
        score: 5
        max: 5
        passed: true
        comment: Excellent use of ColumnDataSource, figure API, Legend model, Span,
          Label, HoverTool.
      - id: LM-02
        name: Distinctive Features
        score: 4
        max: 5
        passed: true
        comment: HoverTool with tooltips, dual PNG+HTML output, ColumnDataSource data
          binding, Span for reference lines.
  verdict: APPROVED
impl_tags:
  dependencies: []
  techniques:
  - hover-tooltips
  - custom-legend
  - annotations
  - html-export
  patterns:
  - data-generation
  - iteration-over-groups
  - columndatasource
  dataprep: []
  styling:
  - alpha-blending
  - edge-highlighting
  - grid-styling
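
The `quality_score` of 90 recorded in the metadata above matches the sum of the six category scores under `criteria_checklist`. The arithmetic can be checked with a short sketch; the dictionary below is transcribed by hand from the YAML, and the variable names are illustrative rather than part of the repository.

```python
# (score, max) per category, copied from criteria_checklist in the metadata.
categories = {
    "visual_quality": (28, 30),
    "design_excellence": (14, 20),
    "spec_compliance": (15, 15),
    "data_quality": (14, 15),
    "code_quality": (10, 10),
    "library_mastery": (9, 10),
}

# Sum the awarded points and the available points across all categories.
total = sum(score for score, _ in categories.values())
maximum = sum(cap for _, cap in categories.values())

print(f"{total}/{maximum}")  # prints "90/100"
```

Most of the ten points lost come from `design_excellence` (6 of 10), which lines up with the weaknesses listed in the review.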
