Commit 3241306

feat(letsplot): implement line-retention-cohort (#4928)
## Implementation: `line-retention-cohort` - letsplot

Implements the **letsplot** version of `line-retention-cohort`.

**File:** `plots/line-retention-cohort/implementations/letsplot.py`

**Parent Issue:** #4572

---

:robot: *[impl-generate workflow](https://github.com/MarkusNeusinger/pyplots/actions/runs/23164943466)*

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
1 parent 80354c5 commit 3241306

2 files changed

Lines changed: 333 additions & 0 deletions

Lines changed: 108 additions & 0 deletions
@@ -0,0 +1,108 @@
""" pyplots.ai
line-retention-cohort: User Retention Curve by Cohort
Library: letsplot 4.9.0 | Python 3.14.3
Quality: 91/100 | Created: 2026-03-16
"""

import numpy as np
import pandas as pd
from lets_plot import *


LetsPlot.setup_html()

# Data: Monthly signup cohorts tracked weekly for 12 weeks
np.random.seed(42)
weeks = np.arange(0, 13)

cohorts = {
    "Jan 2025": {"size": 1245, "decay": 0.18},
    "Feb 2025": {"size": 1102, "decay": 0.16},
    "Mar 2025": {"size": 1380, "decay": 0.14},
    "Apr 2025": {"size": 1510, "decay": 0.12},
    "May 2025": {"size": 1425, "decay": 0.10},
}

rows = []
for cohort_name, params in cohorts.items():
    retention = 100 * np.exp(-params["decay"] * weeks)
    noise = np.random.normal(0, 1.5, len(weeks))
    noise[0] = 0
    retention = np.clip(retention + noise, 0, 100)
    retention[0] = 100.0
    label = f"{cohort_name} (n={params['size']:,})"
    for w, r in zip(weeks, retention):
        rows.append({"Week": w, "Retention": r, "Cohort": label})

df = pd.DataFrame(rows)

# Endpoint labels: last data point per cohort, with nudge to avoid overlap
endpoints = df[df["Week"] == 12].copy()
endpoints["label"] = endpoints["Retention"].apply(lambda x: f"{x:.0f}%")
# Adjust y positions to prevent label overlap (spread close values apart)
sorted_ep = endpoints.sort_values("Retention").reset_index(drop=True)
min_gap = 3.5
for i in range(1, len(sorted_ep)):
    if sorted_ep.loc[i, "Retention"] - sorted_ep.loc[i - 1, "Retention"] < min_gap:
        sorted_ep.loc[i, "Retention"] = sorted_ep.loc[i - 1, "Retention"] + min_gap
endpoints = sorted_ep

# Colorblind-friendly palette with distinct hues (oldest=lightest, newest=boldest)
colors = ["#A6CEE3", "#B2DF8A", "#FDBF6F", "#E31A1C", "#306998"]

# Line widths: older cohorts thinner, newer cohorts bolder
line_widths = [1.5, 1.8, 2.0, 2.5, 3.0]

# Build plot with per-cohort layers for varying line widths
cohort_labels = df["Cohort"].unique().tolist()

plot = ggplot()

# Add lines and points per cohort with distinct widths
for i, cohort_label in enumerate(cohort_labels):
    cdf = df[df["Cohort"] == cohort_label]
    plot = plot + geom_line(
        aes(x="Week", y="Retention", color="Cohort"),
        data=cdf,
        size=line_widths[i],
        alpha=0.9,
        tooltips=layer_tooltips().line("@Cohort").line("Week @Week").line("Retention @Retention{.1f}%"),
    )

plot = (
    plot
    + geom_point(aes(x="Week", y="Retention", color="Cohort"), data=df, size=4, alpha=0.85)
    + geom_hline(yintercept=20, linetype="dashed", color="#999999", size=0.8)
    + geom_text(
        aes(x="Week", y="Retention", label="label", color="Cohort"), data=endpoints, size=14, nudge_x=0.6, hjust=0
    )
    + geom_text(
        aes(x="x", y="y", label="label"),
        data=pd.DataFrame({"x": [0.2], "y": [20], "label": ["20% threshold"]}),
        size=12,
        color="#999999",
        hjust=0,
        vjust=-1.2,
    )
    + scale_color_manual(values=colors)
    + scale_x_continuous(breaks=list(range(0, 13, 2)), limits=[0, 14.5])
    + scale_y_continuous(breaks=list(range(0, 101, 20)), limits=[0, 105])
    + labs(title="line-retention-cohort · letsplot · pyplots.ai", x="Weeks Since Signup", y="Retained Users (%)")
    + theme_minimal()
    + theme(
        plot_title=element_text(size=28, hjust=0.5, face="bold"),
        axis_title=element_text(size=22),
        axis_text=element_text(size=18),
        legend_title=element_blank(),
        legend_text=element_text(size=16),
        legend_position="right",
        panel_grid_major=element_line(color="#EBEBEB", size=0.4),
        panel_grid_minor=element_blank(),
        plot_background=element_rect(color="white", fill="white"),
    )
    + ggsize(1600, 900)
)

# Save
ggsave(plot, "plot.png", path=".", scale=3)
ggsave(plot, "plot.html", path=".")
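
The endpoint-label adjustment in the script above can be looked at in isolation. The following is a minimal sketch using only the standard library, not the committed code: `spread_labels` is a hypothetical helper name, and the inputs are the noise-free week-12 values implied by the five decay rates, so they will differ slightly from the noisy values the script actually plots.

```python
import math


def spread_labels(values, min_gap=3.5):
    """Return label y-positions in ascending order, each at least min_gap apart.

    Hypothetical helper mirroring the script's loop: walk the sorted values
    and push any value that sits too close to its lower neighbour upwards.
    """
    positions = sorted(values)
    for i in range(1, len(positions)):
        if positions[i] - positions[i - 1] < min_gap:
            positions[i] = positions[i - 1] + min_gap
    return positions


# Decay rates taken from the script's `cohorts` dict (Jan to May 2025).
decays = [0.18, 0.16, 0.14, 0.12, 0.10]

# Noise-free retention at week 12: 100 * exp(-decay * 12).
week12 = [100 * math.exp(-d * 12) for d in decays]

# Only the y-position moves; in the script the label text is computed
# before the adjustment, so it still shows the unadjusted percentage.
label_y = spread_labels(week12)

for raw, y in zip(sorted(week12), label_y):
    print(f"value {raw:5.2f}% -> label y {y:5.2f}")
```

Because the pass only ever moves labels upward, a run of tightly packed values can stack toward the top of the panel; the script's `limits=[0, 105]` leaves headroom for that with five cohorts.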
Lines changed: 225 additions & 0 deletions
@@ -0,0 +1,225 @@
library: letsplot
specification_id: line-retention-cohort
created: '2026-03-16T20:44:20Z'
updated: '2026-03-16T20:57:20Z'
generated_by: claude-opus-4-5-20251101
workflow_run: 23164943466
issue: 4572
python_version: 3.14.3
library_version: 4.9.0
preview_url: https://storage.googleapis.com/pyplots-images/plots/line-retention-cohort/letsplot/plot.png
preview_thumb: https://storage.googleapis.com/pyplots-images/plots/line-retention-cohort/letsplot/plot_thumb.png
preview_html: https://storage.googleapis.com/pyplots-images/plots/line-retention-cohort/letsplot/plot.html
quality_score: 91
review:
  strengths:
  - Excellent data storytelling through progressive line weights and color intensity
    emphasizing newer cohorts
  - Endpoint labels with overlap prevention provide clear final-retention context
  - 20% threshold reference line adds analytical value
  - Full spec compliance with all required features implemented
  - Interactive tooltips leverage lets-plot distinctive capabilities
  weaknesses:
  - Endpoint labels for lower cohorts (12%, 15%, 17%) are slightly tight despite overlap
    prevention
  image_description: 'The plot displays 5 retention curves for monthly signup cohorts
    (Jan–May 2025) on a clean white background. All curves start at 100% at week 0
    and decay over 12 weeks with exponential profiles. Colors progress from light
    blue (Jan 2025, oldest) through green (Feb), orange (Mar), red (Apr), to dark
    navy blue (May 2025, newest). Newer cohorts have thicker lines, creating clear
    visual hierarchy. Data points are marked along each curve. Endpoint percentage
    labels (12%, 15%, 17%, 23%, 31%) are displayed at week 12, color-matched to their
    respective cohorts. A dashed gray horizontal line at y=20 marks the "20% threshold"
    benchmark. The legend on the right lists each cohort with sample size (e.g., "Jan
    2025 (n=1,245)"). Title reads "line-retention-cohort · letsplot · pyplots.ai".
    X-axis: "Weeks Since Signup", Y-axis: "Retained Users (%)". Subtle light gray
    major gridlines on a minimal theme.'
  criteria_checklist:
    visual_quality:
      score: 29
      max: 30
      items:
      - id: VQ-01
        name: Text Legibility
        score: 8
        max: 8
        passed: true
        comment: 'All font sizes explicitly set: title=28, axis_title=22, axis_text=18,
          legend_text=16'
      - id: VQ-02
        name: No Overlap
        score: 5
        max: 6
        passed: true
        comment: Endpoint labels use overlap prevention but 12%/15%/17% are still
          fairly tight
      - id: VQ-03
        name: Element Visibility
        score: 6
        max: 6
        passed: true
        comment: Lines well-sized with progressive widths 1.5-3.0, points at size=4
          clearly visible
      - id: VQ-04
        name: Color Accessibility
        score: 4
        max: 4
        passed: true
        comment: 'Distinct colorblind-friendly palette: light blue, green, orange,
          red, dark blue'
      - id: VQ-05
        name: Layout & Canvas
        score: 4
        max: 4
        passed: true
        comment: Plot fills canvas well, x-axis extended to accommodate endpoint labels
      - id: VQ-06
        name: Axis Labels & Title
        score: 2
        max: 2
        passed: true
        comment: 'Descriptive labels with units: Weeks Since Signup, Retained Users
          (%)'
    design_excellence:
      score: 15
      max: 20
      items:
      - id: DE-01
        name: Aesthetic Sophistication
        score: 6
        max: 8
        passed: true
        comment: Custom palette with light-to-bold progression, endpoint labels and
          threshold annotation add polish
      - id: DE-02
        name: Visual Refinement
        score: 4
        max: 6
        passed: true
        comment: theme_minimal(), subtle grid, no minor grid, white background
      - id: DE-03
        name: Data Storytelling
        score: 5
        max: 6
        passed: true
        comment: Strong visual hierarchy through line width/color, endpoint labels,
          threshold reference line
    spec_compliance:
      score: 15
      max: 15
      items:
      - id: SC-01
        name: Plot Type
        score: 5
        max: 5
        passed: true
        comment: Correct line chart with multiple cohort retention curves
      - id: SC-02
        name: Required Features
        score: 4
        max: 4
        passed: true
        comment: All spec features present including threshold line, varying line
          thickness, legend with sizes
      - id: SC-03
        name: Data Mapping
        score: 3
        max: 3
        passed: true
        comment: X=weeks since signup, Y=retention percentage, correctly mapped
      - id: SC-04
        name: Title & Legend
        score: 3
        max: 3
        passed: true
        comment: Title format correct, legend labels match spec format with cohort
          size
    data_quality:
      score: 15
      max: 15
      items:
      - id: DQ-01
        name: Feature Coverage
        score: 6
        max: 6
        passed: true
        comment: 5 cohorts with different decay rates showing clear variation
      - id: DQ-02
        name: Realistic Context
        score: 5
        max: 5
        passed: true
        comment: Monthly signup cohorts with realistic sizes and plausible retention
          decay rates
      - id: DQ-03
        name: Appropriate Scale
        score: 4
        max: 4
        passed: true
        comment: Retention values 12-31% at week 12, cohort sizes 1102-1510 realistic
    code_quality:
      score: 10
      max: 10
      items:
      - id: CQ-01
        name: KISS Structure
        score: 3
        max: 3
        passed: true
        comment: Clean imports-data-plot-save flow
      - id: CQ-02
        name: Reproducibility
        score: 2
        max: 2
        passed: true
        comment: np.random.seed(42) set
      - id: CQ-03
        name: Clean Imports
        score: 2
        max: 2
        passed: true
        comment: 'All imports used: numpy, pandas, lets_plot'
      - id: CQ-04
        name: Code Elegance
        score: 2
        max: 2
        passed: true
        comment: Clean code with thoughtful endpoint label overlap prevention
      - id: CQ-05
        name: Output & API
        score: 1
        max: 1
        passed: true
        comment: Saves as plot.png with scale=3 and plot.html
    library_mastery:
      score: 7
      max: 10
      items:
      - id: LM-01
        name: Idiomatic Usage
        score: 4
        max: 5
        passed: true
        comment: Good ggplot grammar usage, per-group loop justified for varying line
          widths
      - id: LM-02
        name: Distinctive Features
        score: 3
        max: 5
        passed: true
        comment: Uses layer_tooltips() for interactive hover, HTML export, ggsize()
  verdict: APPROVED
impl_tags:
  dependencies: []
  techniques:
  - annotations
  - layer-composition
  - hover-tooltips
  - html-export
  patterns:
  - data-generation
  - iteration-over-groups
  dataprep: []
  styling:
  - alpha-blending
  - grid-styling
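
As a quick sanity check on the metadata above, the category scores in `criteria_checklist` can be tallied against `quality_score`. This is a small sketch with the numbers copied by hand from the YAML; the variable names are illustrative and not part of the pyplots tooling.

```python
# (score, max) per category, copied from criteria_checklist above.
categories = {
    "visual_quality": (29, 30),
    "design_excellence": (15, 20),
    "spec_compliance": (15, 15),
    "data_quality": (15, 15),
    "code_quality": (10, 10),
    "library_mastery": (7, 10),
}

total = sum(score for score, _ in categories.values())
maximum = sum(cap for _, cap in categories.values())

# Item-level scores for one category (VQ-01 through VQ-06) should add up
# to that category's score.
visual_items = [8, 5, 6, 4, 4, 2]

print(f"{total}/{maximum}")  # prints 91/100
print(sum(visual_items) == categories["visual_quality"][0])  # prints True
```

The total matches the `quality_score: 91` field and the `Quality: 91/100` line in the script's docstring.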
