Commit 498f2df

feat(altair): implement heatmap-cohort-retention (#4939)
## Implementation: `heatmap-cohort-retention` - altair

Implements the **altair** version of `heatmap-cohort-retention`.

**File:** `plots/heatmap-cohort-retention/implementations/altair.py`

**Parent Issue:** #4570

---

:robot: *[impl-generate workflow](https://github.com/MarkusNeusinger/pyplots/actions/runs/23165007983)*

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
1 parent d30b48c commit 498f2df

2 files changed

Lines changed: 398 additions & 0 deletions

Lines changed: 171 additions & 0 deletions
@@ -0,0 +1,171 @@
""" pyplots.ai
heatmap-cohort-retention: Cohort Retention Heatmap
Library: altair 6.0.0 | Python 3.14.3
Quality: 90/100 | Created: 2026-03-16
"""

import altair as alt
import numpy as np
import pandas as pd


# Data
np.random.seed(42)

cohort_labels = [
    "Jan 2024",
    "Feb 2024",
    "Mar 2024",
    "Apr 2024",
    "May 2024",
    "Jun 2024",
    "Jul 2024",
    "Aug 2024",
    "Sep 2024",
    "Oct 2024",
]
n_cohorts = len(cohort_labels)
n_periods = 10
cohort_sizes = np.random.randint(800, 2500, size=n_cohorts)

# Build retention data with realistic decay patterns
rows = []
for i, cohort in enumerate(cohort_labels):
    # Later cohorts have fewer observed months, which gives the triangular shape
    max_periods = n_cohorts - i
    for period in range(max_periods):
        if period == 0:
            retention = 100.0
        elif period == 1:
            retention = np.random.uniform(55, 72)
        else:
            # Decay from the previous period of the same cohort, plus small noise
            decay = np.random.uniform(0.85, 0.95)
            retention = rows[-1]["retention_rate"] * decay
            retention += np.random.uniform(-2, 2)
        retention = max(5, min(retention, 100))
        rows.append(
            {
                "cohort": cohort,
                "cohort_label": f"{cohort} (n={cohort_sizes[i]:,})",
                "period": period,
                "period_label": f"Month {period}",
                "retention_rate": round(retention, 1),
            }
        )

df = pd.DataFrame(rows)

# Sort orders
cohort_order = [f"{c} (n={s:,})" for c, s in zip(cohort_labels, cohort_sizes, strict=True)]
period_order = [f"Month {p}" for p in range(n_periods)]

# Custom sequential palette: light gray through teal to dark navy
color_domain = [0, 20, 40, 60, 80, 100]
color_range = ["#f7f7f7", "#d4e8e0", "#7bc8b5", "#2a9d8f", "#264653", "#1d3557"]

# Heatmap rectangles
heatmap = (
    alt.Chart(df)
    .mark_rect(stroke="#e8e8e8", strokeWidth=1.5, cornerRadius=3)
    .encode(
        x=alt.X(
            "period_label:O",
            title="Months Since Signup",
            sort=period_order,
            axis=alt.Axis(
                labelFontSize=17,
                titleFontSize=22,
                titleFontWeight="bold",
                labelAngle=0,
                domainWidth=0,
                tickWidth=0,
                titlePadding=16,
                labelPadding=8,
            ),
        ),
        y=alt.Y(
            "cohort_label:O",
            title="Signup Cohort",
            sort=cohort_order,
            axis=alt.Axis(
                labelFontSize=17,
                titleFontSize=22,
                titleFontWeight="bold",
                domainWidth=0,
                tickWidth=0,
                titlePadding=16,
                labelPadding=8,
            ),
        ),
        color=alt.Color(
            "retention_rate:Q",
            scale=alt.Scale(domain=color_domain, range=color_range),
            legend=alt.Legend(
                title="Retention %",
                titleFontSize=18,
                titleFontWeight="bold",
                labelFontSize=16,
                gradientLength=400,
                gradientThickness=18,
                orient="right",
                offset=12,
            ),
        ),
        tooltip=[
            alt.Tooltip("cohort:N", title="Cohort"),
            alt.Tooltip("period_label:O", title="Period"),
            alt.Tooltip("retention_rate:Q", title="Retention %", format=".1f"),
        ],
    )
)

# Retention value annotations (the % suffix is drawn by a separate layer)
text = (
    alt.Chart(df)
    .mark_text(fontSize=15, fontWeight="bold")
    .encode(
        x=alt.X("period_label:O", sort=period_order),
        y=alt.Y("cohort_label:O", sort=cohort_order),
        text=alt.Text("retention_rate:Q", format=".1f"),
        color=alt.condition(alt.datum.retention_rate > 50, alt.value("white"), alt.value("#333333")),
    )
)

# Percent symbol as separate smaller text layer for polish
pct = (
    alt.Chart(df)
    .mark_text(fontSize=10, fontWeight="normal", dx=20)
    .encode(
        x=alt.X("period_label:O", sort=period_order),
        y=alt.Y("cohort_label:O", sort=cohort_order),
        text=alt.value("%"),
        color=alt.condition(
            alt.datum.retention_rate > 50, alt.value("rgba(255,255,255,0.7)"), alt.value("rgba(51,51,51,0.5)")
        ),
    )
)

# Combine
chart = (
    alt.layer(heatmap, text, pct)
    .properties(
        width=1400,
        height=900,
        title=alt.Title(
            "heatmap-cohort-retention · altair · pyplots.ai",
            fontSize=28,
            fontWeight="bold",
            anchor="middle",
            subtitle="Monthly SaaS user retention — earliest cohorts show strongest long-term engagement",
            subtitleFontSize=18,
            subtitleColor="#666666",
            subtitlePadding=8,
        ),
    )
    # configure() replaces the whole config object, so it must come before the
    # configure_*() helpers; otherwise the view stroke setting would be discarded.
    .configure(padding={"left": 20, "right": 20, "top": 20, "bottom": 20}, background="#ffffff")
    .configure_view(strokeWidth=0)
    .configure_axis(labelColor="#444444", titleColor="#333333")
)

# Save
chart.save("plot.png", scale_factor=3.0)
chart.save("plot.html")
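The six stops in `color_range` are meant to run from light to dark as retention rises. A minimal sketch for checking that, assuming sRGB relative luminance (the formula WCAG uses) is an acceptable proxy for perceived lightness; `relative_luminance` is a hypothetical helper and not part of the committed file:

```python
def relative_luminance(hex_color: str) -> float:
    """Relative luminance (0 = black, 1 = white) of an sRGB hex color like '#2a9d8f'."""
    hex_color = hex_color.lstrip("#")
    channels = [int(hex_color[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Undo the sRGB transfer curve to get linear-light channel values.
    linear = [c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4 for c in channels]
    r, g, b = linear
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


# Palette stops copied from the implementation (color domain 0..100).
color_range = ["#f7f7f7", "#d4e8e0", "#7bc8b5", "#2a9d8f", "#264653", "#1d3557"]
luminances = [relative_luminance(c) for c in color_range]

# A light-to-dark sequential palette should have strictly decreasing luminance.
is_sequential = all(a > b for a, b in zip(luminances, luminances[1:]))
print(is_sequential)
```

The same helper could also be used to probe whether the `retention_rate > 50` threshold for switching the text color to white sits at a sensible point in the gradient.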
Lines changed: 227 additions & 0 deletions
@@ -0,0 +1,227 @@
library: altair
specification_id: heatmap-cohort-retention
created: '2026-03-16T20:46:10Z'
updated: '2026-03-16T21:04:16Z'
generated_by: claude-opus-4-5-20251101
workflow_run: 23165007983
issue: 4570
python_version: 3.14.3
library_version: 6.0.0
preview_url: https://storage.googleapis.com/pyplots-images/plots/heatmap-cohort-retention/altair/plot.png
preview_thumb: https://storage.googleapis.com/pyplots-images/plots/heatmap-cohort-retention/altair/plot_thumb.png
preview_html: https://storage.googleapis.com/pyplots-images/plots/heatmap-cohort-retention/altair/plot.html
quality_score: 90
review:
  strengths:
  - Custom 6-stop sequential palette with sophisticated color progression from light
    to dark
  - Separate % symbol layer at smaller font size adds typographic polish
  - Informative subtitle that communicates the key data insight
  - 'Expert-level Altair idioms: conditional text color, layer composition, declarative
    encoding'
  - Perfect spec compliance with all required features present
  - Realistic SaaS retention data with plausible decay patterns
  weaknesses:
  - Chart width is 1400 instead of 1600, resulting in 4200px instead of the ideal
    4800px output width
  - The % suffix fontSize=10 (30px at 3x) is on the small side
  image_description: The plot displays a triangular cohort retention heatmap with
    10 monthly cohorts (Jan 2024 through Oct 2024) on the y-axis and months since
    signup (Month 0 through Month 9) on the x-axis. The cells use a custom sequential
    palette transitioning from light gray/white (low retention) through teals to dark
    navy (high retention). Each cell contains a bold retention percentage with a smaller
    "%" suffix. The triangular shape is correctly formed — Jan 2024 has all 10 periods
    while Oct 2024 has only 1. Cohort sizes (e.g., n=1,926) appear in parentheses
    next to each cohort label. A vertical color bar legend ("Retention %") is positioned
    on the right. The title reads "heatmap-cohort-retention · altair · pyplots.ai"
    with a subtitle "Monthly SaaS user retention — earliest cohorts show strongest
    long-term engagement". Cell borders are subtle light gray with rounded corners.
    The background is clean white.
  criteria_checklist:
    visual_quality:
      score: 28
      max: 30
      items:
      - id: VQ-01
        name: Text Legibility
        score: 7
        max: 8
        passed: true
        comment: All font sizes explicitly set (title=28, axis titles=22, tick labels=17,
          cell text=15). The % suffix at fontSize=10 is small but readable.
      - id: VQ-02
        name: No Overlap
        score: 6
        max: 6
        passed: true
        comment: No overlapping text. Separate % layer with dx offset avoids collision.
      - id: VQ-03
        name: Element Visibility
        score: 6
        max: 6
        passed: true
        comment: Rectangles well-sized with cornerRadius and subtle stroke borders.
      - id: VQ-04
        name: Color Accessibility
        score: 4
        max: 4
        passed: true
        comment: Sequential teal-to-navy palette is colorblind-safe with good luminance
          contrast.
      - id: VQ-05
        name: Layout & Canvas
        score: 3
        max: 4
        passed: true
        comment: Width=1400 (4200px at 3x) slightly narrower than ideal 4800px. Triangular
          shape leaves inherent empty space.
      - id: VQ-06
        name: Axis Labels & Title
        score: 2
        max: 2
        passed: true
        comment: 'Descriptive labels: Months Since Signup, Signup Cohort.'
    design_excellence:
      score: 14
      max: 20
      items:
      - id: DE-01
        name: Aesthetic Sophistication
        score: 6
        max: 8
        passed: true
        comment: Custom 6-stop gradient palette, rounded cell corners, subtle borders,
          separate % layer, informative subtitle.
      - id: DE-02
        name: Visual Refinement
        score: 4
        max: 6
        passed: true
        comment: View stroke removed, axis domains/ticks hidden, generous padding,
          subtle cell borders.
      - id: DE-03
        name: Data Storytelling
        score: 4
        max: 6
        passed: true
        comment: Subtitle communicates key insight. Color gradient guides the eye.
          Triangular shape tells temporal story.
    spec_compliance:
      score: 15
      max: 15
      items:
      - id: SC-01
        name: Plot Type
        score: 5
        max: 5
        passed: true
        comment: Correct triangular heatmap.
      - id: SC-02
        name: Required Features
        score: 4
        max: 4
        passed: true
        comment: 'All features: triangular shape, retention % in cells, cohort sizes,
          colorbar, sequential colormap.'
      - id: SC-03
        name: Data Mapping
        score: 3
        max: 3
        passed: true
        comment: X=periods, Y=cohorts. All data correctly mapped.
      - id: SC-04
        name: Title & Legend
        score: 3
        max: 3
        passed: true
        comment: Title format correct. Color bar legend with Retention % label.
    data_quality:
      score: 15
      max: 15
      items:
      - id: DQ-01
        name: Feature Coverage
        score: 6
        max: 6
        passed: true
        comment: Realistic decay, period 0=100%, varying cohort sizes, different retention
          curves, full triangular structure.
      - id: DQ-02
        name: Realistic Context
        score: 5
        max: 5
        passed: true
        comment: Monthly SaaS user retention — real, neutral business scenario.
      - id: DQ-03
        name: Appropriate Scale
        score: 4
        max: 4
        passed: true
        comment: Retention decays 100% to ~19.5%. Cohort sizes 921-2438 plausible.
    code_quality:
      score: 10
      max: 10
      items:
      - id: CQ-01
        name: KISS Structure
        score: 3
        max: 3
        passed: true
        comment: 'Clean linear flow: imports, data, layers, combine, save.'
      - id: CQ-02
        name: Reproducibility
        score: 2
        max: 2
        passed: true
        comment: np.random.seed(42) set.
      - id: CQ-03
        name: Clean Imports
        score: 2
        max: 2
        passed: true
        comment: Only altair, numpy, pandas — all used.
      - id: CQ-04
        name: Code Elegance
        score: 2
        max: 2
        passed: true
        comment: Clean three-layer composition. Not over-engineered.
      - id: CQ-05
        name: Output & API
        score: 1
        max: 1
        passed: true
        comment: Saves plot.png and plot.html. Current Altair 6.0 API.
    library_mastery:
      score: 8
      max: 10
      items:
      - id: LM-01
        name: Idiomatic Usage
        score: 5
        max: 5
        passed: true
        comment: 'Expert Altair: mark_rect/mark_text, ordinal encoding with sort,
          alt.condition, alt.layer, alt.Title, configure_view/axis.'
      - id: LM-02
        name: Distinctive Features
        score: 3
        max: 5
        passed: true
        comment: alt.condition for dynamic text color, tooltip interactivity, HTML
          export, declarative layer composition.
  verdict: APPROVED
impl_tags:
  dependencies: []
  techniques:
  - annotations
  - layer-composition
  - hover-tooltips
  - html-export
  patterns:
  - data-generation
  - iteration-over-groups
  dataprep: []
  styling:
  - custom-colormap
  - edge-highlighting
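
Both weaknesses listed in the review come down to scale arithmetic: with `scale_factor=3.0`, a chart declared 1400 wide exports at 4200 px, and a `fontSize=10` label renders at 30 px. A minimal sketch of that calculation, assuming the exported size is simply the declared size times the scale factor (axes, legend and padding are ignored); `exported_px` is a hypothetical helper:

```python
def exported_px(declared: float, scale_factor: float = 3.0) -> int:
    """Approximate exported pixel size for a declared chart dimension or font size."""
    return round(declared * scale_factor)


print(exported_px(1400))  # current chart width
print(exported_px(1600))  # width the review calls ideal
print(exported_px(10))    # font size of the "%" suffix layer
```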
