Commit 46fdc1d

feat(plotnine): implement scatter-matrix-interactive (#3609)
## Implementation: `scatter-matrix-interactive` - plotnine

Implements the **plotnine** version of `scatter-matrix-interactive`.

**File:** `plots/scatter-matrix-interactive/implementations/plotnine.py`
**Parent Issue:** #3604

---

:robot: *[impl-generate workflow](https://github.com/MarkusNeusinger/pyplots/actions/runs/20870868420)*

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
1 parent 7cfedd4 commit 46fdc1d

2 files changed

Lines changed: 333 additions & 0 deletions

Lines changed: 121 additions & 0 deletions
@@ -0,0 +1,121 @@
""" pyplots.ai
scatter-matrix-interactive: Interactive Scatter Plot Matrix (SPLOM)
Library: plotnine 0.15.2 | Python 3.13.11
Quality: 72/100 | Created: 2026-01-10
"""

import numpy as np
import pandas as pd
from plotnine import (
    aes,
    element_line,
    element_rect,
    element_text,
    facet_grid,
    geom_point,
    geom_ribbon,
    ggplot,
    labs,
    scale_color_manual,
    scale_fill_manual,
    theme,
    theme_minimal,
)
from sklearn.datasets import load_iris


# Data: Iris dataset for multivariate analysis
np.random.seed(42)
iris = load_iris()
df = pd.DataFrame(iris.data, columns=["Sepal Length (cm)", "Sepal Width (cm)", "Petal Length (cm)", "Petal Width (cm)"])
df["Species"] = pd.Categorical([iris.target_names[i] for i in iris.target])

# Variables for the matrix (already include units)
variables = ["Sepal Length (cm)", "Sepal Width (cm)", "Petal Length (cm)", "Petal Width (cm)"]

# Colorblind-safe palette (Dark2 inspired - teal, orange, purple)
colors = ["#1B9E77", "#D95F02", "#7570B3"]

# Create long-form data for scatter matrix
scatter_data = []
density_data = []

for i, var_y in enumerate(variables):
    for j, var_x in enumerate(variables):
        if i == j:
            # Diagonal: Create normalized density data that fits within the variable's range
            var_min, var_max = df[var_x].min(), df[var_x].max()
            var_range = var_max - var_min
            # Baseline is the variable's minimum, so ribbons rise from the low end of its range
            baseline = var_min

            for species in df["Species"].unique():
                species_vals = df[df["Species"] == species][var_x].values
                # Simple histogram-based density
                hist, edges = np.histogram(species_vals, bins=20, range=(var_min, var_max), density=True)
                # Normalize to fit within the y-axis range (scale to var_range * 0.5)
                max_density = hist.max() if hist.max() > 0 else 1
                hist_scaled = hist / max_density * var_range * 0.5 + baseline
                bin_centers = (edges[:-1] + edges[1:]) / 2

                for k in range(len(bin_centers)):
                    density_data.append(
                        {
                            "x": bin_centers[k],
                            "ymin": baseline,
                            "ymax": hist_scaled[k],
                            "Species": species,
                            "var_x": var_x,
                            "var_y": var_y,
                        }
                    )
        else:
            # Off-diagonal: scatter data
            for _, row in df.iterrows():
                scatter_data.append(
                    {"x": row[var_x], "y": row[var_y], "Species": row["Species"], "var_x": var_x, "var_y": var_y}
                )

scatter_df = pd.DataFrame(scatter_data)
density_df = pd.DataFrame(density_data)

# Set factor levels for proper ordering
scatter_df["var_x"] = pd.Categorical(scatter_df["var_x"], categories=variables, ordered=True)
scatter_df["var_y"] = pd.Categorical(scatter_df["var_y"], categories=variables[::-1], ordered=True)
density_df["var_x"] = pd.Categorical(density_df["var_x"], categories=variables, ordered=True)
density_df["var_y"] = pd.Categorical(density_df["var_y"], categories=variables[::-1], ordered=True)

# Sort density data for proper ribbon rendering
density_df = density_df.sort_values(["var_x", "var_y", "Species", "x"])

# Create scatter plot matrix with density ribbons on diagonal
plot = (
    ggplot(mapping=aes(x="x"))
    + geom_point(data=scatter_df, mapping=aes(y="y", color="Species"), size=3.5, alpha=0.7)
    + geom_ribbon(data=density_df, mapping=aes(ymin="ymin", ymax="ymax", fill="Species"), alpha=0.5)
    + facet_grid("var_y ~ var_x", scales="free")
    + scale_color_manual(values=colors)
    + scale_fill_manual(values=colors)
    + labs(title="scatter-matrix-interactive · plotnine · pyplots.ai", x="", y="")
    + theme_minimal()
    + theme(
        figure_size=(16, 16),
        plot_title=element_text(size=24, weight="bold", ha="left"),
        strip_text_x=element_text(size=14),
        strip_text_y=element_text(size=14, angle=0),
        axis_text=element_text(size=11),
        axis_title_x=element_text(size=16),
        axis_title_y=element_text(size=16),
        legend_title=element_text(size=16),
        legend_text=element_text(size=14),
        legend_position="bottom",
        legend_background=element_rect(fill="white", alpha=0.9),
        panel_spacing=0.03,
        panel_grid_major=element_line(color="#cccccc", alpha=0.3),
        panel_grid_minor=element_line(color="#eeeeee", alpha=0.2),
        panel_background=element_rect(fill="white"),
    )
)

# Save plot
plot.save("plot.png", dpi=300, width=16, height=16, verbose=False)
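
The off-diagonal rows in the script above are collected one record at a time with `iterrows()`. As a hedged alternative that is not part of this commit, the same long-form table can be sketched with `melt` and a self-merge. The helper name `pairwise_long` and the two-row toy frame are hypothetical and exist only for illustration; row order differs from the loop version, which does not matter for faceting.

```python
import pandas as pd


def pairwise_long(df, variables, group_col):
    """Return one row per (observation, var_x, var_y) with var_x != var_y."""
    # Tag each observation with an id so two melted copies can be joined back together.
    indexed = df.reset_index(drop=True).rename_axis("obs").reset_index()
    long = indexed.melt(
        id_vars=["obs", group_col],
        value_vars=variables,
        var_name="var",
        value_name="value",
    )
    # Self-merge on the observation id yields every ordered pair of variables.
    pairs = long.merge(long, on=["obs", group_col], suffixes=("_x", "_y"))
    # Drop the diagonal, which the script fills with density ribbons instead.
    pairs = pairs[pairs["var_x"] != pairs["var_y"]]
    pairs = pairs.rename(columns={"value_x": "x", "value_y": "y"})
    return pairs[["x", "y", group_col, "var_x", "var_y"]].reset_index(drop=True)


# Toy data (made up) standing in for the Iris frame.
demo = pd.DataFrame(
    {"a": [1.0, 2.0], "b": [3.0, 4.0], "c": [5.0, 6.0], "Species": ["s", "t"]}
)
pairs = pairwise_long(demo, ["a", "b", "c"], "Species")
print(pairs)
```

Called as `pairwise_long(df, variables, "Species")`, this should give a frame with the same columns as `scatter_df`, ready for the `pd.Categorical` ordering step.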
Lines changed: 212 additions & 0 deletions
@@ -0,0 +1,212 @@
library: plotnine
specification_id: scatter-matrix-interactive
created: '2026-01-10T01:57:41Z'
updated: '2026-01-10T02:24:36Z'
generated_by: claude-opus-4-5-20251101
workflow_run: 20870868420
issue: 3604
python_version: 3.13.11
library_version: 0.15.2
preview_url: https://storage.googleapis.com/pyplots-images/plots/scatter-matrix-interactive/plotnine/plot.png
preview_thumb: https://storage.googleapis.com/pyplots-images/plots/scatter-matrix-interactive/plotnine/plot_thumb.png
preview_html: null
quality_score: 72
review:
  strengths:
  - Excellent use of plotnine grammar of graphics with facet_grid for matrix layout
  - Creative solution using geom_ribbon for histogram-based density distributions
    on diagonal
  - Colorblind-safe palette (Dark2-inspired teal, orange, purple)
  - Clean data transformation from wide to long format for faceting
  - Good visual separation of the three Iris species clusters
  - Proper title format following specification
  weaknesses:
  - Missing interactive features (brushing, linked selection, zoom/pan) - plotnine
    is a static library
  - Axis labels are empty strings - variable names only shown in strip text
  - Y-axis scale of the diagonal density plots does not match each variable's natural
    scale
  image_description: 'The plot displays a 4×4 scatter plot matrix (SPLOM) using the
    Iris dataset with four variables: Sepal Length (cm), Sepal Width (cm), Petal Length
    (cm), and Petal Width (cm). The matrix uses a colorblind-safe Dark2-inspired palette
    with teal (#1B9E77) for setosa, orange (#D95F02) for versicolor, and purple (#7570B3)
    for virginica. Off-diagonal cells show scatter plots with points colored by species
    at alpha=0.7. Diagonal cells display histogram-based density distributions as
    filled ribbon areas with alpha=0.5, showing the univariate distribution of each
    variable per species. The title "scatter-matrix-interactive · plotnine · pyplots.ai"
    appears at top-left in bold. Strip labels on top show variable names for x-axis,
    and strip labels on right show variable names for y-axis. A legend at the bottom
    indicates species color mapping. The grid is subtle with light gray lines. The
    layout is clean with minimal spacing between panels.'
  criteria_checklist:
    visual_quality:
      score: 32
      max: 40
      items:
      - id: VQ-01
        name: Text Legibility
        score: 8
        max: 10
        passed: true
        comment: Title and labels readable, strip text slightly small but acceptable
      - id: VQ-02
        name: No Overlap
        score: 8
        max: 8
        passed: true
        comment: No overlapping text elements
      - id: VQ-03
        name: Element Visibility
        score: 6
        max: 8
        passed: true
        comment: Points visible with good alpha, density ribbons clear but could use
          more contrast
      - id: VQ-04
        name: Color Accessibility
        score: 5
        max: 5
        passed: true
        comment: Colorblind-safe palette (Dark2 teal/orange/purple)
      - id: VQ-05
        name: Layout Balance
        score: 3
        max: 5
        passed: true
        comment: Good 4x4 grid but some wasted space around edges
      - id: VQ-06
        name: Axis Labels
        score: 0
        max: 2
        passed: false
        comment: Axis labels are empty (shown in strip text instead)
      - id: VQ-07
        name: Grid & Legend
        score: 2
        max: 2
        passed: true
        comment: Subtle grid, legend well-placed at bottom
    spec_compliance:
      score: 17
      max: 25
      items:
      - id: SC-01
        name: Plot Type
        score: 8
        max: 8
        passed: true
        comment: Correct scatter matrix with density diagonals
      - id: SC-02
        name: Data Mapping
        score: 5
        max: 5
        passed: true
        comment: All 4 variables correctly mapped to pairwise scatter
      - id: SC-03
        name: Required Features
        score: 0
        max: 5
        passed: false
        comment: 'Missing interactive features: no brushing/linked selection, no zoom/pan
          (plotnine is static - spec notes limitations)'
      - id: SC-04
        name: Data Range
        score: 3
        max: 3
        passed: true
        comment: All data visible within axes
      - id: SC-05
        name: Legend Accuracy
        score: 1
        max: 2
        passed: false
        comment: Species legend correct but no interactive indicator
      - id: SC-06
        name: Title Format
        score: 2
        max: 2
        passed: true
        comment: 'Correct format: scatter-matrix-interactive · plotnine · pyplots.ai'
    data_quality:
      score: 18
      max: 20
      items:
      - id: DQ-01
        name: Feature Coverage
        score: 7
        max: 8
        passed: true
        comment: Shows correlations, cluster separation visible, density distributions
          on diagonal
      - id: DQ-02
        name: Realistic Context
        score: 7
        max: 7
        passed: true
        comment: Classic Iris dataset - real botanical data
      - id: DQ-03
        name: Appropriate Scale
        score: 5
        max: 5
        passed: true
        comment: Real measurements in centimeters
    code_quality:
      score: 10
      max: 10
      items:
      - id: CQ-01
        name: KISS Structure
        score: 3
        max: 3
        passed: true
        comment: 'Linear flow: imports -> data -> plot -> save'
      - id: CQ-02
        name: Reproducibility
        score: 3
        max: 3
        passed: true
        comment: np.random.seed(42) set
      - id: CQ-03
        name: Clean Imports
        score: 2
        max: 2
        passed: true
        comment: All imports used
      - id: CQ-04
        name: No Deprecated API
        score: 1
        max: 1
        passed: true
        comment: Current plotnine syntax
      - id: CQ-05
        name: Output Correct
        score: 1
        max: 1
        passed: true
        comment: Saves as plot.png
    library_features:
      score: 5
      max: 5
      items:
      - id: LF-01
        name: Distinctive Features
        score: 5
        max: 5
        passed: true
        comment: 'Excellent use of ggplot2 grammar: facet_grid, geom_ribbon for density,
          aes mapping, scale_color_manual, theme customization'
  verdict: APPROVED
impl_tags:
  dependencies:
  - sklearn
  techniques:
  - faceting
  - layer-composition
  patterns:
  - dataset-loading
  - wide-to-long
  - iteration-over-groups
  dataprep:
  - binning
  styling:
  - alpha-blending
  - grid-styling
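
The checklist above records a score per criterion, a total per category, and an overall `quality_score`. The metadata does not say how those totals are derived, so the following is only a sketch that re-adds the recorded numbers under the assumption that each category total is a plain sum of its items. The dictionary is hand-copied from the YAML, and the names `recorded`, `mismatches`, and `category_sum` are illustrative.

```python
# Values copied from the review metadata: category -> (stated total, item scores).
recorded = {
    "visual_quality": (32, [8, 8, 6, 5, 3, 0, 2]),
    "spec_compliance": (17, [8, 5, 0, 3, 1, 2]),
    "data_quality": (18, [7, 7, 5]),
    "code_quality": (10, [3, 3, 2, 1, 1]),
    "library_features": (5, [5]),
}
quality_score = 72  # top-level quality_score in the metadata

# Categories whose stated total differs from the sum of their item scores.
mismatches = {
    name: (stated, sum(items))
    for name, (stated, items) in recorded.items()
    if stated != sum(items)
}

# Sum of the stated category totals, for comparison with quality_score.
category_sum = sum(stated for stated, _ in recorded.values())

print(mismatches)
print(category_sum, quality_score)
```

Under that plain-sum assumption, the items for `spec_compliance` and `data_quality` each add up to 19 against stated totals of 17 and 18, and the category totals add up to 82 against the recorded 72. The file gives no scoring formula, so the gap may reflect a cap or penalty (for example, for the failed required-features criterion) rather than an arithmetic slip.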
