
Commit df465f4

feat(plotnine): implement volcano-basic (#2943)
## Implementation: `volcano-basic` - plotnine

Implements the **plotnine** version of `volcano-basic`.

**File:** `plots/volcano-basic/implementations/plotnine.py`

**Parent Issue:** #2924

---

:robot: *[impl-generate workflow](https://github.com/MarkusNeusinger/pyplots/actions/runs/20612783423)*

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
1 parent f8483fd commit df465f4

2 files changed

Lines changed: 132 additions & 0 deletions


Lines changed: 105 additions & 0 deletions
@@ -0,0 +1,105 @@
""" pyplots.ai
volcano-basic: Volcano Plot for Statistical Significance
Library: plotnine 0.15.2 | Python 3.13.11
Quality: 91/100 | Created: 2025-12-31
"""

import numpy as np
import pandas as pd
from plotnine import (
    aes,
    element_text,
    geom_hline,
    geom_point,
    geom_text,
    geom_vline,
    ggplot,
    labs,
    scale_color_manual,
    theme,
    theme_minimal,
)


# Data - simulated differential gene expression results
np.random.seed(42)
n_genes = 500

# Generate log2 fold changes (centered around 0 with some outliers)
log2_fold_change = np.concatenate(
    [
        np.random.normal(0, 0.8, 400),  # Most genes have small changes
        np.random.normal(-2.5, 0.5, 50),  # Down-regulated genes
        np.random.normal(2.5, 0.5, 50),  # Up-regulated genes
    ]
)

# Generate p-values with a realistic range (avoiding extreme values)
pvalues = np.concatenate(
    [
        np.random.uniform(0.05, 1.0, 400),  # Most genes not significant
        np.random.uniform(0.0001, 0.01, 50),  # Down-regulated significant
        np.random.uniform(0.0001, 0.01, 50),  # Up-regulated significant
    ]
)

neg_log10_pvalue = -np.log10(pvalues)

# Create gene labels
gene_labels = [f"Gene_{i + 1}" for i in range(n_genes)]

# Determine significance status based on thresholds
# Significant: p-value < 0.05 (neg_log10 > 1.3) AND |log2FC| > 1
significance_threshold = -np.log10(0.05)  # ~1.3
fold_change_threshold = 1.0

status = []
for fc, nlp in zip(log2_fold_change, neg_log10_pvalue, strict=True):
    if nlp > significance_threshold and fc > fold_change_threshold:
        status.append("Up-regulated")
    elif nlp > significance_threshold and fc < -fold_change_threshold:
        status.append("Down-regulated")
    else:
        status.append("Not significant")

# Create DataFrame
df = pd.DataFrame(
    {
        "log2_fold_change": log2_fold_change,
        "neg_log10_pvalue": neg_log10_pvalue,
        "label": gene_labels,
        "status": pd.Categorical(status, categories=["Down-regulated", "Not significant", "Up-regulated"]),
    }
)

# Identify top genes to label (top 3 by significance in each direction to avoid overlap)
df_up = df[df["status"] == "Up-regulated"].nlargest(3, "neg_log10_pvalue")
df_down = df[df["status"] == "Down-regulated"].nlargest(3, "neg_log10_pvalue")
df_labels = pd.concat([df_up, df_down])

# Create volcano plot
plot = (
    ggplot(df, aes(x="log2_fold_change", y="neg_log10_pvalue", color="status"))
    + geom_point(size=3, alpha=0.7)
    + geom_hline(yintercept=significance_threshold, linetype="dashed", color="#333333", size=0.8)
    + geom_vline(xintercept=-fold_change_threshold, linetype="dashed", color="#333333", size=0.8)
    + geom_vline(xintercept=fold_change_threshold, linetype="dashed", color="#333333", size=0.8)
    + geom_text(data=df_labels, mapping=aes(label="label"), size=10, nudge_y=0.3, color="#333333")
    + scale_color_manual(values={"Down-regulated": "#306998", "Not significant": "#888888", "Up-regulated": "#D62728"})
    + labs(
        x="Log2 Fold Change", y="-Log10(p-value)", title="volcano-basic · plotnine · pyplots.ai", color="Significance"
    )
    + theme_minimal()
    + theme(
        figure_size=(16, 9),
        text=element_text(size=14),
        axis_title=element_text(size=20),
        axis_text=element_text(size=16),
        plot_title=element_text(size=24),
        legend_text=element_text(size=16),
        legend_title=element_text(size=18),
    )
)

# Save plot
plot.save("plot.png", dpi=300)
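
The status loop in the script could also be written without an explicit Python loop. The following is a minimal sketch, assuming NumPy's `np.select`; the helper name `classify_genes` and its parameters are hypothetical and not part of the committed file. It applies the same rule as the loop (p-value below the cutoff and absolute log2 fold change above the cutoff).

```python
import numpy as np


def classify_genes(log2_fc, pvalues, p_cutoff=0.05, fc_cutoff=1.0):
    """Label each gene with the same rule as the loop in the committed script.

    Hypothetical helper: the name and signature are illustrative only.
    """
    log2_fc = np.asarray(log2_fc, dtype=float)
    neg_log10_p = -np.log10(np.asarray(pvalues, dtype=float))
    significant = neg_log10_p > -np.log10(p_cutoff)

    # np.select evaluates the conditions in order and uses the first match per element.
    return np.select(
        [significant & (log2_fc > fc_cutoff), significant & (log2_fc < -fc_cutoff)],
        ["Up-regulated", "Down-regulated"],
        default="Not significant",
    )


# Made-up inputs, chosen only to hit each branch of the rule.
labels = classify_genes([2.4, -3.1, 0.2, 2.0], [0.001, 0.004, 0.0005, 0.5])
```

The returned array could be passed to `pd.Categorical(...)` in place of the `status` list; whether that is preferable to the loop at 500 genes is a readability choice rather than a performance one.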
Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
library: plotnine
specification_id: volcano-basic
created: '2025-12-31T05:32:18Z'
updated: '2025-12-31T05:40:21Z'
generated_by: claude-opus-4-5-20251101
workflow_run: 20612783423
issue: 2924
python_version: 3.13.11
library_version: 0.15.2
preview_url: https://storage.googleapis.com/pyplots-images/plots/volcano-basic/plotnine/plot.png
preview_thumb: https://storage.googleapis.com/pyplots-images/plots/volcano-basic/plotnine/plot_thumb.png
preview_html: null
quality_score: 91
review:
  strengths:
  - Excellent implementation of the volcano plot specification with all required features
    (threshold lines, color coding, gene labels)
  - Clean grammar of graphics approach using plotnine ggplot2-style syntax
  - Good data simulation that realistically represents differential gene expression
    results
  - Text sizes properly configured for 4800x2700 output
  - Proper use of categorical coloring with meaningful significance categories
  weaknesses:
  - Minor label overlap between Gene_469 and Gene_478 on the right side of the plot
  - Could use colorblind-safe palette instead of red-blue (consider using orange or
    different saturation)
  - Grid lines not visible (theme_minimal removes them) which could aid reading values
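
The first two weaknesses in the review could be addressed along the following lines. This is a sketch, not the reviewed implementation: `OKABE_ITO_STATUS_COLORS` and `pick_spaced_labels` are hypothetical names, the blue and vermillion hex codes are taken from the Okabe-Ito colorblind-safe palette, and the spacing thresholds are arbitrary assumptions that would need tuning against the real axis ranges.

```python
# Blue and vermillion from the Okabe-Ito palette, plus a neutral gray (assumed choice).
OKABE_ITO_STATUS_COLORS = {
    "Down-regulated": "#0072B2",
    "Not significant": "#999999",
    "Up-regulated": "#D55E00",
}


def pick_spaced_labels(candidates, max_labels=3, min_dx=0.3, min_dy=0.2):
    """Greedily keep the most significant points whose labels would not collide.

    candidates: iterable of (label, x, y) tuples, where y is -log10(p-value).
    A candidate is skipped when it lies within both min_dx and min_dy of a kept point.
    """
    kept = []
    for label, x, y in sorted(candidates, key=lambda c: c[2], reverse=True):
        too_close = any(abs(x - kx) < min_dx and abs(y - ky) < min_dy for _, kx, ky in kept)
        if not too_close:
            kept.append((label, x, y))
        if len(kept) == max_labels:
            break
    return kept


# Made-up coordinates: "B" sits almost on top of "A" and is therefore skipped.
chosen = pick_spaced_labels(
    [("A", 2.5, 3.9), ("B", 2.6, 3.85), ("C", 3.1, 3.7), ("D", 1.8, 3.6)]
)
```

The dictionary uses the same keys the script passes to `scale_color_manual(values=...)`, and the kept tuples could be used to build the `df_labels` frame consumed by `geom_text`, replacing the plain `nlargest(3, ...)` selection.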
