Commit 84ef45b

feat(matplotlib): implement sequence-logo-basic (#4611)
## Implementation: `sequence-logo-basic` - matplotlib

Implements the **matplotlib** version of `sequence-logo-basic`.

**File:** `plots/sequence-logo-basic/implementations/matplotlib.py`

**Parent Issue:** #4421

---

:robot: *[impl-generate workflow](https://github.com/MarkusNeusinger/pyplots/actions/runs/22780524847)*

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
1 parent baab181 commit 84ef45b

2 files changed

Lines changed: 350 additions & 0 deletions

Lines changed: 126 additions & 0 deletions
@@ -0,0 +1,126 @@
""" pyplots.ai
sequence-logo-basic: Sequence Logo for Motif Visualization
Library: matplotlib 3.10.8 | Python 3.14.3
Quality: 92/100 | Created: 2026-03-06
"""

import matplotlib.pyplot as plt
import matplotlib.transforms as transforms
import numpy as np
from matplotlib.font_manager import FontProperties
from matplotlib.lines import Line2D
from matplotlib.patches import FancyBboxPatch, PathPatch
from matplotlib.textpath import TextPath


# Data — a 10-position DNA transcription factor binding site motif (ETS-family-like)
position_freqs = [
    {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
    {"A": 0.10, "C": 0.60, "G": 0.10, "T": 0.20},
    {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05},
    {"A": 0.90, "C": 0.02, "G": 0.03, "T": 0.05},
    {"A": 0.02, "C": 0.02, "G": 0.94, "T": 0.02},
    {"A": 0.02, "C": 0.02, "G": 0.02, "T": 0.94},
    {"A": 0.15, "C": 0.35, "G": 0.15, "T": 0.35},
    {"A": 0.30, "C": 0.20, "G": 0.30, "T": 0.20},
    {"A": 0.05, "C": 0.05, "G": 0.05, "T": 0.85},
    {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
]

# Colorblind-safe palette: replaced green/red with teal/purple for A/T
dna_colors = {"A": "#1b7837", "C": "#1f77b4", "G": "#ff7f0e", "T": "#9467bd"}
letters = ["A", "C", "G", "T"]
n_positions = len(position_freqs)
max_bits = 2.0

# Compute information content per position
info_contents = []
for freqs in position_freqs:
    entropy = sum(-f * np.log2(f) for f in freqs.values() if f > 0)
    info_contents.append(max_bits - entropy)

# Plot
fig, ax = plt.subplots(figsize=(16, 9))
fp = FontProperties(family="DejaVu Sans", weight="bold")
bar_width = 0.9

# Highlight conserved core region (positions 3-6) with background shading
core_start, core_end = 3, 6
highlight = FancyBboxPatch(
    (core_start - 0.48, -0.02),
    core_end - core_start + 0.96,
    max_bits + 0.04,
    boxstyle="round,pad=0.02",
    facecolor="#f0e68c",
    edgecolor="#c4a000",
    alpha=0.25,
    linewidth=1.5,
    zorder=0,
)
ax.add_patch(highlight)

for pos_idx, freqs in enumerate(position_freqs):
    ic = info_contents[pos_idx]
    letter_heights = {lt: freqs[lt] * ic for lt in letters}
    sorted_letters = sorted(letters, key=lambda lt: letter_heights[lt])

    y_offset = 0.0
    x_start = pos_idx + 1 - bar_width / 2
    for letter in sorted_letters:
        h = letter_heights[letter]
        if h < 0.01:
            continue
        tp = TextPath((0, 0), letter, size=1, prop=fp)
        bbox = tp.get_extents()
        if bbox.width == 0 or bbox.height == 0:
            continue
        sx = bar_width / bbox.width
        sy = h / bbox.height
        t = transforms.Affine2D().translate(-bbox.x0, -bbox.y0).scale(sx, sy).translate(x_start, y_offset)
        patch = PathPatch(tp.transformed(t), facecolor=dna_colors[letter], edgecolor="none", linewidth=0, zorder=2)
        ax.add_patch(patch)
        y_offset += h

# Annotate conserved core motif
ax.annotate(
    "Conserved core",
    xy=((core_start + core_end) / 2, max_bits * 0.88),
    fontsize=14,
    fontweight="medium",
    color="#6b5900",
    ha="center",
    va="center",
    zorder=3,
)

# Color legend for nucleotides using matplotlib legend API

legend_handles = [
    Line2D([0], [0], marker="s", color="w", markerfacecolor=dna_colors[lt], markersize=12, label=lt, linewidth=0)
    for lt in letters
]
ax.legend(
    handles=legend_handles,
    loc="upper right",
    fontsize=14,
    framealpha=0.8,
    edgecolor="#cccccc",
    handletextpad=0.4,
    labelspacing=0.3,
)

# Style
ax.set_xlim(0.5, n_positions + 0.5)
ax.set_ylim(0, max_bits)
ax.set_xticks(range(1, n_positions + 1))
ax.set_xticklabels(range(1, n_positions + 1), fontsize=16)
ax.set_xlabel("Position", fontsize=20)
ax.set_ylabel("Information content (bits)", fontsize=20)
ax.set_title("sequence-logo-basic \u00b7 matplotlib \u00b7 pyplots.ai", fontsize=24, fontweight="medium")
ax.tick_params(axis="both", labelsize=16)
ax.spines["top"].set_visible(False)
ax.spines["right"].set_visible(False)
ax.yaxis.grid(True, alpha=0.2, linewidth=0.8, zorder=0)

plt.tight_layout()
plt.savefig("plot.png", dpi=300, bbox_inches="tight")
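The layout arithmetic in `matplotlib.py` can be checked without rendering anything. The following is a minimal sketch, not part of the commit: it recomputes the information content for one position of the motif and confirms that the translate, scale, translate chain maps a glyph's bounding box onto its slot in the stack. The helper names `information_content` and `place_glyph` and the bounding-box numbers are hypothetical, chosen only for illustration.

```python
import math

MAX_BITS = 2.0  # log2(4) for a four-letter DNA alphabet


def information_content(freqs):
    """Return MAX_BITS minus the Shannon entropy (in bits) of one position."""
    entropy = -sum(f * math.log2(f) for f in freqs.values() if f > 0)
    return MAX_BITS - entropy


def place_glyph(bbox, x_start, y_offset, width, height):
    """Map a glyph bbox (x0, y0, x1, y1) onto its slot in the stack.

    Mirrors Affine2D().translate(-x0, -y0).scale(sx, sy).translate(x_start, y_offset),
    where the chained operations apply in the order they are written.
    """
    x0, y0, x1, y1 = bbox
    sx = width / (x1 - x0)
    sy = height / (y1 - y0)

    def transform(x, y):
        return ((x - x0) * sx + x_start, (y - y0) * sy + y_offset)

    return transform(x0, y0), transform(x1, y1)


# Position 5 of the motif in the commit: G dominates.
pos5 = {"A": 0.02, "C": 0.02, "G": 0.94, "T": 0.02}
ic = information_content(pos5)
heights = {letter: f * ic for letter, f in pos5.items()}

# Hypothetical glyph bounding box; real values come from TextPath.get_extents().
glyph_bbox = (0.1, -0.01, 0.8, 0.74)
x_start = 4 + 1 - 0.9 / 2  # pos_idx + 1 - bar_width / 2 for pos_idx = 4
lower_left, upper_right = place_glyph(glyph_bbox, x_start, 0.0, 0.9, heights["G"])

print(round(ic, 3), round(heights["G"], 3))
print(lower_left, upper_right)
```

Whatever the glyph's native extents, its lower-left corner lands at `(x_start, y_offset)` and its upper-right corner at `(x_start + bar_width, y_offset + h)`, which is why the letters fill each column exactly and stack without gaps.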
Lines changed: 224 additions & 0 deletions
@@ -0,0 +1,224 @@
library: matplotlib
specification_id: sequence-logo-basic
created: '2026-03-06T20:25:27Z'
updated: '2026-03-06T20:36:39Z'
generated_by: claude-opus-4-5-20251101
workflow_run: 22780524847
issue: 4421
python_version: 3.14.3
library_version: 3.10.8
preview_url: https://storage.googleapis.com/pyplots-images/plots/sequence-logo-basic/matplotlib/plot.png
preview_thumb: https://storage.googleapis.com/pyplots-images/plots/sequence-logo-basic/matplotlib/plot_thumb.png
preview_html: null
quality_score: 92
review:
  strengths:
  - Excellent glyph rendering using TextPath + Affine2D transforms — the proper matplotlib
    technique for sequence logos
  - Conserved core highlight with FancyBboxPatch adds meaningful data storytelling
  - Biologically realistic ETS-family binding site data with clear conservation patterns
  - 'Full spec compliance: all required features implemented correctly'
  - Clean, well-structured code with all font sizes explicitly set
  weaknesses:
  - Color comment claims teal for A but uses standard green (#1b7837) — minor inconsistency
  - Very small letters at low-information positions are hard to distinguish individually
  image_description: 'The plot displays a DNA sequence logo for a 10-position transcription
    factor binding site motif. Positions 1-10 are shown on the x-axis, with information
    content (bits) on the y-axis ranging from 0 to 2.0. At each position, nucleotide
    letters (A, C, G, T) are stacked vertically as scaled glyphs, with height proportional
    to their contribution to information content. Colors are: A=green, C=blue, G=orange,
    T=purple. Positions 3-6 form a highly conserved core (highlighted with a pale
    yellow rounded rectangle and labeled "Conserved core"), showing dominant G, A,
    G, T letters reaching ~1.4-1.6 bits. Positions 1, 8, and 10 show near-zero information
    (uniform distribution). Position 2 has moderate conservation (dominant C). Position
    9 shows a strong T. A color legend in the upper right identifies all four nucleotides.
    Top and right spines are removed, with a subtle y-axis grid. The title reads "sequence-logo-basic
    · matplotlib · pyplots.ai".'
criteria_checklist:
38+
visual_quality:
39+
score: 28
40+
max: 30
41+
items:
42+
- id: VQ-01
43+
name: Text Legibility
44+
score: 8
45+
max: 8
46+
passed: true
47+
comment: 'All font sizes explicitly set: title 24pt, axis labels 20pt, ticks
48+
16pt, legend 14pt'
49+
- id: VQ-02
50+
name: No Overlap
51+
score: 6
52+
max: 6
53+
passed: true
54+
comment: No text collisions anywhere
55+
- id: VQ-03
56+
name: Element Visibility
57+
score: 5
58+
max: 6
59+
passed: true
60+
comment: Letter glyphs well-sized at conserved positions; very small at low-info
61+
positions but correct behavior
62+
- id: VQ-04
63+
name: Color Accessibility
64+
score: 3
65+
max: 4
66+
passed: true
67+
comment: Green/blue/orange/purple avoids red-green conflict; comment claims
68+
teal but uses standard green
69+
- id: VQ-05
70+
name: Layout & Canvas
71+
score: 4
72+
max: 4
73+
passed: true
74+
comment: Good 16:9 proportions, tight_layout, plot fills canvas well
75+
- id: VQ-06
76+
name: Axis Labels & Title
77+
score: 2
78+
max: 2
79+
passed: true
80+
comment: 'Descriptive labels with units: Position, Information content (bits)'
81+
design_excellence:
82+
score: 16
83+
max: 20
84+
items:
85+
- id: DE-01
86+
name: Aesthetic Sophistication
87+
score: 6
88+
max: 8
89+
passed: true
90+
comment: Custom nucleotide palette, highlighted conserved core, refined legend,
91+
clean typography hierarchy
92+
- id: DE-02
93+
name: Visual Refinement
94+
score: 5
95+
max: 6
96+
passed: true
97+
comment: Spines removed, subtle y-axis grid, rounded highlight box, generous
98+
whitespace
99+
- id: DE-03
100+
name: Data Storytelling
101+
score: 5
102+
max: 6
103+
passed: true
104+
comment: Yellow highlight on conserved core with annotation creates clear
105+
focal point and biological story
106+
spec_compliance:
107+
score: 15
108+
max: 15
109+
items:
110+
- id: SC-01
111+
name: Plot Type
112+
score: 5
113+
max: 5
114+
passed: true
115+
comment: Correct sequence logo with stacked letters scaled by information
116+
content
117+
- id: SC-02
118+
name: Required Features
119+
score: 4
120+
max: 4
121+
passed: true
122+
comment: 'All spec features: vertical stacking, frequency ordering, info content
123+
scaling, standard colors, scaled glyphs'
124+
- id: SC-03
125+
name: Data Mapping
126+
score: 3
127+
max: 3
128+
passed: true
129+
comment: X=position (1-10), Y=information content (0-2 bits)
130+
- id: SC-04
131+
name: Title & Legend
132+
score: 3
133+
max: 3
134+
passed: true
135+
comment: Correct title format and nucleotide color legend
136+
data_quality:
137+
score: 15
138+
max: 15
139+
items:
140+
- id: DQ-01
141+
name: Feature Coverage
142+
score: 6
143+
max: 6
144+
passed: true
145+
comment: 'Full range: uniform, moderate, and highly conserved positions'
146+
- id: DQ-02
147+
name: Realistic Context
148+
score: 5
149+
max: 5
150+
passed: true
151+
comment: ETS-family transcription factor binding site motif with plausible
152+
GAGT core
153+
- id: DQ-03
154+
name: Appropriate Scale
155+
score: 4
156+
max: 4
157+
passed: true
158+
comment: Frequencies sum to 1, information content correct for DNA (max 2
159+
bits)
160+
code_quality:
161+
score: 10
162+
max: 10
163+
items:
164+
- id: CQ-01
165+
name: KISS Structure
166+
score: 3
167+
max: 3
168+
passed: true
169+
comment: 'Linear flow: imports, data, compute, plot, style, save'
170+
- id: CQ-02
171+
name: Reproducibility
172+
score: 2
173+
max: 2
174+
passed: true
175+
comment: Fully deterministic with hardcoded frequency data
176+
- id: CQ-03
177+
name: Clean Imports
178+
score: 2
179+
max: 2
180+
passed: true
181+
comment: All imports used
182+
- id: CQ-04
183+
name: Code Elegance
184+
score: 2
185+
max: 2
186+
passed: true
187+
comment: Clean TextPath + Affine2D + PathPatch approach
188+
- id: CQ-05
189+
name: Output & API
190+
score: 1
191+
max: 1
192+
passed: true
193+
comment: Saves as plot.png with dpi=300, current API
194+
library_mastery:
195+
score: 8
196+
max: 10
197+
items:
198+
- id: LM-01
199+
name: Idiomatic Usage
200+
score: 4
201+
max: 5
202+
passed: true
203+
comment: Ax-based API, TextPath with Affine2D transforms for glyph rendering
204+
- id: LM-02
205+
name: Distinctive Features
206+
score: 4
207+
max: 5
208+
passed: true
209+
comment: TextPath, Affine2D, FancyBboxPatch, PathPatch, custom Line2D legend
210+
handles
211+
verdict: APPROVED
212+
impl_tags:
213+
dependencies: []
214+
techniques:
215+
- annotations
216+
- custom-legend
217+
- patches
218+
- manual-ticks
219+
patterns:
220+
- iteration-over-groups
221+
dataprep: []
222+
styling:
223+
- grid-styling
224+
- alpha-blending
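
The category scores in the review metadata add up to the reported `quality_score`. As a quick consistency check, here is a small sketch with the numbers copied by hand from the `criteria_checklist` entries rather than parsed from the file; the variable names are illustrative only.

```python
# (score, max) per category, copied from the criteria_checklist in the metadata.
category_scores = {
    "visual_quality": (28, 30),
    "design_excellence": (16, 20),
    "spec_compliance": (15, 15),
    "data_quality": (15, 15),
    "code_quality": (10, 10),
    "library_mastery": (8, 10),
}

total = sum(score for score, _ in category_scores.values())
maximum = sum(limit for _, limit in category_scores.values())

print(f"{total}/{maximum}")  # prints 92/100
```

The eight points short of the maximum come from visual quality (2), design excellence (4), and library mastery (2).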
