feat(bokeh): implement ecdf-basic #9486

Merged
MarkusNeusinger merged 6 commits into main from implementation/ecdf-basic/bokeh on Jun 25, 2026

Conversation

@github-actions

Contributor

Implementation: ecdf-basic - python/bokeh

Implements the python/bokeh version of ecdf-basic.

File: plots/ecdf-basic/implementations/python/bokeh.py

Parent Issue: #976


🤖 impl-generate workflow

github-actions Bot added 2 commits June 25, 2026 09:36
Regen from quality 87. Addressed:
- Canvas fixed from 4800×2700 → 3200×1800 (canonical Bokeh landscape)
- Font sizes updated to canonical: 50pt title, 42pt axis labels, 34pt ticks
- Replaced export_png with Selenium + PIL crop pattern (matching sudoku-basic)
- Added sys.path fix to prevent bokeh.py self-shadowing installed package
- Added IQR shaded band (Q1–Q3 BoxAnnotation) for distribution spread focal point (DE-03)
- Added subtle area fill under ECDF step curve for aesthetic sophistication (DE-01)
- Staggered percentile label y-positions (0.03/0.09/0.15) to avoid overlap (VQ-02)
- Added dotted horizontal reference line at y=0.5 for effortless median reading (DE-02)
- Added CrosshairTool for richer Bokeh interactivity (LM-02)
- Fixed comment: "Okabe-Ito" → "Imprint palette"
- Added "python" language token to title
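
The self-shadowing issue in the commit notes arises because a script named bokeh.py puts its own directory first on sys.path when run directly, so `import bokeh` can resolve to the script itself. The PR does not show the actual fix; the following is a minimal sketch of one way to do it, assuming the fix filters the script's directory out of sys.path before the import. `strip_script_dir` is a hypothetical helper introduced only for illustration.

```python
import os


def strip_script_dir(paths, script_dir):
    """Return `paths` without entries that resolve to `script_dir`.

    An empty string on sys.path means the current working directory,
    so it is resolved against os.getcwd() before comparing.
    """
    target = os.path.abspath(script_dir)
    return [p for p in paths if os.path.abspath(p or os.getcwd()) != target]


# Illustrative paths only (hypothetical, not taken from the PR)
cleaned = strip_script_dir(["/opt/site-packages", "/repo/plots"], "/repo/plots")

# Hypothetical usage at the top of a script named bokeh.py:
#   sys.path[:] = strip_script_dir(sys.path, os.path.dirname(__file__))
#   import bokeh  # should now resolve to the installed package
```

Filtering the path rather than renaming the file keeps the repository's one-file-per-library naming intact, which is presumably why a path fix was preferred.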
@claude

claude Bot commented Jun 25, 2026

Contributor

AI Review - Attempt 1/3

Image Description

Light render (plot-light.png): Warm off-white (#FAF8F1) background, correct. Bold dark title "Marathon Finish Times · ecdf-basic · python · bokeh · anyplot.ai" fills ~70% of canvas width (expected for the mandated long title). Y-axis label "Cumulative Proportion of Runners" and X-axis label "Finish Time (minutes)" are set in dark, italic-styled 42pt type. Tick labels at 34pt in INK_SOFT. ECDF step line in brand green #009E73 at 4.5px width, rising from ~150 min to ~375 min. Light-green area fill (alpha=0.07) under curve. IQR box annotation (alpha=0.08) from Q1≈218 to Q3≈260. Staggered italic annotations "25th: 218 min", "50th (median): 242 min", "75th: 260 min" at y=0.03/0.09/0.15 in INK_SOFT. Horizontal dotted reference at y=0.5. Y-only subtle grid at alpha=0.10. All text clearly readable against the warm off-white background, with no light-on-light issues.

Dark render (plot-dark.png): Warm near-black (#1A1A17) background — correct. Title and axis labels in light #F0EFE8 ink; tick labels and annotations in #B8B7B0. All text clearly readable against the dark background. Data colors are identical to the light render — ECDF line remains #009E73, confirming only chrome flips between themes. IQR band and area fill create subtle dark-green tones against the near-black background, consistent with low alpha fills. No dark-on-dark failures detected.

Both paragraphs are included. Both renders are valid.

Score: 89/100

| Category | Score | Max |
|---|---|---|
| Visual Quality | 29 | 30 |
| Design Excellence | 13 | 20 |
| Spec Compliance | 15 | 15 |
| Data Quality | 15 | 15 |
| Code Quality | 10 | 10 |
| Library Mastery | 7 | 10 |
| Total | 89 | 100 |

Visual Quality (29/30)

  • VQ-01: Text Legibility (7/8) — All font sizes explicitly set (title 50pt, axis 42pt, ticks 34pt, annotations 26pt). Minor deduction: percentile annotations at 26pt are below tick label size (34pt) — marginal at ~400px mobile scale.
  • VQ-02: No Overlap (6/6) — Staggered annotations at y=0.03/0.09/0.15 cleanly separated. No text/data overlap.
  • VQ-03: Element Visibility (6/6) — ECDF step line at 4.5px is appropriately prominent for 300-point data.
  • VQ-04: Color Accessibility (2/2) — Single brand-green series on warm cream/near-black. High contrast in both themes.
  • VQ-05: Layout & Canvas (4/4) — No canvas gate failure. Generous min_border_* margins. No overflow or clipping.
  • VQ-06: Axis Labels & Title (2/2) — X: "Finish Time (minutes)", Y: "Cumulative Proportion of Runners" — descriptive with units.
  • VQ-07: Palette Compliance (2/2) — First/only series #009E73. Light bg #FAF8F1, dark bg #1A1A17. All chrome tokens flip correctly.

Design Excellence (13/20)

  • DE-01: Aesthetic Sophistication (5/8) — Above "well-configured default" (4): IQR band, area fill, staggered percentile annotations, and median reference show intentional choices. Not yet "strong design" (6) — single series only, overall look is clean but not publication-ready.
  • DE-02: Visual Refinement (4/6) — Box outline removed, y-only grid at alpha=0.10 (very subtle), no minor ticks, no toolbar. Above default but axis lines remain.
  • DE-03: Data Storytelling (4/6) — IQR band creates focal emphasis, staggered annotations enable direct percentile reading, horizontal reference at y=0.5 makes median effortless. Clearly above "data displayed but not interpreted" (2).
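
The staggered percentile annotations credited above can be derived directly from the data. This is a sketch under stated assumptions: it uses the standard library's `statistics.quantiles` rather than NumPy, `percentile_labels` is a hypothetical helper, and the five-value sample is illustrative, chosen so the quartiles land on the values quoted in the review.

```python
import statistics


def percentile_labels(times, y_positions=(0.03, 0.09, 0.15)):
    """Build (x, y, text) label specs for the quartiles of `times`."""
    # method="inclusive" interpolates linearly between order statistics
    q1, q2, q3 = statistics.quantiles(times, n=4, method="inclusive")
    names = ("25th", "50th (median)", "75th")
    return [
        (value, y, f"{name}: {value:.0f} min")
        for name, value, y in zip(names, (q1, q2, q3), y_positions)
    ]


# Illustrative sample only, not the PR's 300-runner dataset
labels = percentile_labels([200, 218, 242, 260, 300])
for x, y, text in labels:
    print(f"x={x:.0f}, y={y:.2f}: {text}")
```

Each tuple gives the x position (the percentile value), a distinct y offset so neighbouring labels cannot collide, and the label text.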

Spec Compliance (15/15)

  • SC-01: Plot Type (5/5) — Correct ECDF step function with mode='after' — accurate 1/n increments.
  • SC-02: Required Features (4/4) — Step function, y-range 0–1.05, grid lines, distinct line style — all present.
  • SC-03: Data Mapping (3/3) — X=finish time (min), Y=cumulative proportion (0–1). All 300 points plotted.
  • SC-04: Title & Legend (3/3) — Title in correct {Descriptive} · ecdf-basic · python · bokeh · anyplot.ai format. No legend (single series, N/A).
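
The "accurate 1/n increments" noted under SC-01 follow from the ECDF definition: after sorting, the i-th value (1-indexed) is assigned the proportion i/n. The sketch below is written with the standard library only and is not the implementation's code. With Bokeh's step `mode='after'`, each level is drawn from its x value to the next one, which matches the right-continuous function evaluated by `ecdf_at`.

```python
from bisect import bisect_right


def ecdf(values):
    """Return sorted x values and cumulative proportions rank/n."""
    xs = sorted(values)
    n = len(xs)
    ys = [(i + 1) / n for i in range(n)]
    return xs, ys


def ecdf_at(xs, x):
    """Evaluate the ECDF at x: the fraction of observations <= x."""
    return bisect_right(xs, x) / len(xs)


# Illustrative four-value sample, not the PR's data
xs, ys = ecdf([242, 218, 260, 300])
print(xs, ys)
print(ecdf_at(xs, 250))  # two of four observations are <= 250
```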

Data Quality (15/15)

  • DQ-01: Feature Coverage (6/6) — Full distribution, IQR highlighted, percentile markers (Q1/Q2/Q3), area fill — all ECDF aspects covered.
  • DQ-02: Realistic Context (5/5) — Marathon finish times for 300 recreational runners — real, neutral, plausible domain.
  • DQ-03: Appropriate Scale (4/4) — Mean 240 min (4h), SD 32 min — factually correct for recreational marathon times. 300 runners within spec range (50–500).
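
A dataset with the scale described above can be sketched as follows. The review states only the sample size, mean, and SD; the normal distribution used here is an assumption, and the standard library's `random.Random(42)` stands in for the implementation's `np.random.seed(42)`, so the actual values will differ from the PR's.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so every run yields the same sample

# Assumption: finish times drawn from a normal distribution (mean 240, SD 32)
finish_times = [rng.gauss(240, 32) for _ in range(300)]  # minutes

sample_mean = statistics.mean(finish_times)
sample_sd = statistics.stdev(finish_times)
print(f"n={len(finish_times)}, mean={sample_mean:.1f}, sd={sample_sd:.1f}")
```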

Code Quality (10/10)

  • CQ-01: KISS Structure (3/3) — Linear structure: imports → data → ECDF calc → plot → save. No functions/classes.
  • CQ-02: Reproducibility (2/2) — np.random.seed(42) set.
  • CQ-03: Clean Imports (2/2) — All imports used.
  • CQ-04: Code Elegance (2/2) — Clean Pythonic code. Staircase fill computation is appropriate. No fake UI.
  • CQ-05: Output & API (1/1) — Saves plot-{THEME}.html via output_file/save and plot-{THEME}.png via Selenium + PIL crop.
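
The "staircase fill computation" mentioned under CQ-04 is needed because an area glyph connects points with straight lines, so the step corners have to be added explicitly for the fill to follow the step curve. A minimal sketch, assuming the levels are held after each x value (as with `mode='after'`); `staircase` is a hypothetical helper, whereas the implementation is described as a flat script without functions.

```python
def staircase(xs, ys):
    """Expand n step points into 2n - 1 vertices tracing the step outline."""
    sx, sy = [xs[0]], [ys[0]]
    for i in range(1, len(xs)):
        # Hold the previous level up to xs[i], then jump to the new level
        sx += [xs[i], xs[i]]
        sy += [ys[i - 1], ys[i]]
    return sx, sy


# Illustrative three-point sample
sx, sy = staircase([1, 2, 3], [0.25, 0.5, 1.0])
print(sx)
print(sy)
```

The expanded arrays could then be passed to an area glyph such as `p.varea(x=sx, y1=0, y2=sy)`, which is one plausible way to obtain the fill the review describes.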

Library Mastery (7/10)

  • LM-01: Idiomatic Usage (4/5) — Strong Bokeh idioms: ColumnDataSource, HoverTool (vline mode), CrosshairTool, BoxAnnotation, Span, Label, varea, step renderer with mode='after'. Above default (3).
  • LM-02: Distinctive Features (3/5) — HoverTool vline mode and BoxAnnotation are Bokeh-distinctive. Interactive HTML artifact alongside PNG is Bokeh-specific. Above default (1) but could push further.
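
The Bokeh models listed under LM-01 can be combined roughly as follows. This is a sketch, not the PR's code: the five-value sample, figure size, and tooltip wording are assumptions for illustration, while the colour, line width, alpha, and `mode='after'` are taken from the review.

```python
from bokeh.models import BoxAnnotation, ColumnDataSource, CrosshairTool, HoverTool, Span
from bokeh.plotting import figure

# Tiny illustrative sample, not the PR's 300-runner dataset
xs = [200, 218, 242, 260, 300]
ys = [(i + 1) / len(xs) for i in range(len(xs))]
source = ColumnDataSource(data={"x": xs, "y": ys})

p = figure(
    width=800,
    height=450,
    y_range=(0, 1.05),
    toolbar_location=None,
    x_axis_label="Finish Time (minutes)",
    y_axis_label="Cumulative Proportion of Runners",
)

# Step renderer: each level is held after its x value
step = p.step(x="x", y="y", source=source, mode="after",
              line_width=4.5, line_color="#009E73")

# Shaded band between the quartiles and a dotted horizontal line at the median level
p.add_layout(BoxAnnotation(left=218, right=260, fill_color="#009E73", fill_alpha=0.08))
p.add_layout(Span(location=0.5, dimension="width", line_dash="dotted"))

# vline mode triggers the tooltip anywhere along a vertical line through the cursor
hover = HoverTool(renderers=[step], mode="vline",
                  tooltips=[("Time", "@x{0} min"), ("Proportion", "@y{0.0%}")])
p.add_tools(hover, CrosshairTool())
```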

Score Caps Applied

  • None — all caps cleared (VQ-02>0, VQ-03>0, SC-01>0, DQ-02>0, DE-01=5>2, DE-02=4>2, CQ-04=2>0)

Strengths

  • Correct ECDF step function with mode='after' producing accurate 1/n increments
  • Realistic marathon finish times dataset (300 runners, mean 240 min) — plausible and neutral
  • IQR band + area fill + staggered percentile annotations create strong distribution storytelling
  • Excellent Bokeh model usage: ColumnDataSource, HoverTool vline mode, BoxAnnotation, Span, Label, varea
  • Perfect theme-adaptive chrome in both renders — no dark-on-dark issues, backgrounds correct
  • Perfect code quality: seed set, clean imports, KISS structure, correct file output

Weaknesses

  • DE-01 (5/8): Design just misses the 6/8 threshold — consider strengthening the typographic hierarchy (e.g., bolder title, or a subtitle/annotation summarizing the distribution) or adding a more distinctive visual element beyond the IQR + area fill combo.
  • VQ-01 (7/8): Percentile annotation labels at 26pt are smaller than tick labels (34pt) — increase to 30pt for better mobile legibility when the PNG scales to ~400px.
  • LM-02 (3/5): HoverTool and CrosshairTool are used generically. Consider adding a more distinctively Bokeh feature, such as a Div widget showing the clicked percentile, a TapTool callback, or a CustomJS tooltip displaying more distribution context.

Issues Found

  1. DE-01 BELOW-THRESHOLD: Aesthetic sophistication scores 5/8 — good but not strong design. The single-series presentation with IQR + fill is thoughtful but the overall visual doesn't yet feel publication-ready.
    • Fix: Strengthen typographic hierarchy; consider a bolder visual separation between the main ECDF line and annotation elements, or add a subtle annotation summarizing the key insight (e.g., "The middle 50% finish between 218 and 260 min").
  2. VQ-01 MINOR: Annotation labels at 26pt are below tick label size at mobile scale.
    • Fix: Increase annotation label font size from 26pt to 30pt.

AI Feedback for Next Attempt

Push design to 6/8: increase annotation font from 26pt to 30pt, consider a stronger typographic or visual device to make the plot feel publication-ready (e.g., a callout annotation summarizing the IQR range, or more intentional font weight differentiation between the title, axis labels, and percentile annotations). For LM-02, add a more distinctively Bokeh feature — a CustomJS callback on the HoverTool that shows a dynamically computed insight, or a Div widget displaying the hovered percentile in a readable format would clearly differentiate this from a matplotlib plot.

Verdict: REJECTED

github-actions Bot added the quality:89 (Quality score 89/100) and ai-rejected (Quality not OK, triggers update) labels on Jun 25, 2026
github-actions Bot added the ai-attempt-1 (First repair attempt) label and removed the ai-rejected (Quality not OK, triggers update) label on Jun 25, 2026
Attempt 1/3 - fixes based on AI review
@github-actions

Contributor Author

🔧 Repair Attempt 1/4

Applied fixes based on AI review feedback.

Status: Repair completed, re-triggering review...


🤖 impl-repair

@claude

claude Bot commented Jun 25, 2026

Contributor

AI Review - Attempt 2/3

Image Description

Light render (plot-light.png): Warm off-white #FAF8F1 background. Title "Marathon Finish Times · ecdf-basic · python · bokeh · anyplot.ai" in bold dark INK at 50pt — fully readable. X-axis label "Finish Time (minutes)" and Y-axis label "Cumulative Proportion of Runners" in INK at 42pt, both legible. Tick labels in INK_SOFT at 34pt, readable. ECDF step line in brand green #009E73 at 4.5px is clearly visible, tracing the S-curve from ~120 to ~370 min. Area fill under the curve at 7% alpha provides subtle visual mass. IQR band (Q25–Q75) shaded in green at 8% alpha. Three dashed Span reference lines mark 25th/50th/75th percentiles. Staggered annotation callout boxes at y=0.03/0.09/0.15 labelled "25th: 218 min", "50th (median): 242 min", "75th: 260 min" at 30pt italic — readable but smaller than tick labels (34pt). Y-only grid at 10% alpha is subtle and appropriate. All text readable against the light background.

Dark render (plot-dark.png): Warm near-black #1A1A17 background. Title and axis labels switch to light INK (#F0EFE8) and INK_SOFT (#B8B7B0) tokens, all clearly readable against the dark surface. No dark-on-dark text failures. Data color #009E73 is identical to the light render, and the brand green is clearly visible on the dark background. Annotation callout boxes use an ELEVATED_BG (#242420) fill, so they sit slightly lighter than the background and remain readable. The IQR band and area fill at 7–8% alpha are extremely faint, near-invisible, in the dark render, but the primary ECDF step line is the focal element and reads cleanly. Both renders pass the theme-readability check.

Score: 89/100

| Category | Score | Max |
|---|---|---|
| Visual Quality | 28 | 30 |
| Design Excellence | 13 | 20 |
| Spec Compliance | 15 | 15 |
| Data Quality | 15 | 15 |
| Code Quality | 10 | 10 |
| Library Mastery | 8 | 10 |
| Total | 89 | 100 |

Visual Quality (28/30)

  • VQ-01: Text Legibility (6/8) — all primary text legible in both themes; annotation text (30pt) is smaller than tick labels (34pt), creating an inverted typography hierarchy
  • VQ-02: No Overlap (6/6) — staggered annotation boxes at y=0.03/0.09/0.15 prevent overlap; no text-on-text or text-on-data collisions
  • VQ-03: Element Visibility (6/6) — ECDF step line at 4.5px in brand green is clearly visible; area fill and IQR band are subtle but the step line is the primary focal element
  • VQ-04: Color Accessibility (2/2) — single-series brand green, CVD-safe by palette design
  • VQ-05: Layout & Canvas (4/4) — canvas gate passed; title at 64 chars/50pt spans ~70-75% width (expected); generous min_border margins; no clipping
  • VQ-06: Axis Labels & Title (2/2) — "Finish Time (minutes)" and "Cumulative Proportion of Runners" — descriptive with units; correct title format
  • VQ-07: Palette Compliance (2/2) — first series #009E73 ✓; light #FAF8F1 / dark #1A1A17 backgrounds ✓; INK/INK_SOFT chrome tokens adapt correctly ✓
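
The 64-character title length cited under VQ-05 can be checked from the title format quoted in the reviews. A small sketch, assuming the parts are joined with a spaced middle dot; `build_title` is a hypothetical helper, not part of the implementation.

```python
def build_title(descriptive, spec_id="ecdf-basic", language="python", library="bokeh"):
    """Join the title parts with a spaced middle-dot separator."""
    return " · ".join([descriptive, spec_id, language, library, "anyplot.ai"])


title = build_title("Marathon Finish Times")
print(title, len(title))
```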

Design Excellence (13/20)

  • DE-01: Aesthetic Sophistication (5/8) — above default (4): multi-layer annotation system (IQR band + area fill + staggered callout boxes + median reference line) is intentional and insightful; not exceptional but clearly above-average for a basic ECDF
  • DE-02: Visual Refinement (4/6) — above default (2): Y-only grid at 10% alpha is an ideal choice for ECDF; outline_line_color=None removes box border; minor ticks removed; generous margins; grid color token uses INK rather than the rgba rule token (minor)
  • DE-03: Data Storytelling (4/6) — above default (2): percentile annotations actively answer "when do 25/50/75% of runners finish?"; IQR band frames the central distribution mass; median reference line aids percentile reading

Spec Compliance (15/15)

  • SC-01: Plot Type (5/5) — correct ECDF step function with mode='after', increasing by 1/n at each data point
  • SC-02: Required Features (4/4) — step function ✓, Y-axis 0–1 ✓, distinct line style ✓, grid lines for reading percentiles ✓
  • SC-03: Data Mapping (3/3) — X: finish times (continuous), Y: cumulative proportion 0–1; correctly sorted and computed as rank/n
  • SC-04: Title & Legend (3/3) — correct optional-prefix format "Marathon Finish Times · ecdf-basic · python · bokeh · anyplot.ai"; no legend appropriate for single series

Data Quality (15/15)

  • DQ-01: Feature Coverage (6/6) — full ECDF with IQR band, area fill, percentile markers (Q1/Q2/Q3), median reference
  • DQ-02: Realistic Context (5/5) — marathon finish times for 300 recreational runners; mean=240min (4hr) is realistic; neutral, non-controversial
  • DQ-03: Appropriate Scale (4/4) — 300 points (ideal per spec's 50–500 target); range ~120–370 min; correct 0–1 cumulative scale

Code Quality (10/10)

  • CQ-01: KISS Structure (3/3) — flat script, no functions or classes; logical flow
  • CQ-02: Reproducibility (2/2) — np.random.seed(42) before data generation
  • CQ-03: Clean Imports (2/2) — all imported names are used
  • CQ-04: Code Elegance (2/2) — percentile loop avoids repetition; CustomJSHover formatters are clean; no fake UI
  • CQ-05: Output & API (1/1) — plot-{THEME}.html + plot-{THEME}.png both produced correctly

Library Mastery (8/10)

  • LM-01: Idiomatic Usage (5/5) — ColumnDataSource, figure(), p.step(), p.varea(), p.add_layout(), output_file()+save(); Selenium+PIL crop follows canonical Bokeh save pattern
  • LM-02: Distinctive Features (3/5) — CustomJSHover with runner-count format ("63.5% — 191/300 runners") is distinctively Bokeh; CrosshairTool, BoxAnnotation, Span, Label are proper Bokeh models; HTML interactive artifact leverages Bokeh's strength; linked selections or tap callbacks would push to 4–5
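
A hover formatter of the kind described under LM-02, showing a percentage together with a runner count, could look roughly like this. It is a sketch under stated assumptions: the JavaScript body, the separator, and the tooltip label are illustrative, and the tiny data source stands in for the real ECDF arrays.

```python
from bokeh.models import ColumnDataSource, CustomJSHover, HoverTool
from bokeh.plotting import figure

n = 300  # number of runners, per the review

# JavaScript executed in the browser for each hovered value; Bokeh supplies `value`.
# The wording and separator are assumptions made for illustration.
runner_count = CustomJSHover(code=f"""
    const n = {n};
    const pct = (value * 100).toFixed(1);
    const count = Math.round(value * n);
    return pct + "% (" + count + "/" + n + " runners)";
""")

# Placeholder data standing in for the sorted finish times and rank/n values
source = ColumnDataSource(data={"x": [218, 242, 260], "y": [0.25, 0.5, 0.75]})
p = figure(width=800, height=450, toolbar_location=None)
step = p.step(x="x", y="y", source=source, mode="after")

# "{custom}" routes the @y value through the formatter registered under "@y"
hover = HoverTool(
    renderers=[step],
    mode="vline",
    tooltips=[("Cumulative", "@y{custom}")],
    formatters={"@y": runner_count},
)
p.add_tools(hover)
```

Because the formatting runs client-side, it works in the saved HTML artifact without a Bokeh server.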

Score Caps Applied

  • None

Strengths

  • Correct ECDF step function (mode='after') with area fill and IQR band that highlight distribution spread
  • Staggered percentile annotation callout boxes (25th/50th/75th) enable direct value reading without overlap
  • CustomJSHover shows both percentage and runner count ("63.5% — 191/300 runners") — a distinctively Bokeh interactive feature
  • Perfect theme adaptation: #FAF8F1 / #1A1A17 backgrounds with correct INK/INK_SOFT chrome tokens in both renders
  • Both HTML and PNG artifacts produced correctly via the Selenium + PIL crop pipeline
  • Perfect spec compliance, data quality, and code quality

Weaknesses

  • Annotation label text is 30pt while tick labels are 34pt — inverted typography hierarchy (annotations are secondary but should not be smaller than ticks; raise to 36pt or match tick size at 34pt)
  • IQR band (fill_alpha=0.08) and area fill (fill_alpha=0.07) are extremely subtle — near-invisible in the dark render; raise to 0.12–0.15 for the band and 0.10 for the area fill
  • The grid's grid_line_color uses the INK token (dark ink) instead of the style guide's rgba rule token; alpha=0.10 compensates, but the color token is technically incorrect

Issues Found

  1. VQ-01 MINOR: Annotation text (30pt) is smaller than tick labels (34pt) — unusual hierarchy
    • Fix: Raise annotation text_font_size to "34pt" to match tick labels or "36pt" for a sensible secondary annotation size
  2. VQ-03 / DE-01 MINOR: IQR band and area fill are near-invisible in dark render at 7–8% alpha
    • Fix: Raise fill_alpha for iqr_band to 0.13 and for p.varea() to 0.10
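
Why a 7–8% fill almost disappears on the dark background can be seen by compositing the brand green over the background colour. The sketch below uses standard "source over" blending per channel, out = alpha * fg + (1 - alpha) * bg. This is an approximation of what the browser renders, and `hex_to_rgb` and `blend` are hypothetical helpers.

```python
def hex_to_rgb(color):
    """Convert '#RRGGBB' to an (r, g, b) tuple of ints."""
    color = color.lstrip("#")
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))


def blend(fg, bg, alpha):
    """Composite `fg` over an opaque `bg` at the given alpha; return hex."""
    channels = (
        round(alpha * f + (1 - alpha) * b)
        for f, b in zip(hex_to_rgb(fg), hex_to_rgb(bg))
    )
    return "#" + "".join(f"{c:02X}" for c in channels)


GREEN, DARK_BG = "#009E73", "#1A1A17"
current = blend(GREEN, DARK_BG, 0.08)   # IQR band at the current alpha
proposed = blend(GREEN, DARK_BG, 0.13)  # IQR band at the suggested alpha
print(current, proposed)
```

At alpha 0.08 the green channel moves from 26 to only 37 (out of 255), and at 0.13 it reaches 43, which is consistent with the reviewer's observation that the band is barely distinguishable from the background and benefits from a higher alpha.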

AI Feedback for Next Attempt

Implementation is strong and APPROVED. If a repair attempt runs: (1) raise annotation text_font_size to "34pt" to restore correct typography hierarchy; (2) raise IQR band fill_alpha to 0.13 and area fill_alpha to 0.10 so both are visible in dark mode; (3) optionally set ygrid.grid_line_color to INK_SOFT with alpha=0.15 for correct style-guide compliance. Do not change the ECDF logic, data context, or overall annotation structure — those are all correct and well-executed.

Verdict: APPROVED

github-actions Bot added the ai-approved (Quality OK, ready for merge) label on Jun 25, 2026
MarkusNeusinger merged commit 634bf6e into main on Jun 25, 2026
3 checks passed
MarkusNeusinger deleted the implementation/ecdf-basic/bokeh branch on June 25, 2026 09:57
Labels

ai-approved (Quality OK, ready for merge)
ai-attempt-1 (First repair attempt)
quality:89 (Quality score 89/100)
