Commit 72039fc

Add post: mplhep: matplotlib for particle physics (#271)
1 parent fed8fc3 commit 72039fc

10 files changed

Lines changed: 257 additions & 0 deletions

@@ -0,0 +1,257 @@
---
title: "mplhep: matplotlib for particle physics"
date: 2026-05-14
draft: false
description: "mplhep is a Scikit-HEP package that turns matplotlib into a comfortable plotting environment for high-energy physics. It plots pre-binned histograms (and helped get plt.stairs into matplotlib), provides ratio/pull comparison panels, and ships the official styles used by ATLAS, CMS, LHCb, ALICE and DUNE."
tags: ["mplhep", "matplotlib", "scikit-hep", "physics", "histograms"]
displayInList: true
authors: ["Andrzej Novak"]

resources:
  - name: featuredImage
    src: "style-cms.png"
    params:
      description: "A stacked histogram with data points, rendered in mplhep's CMS style: bold experiment tag and integrated-luminosity / centre-of-mass-energy string in the figure margin."
      showOnTop: true

summary: |
  [mplhep](https://github.com/scikit-hep/mplhep) is the matplotlib companion library for high-energy physics. It plots pre-binned histograms (and helped land `plt.stairs` upstream), builds ratio/pull comparison panels the way collider analyses expect them, and ships the official styles of ATLAS, CMS, LHCb, ALICE and DUNE.
---

In particle physics, also called high-energy physics (HEP), the data are almost always _already binned_ by the time you draw a plot. A long stretch of the analysis pipeline ([Uproot](https://uproot.readthedocs.io/), [Coffea](https://coffea-hep.readthedocs.io/), [boost-histogram](https://boost-histogram.readthedocs.io/), [hist](https://hist.readthedocs.io/)) has reduced terabytes of events into a handful of histograms that you now want to display. That single fact shapes what a good plotting API for HEP needs to look like, and it is where [mplhep](https://github.com/scikit-hep/mplhep), a thin, focused matplotlib wrapper in the [Scikit-HEP](https://scikit-hep.org/) ecosystem, sits.

This post walks through three things mplhep contributes: a histogram plotting function for pre-binned data, comparison panels (ratio/pull/efficiency) built on top of it, and a set of experiment style sheets that match the conventions that ATLAS, CMS, LHCb, ALICE and DUNE publications require.

## Plotting pre-binned histograms

If you want to plot a histogram, matplotlib has long had a great function for it: `plt.hist`. Its convenience comes from doing two jobs at once, though: it draws the plot, but it also wraps the histogramming step that produces the `(counts, edges)` you would otherwise get from `np.histogram`. If the histogram you want to visualize is already _made_, you used to have three options: "hack" `plt.hist` by filling one dummy entry per bin and passing the histogram values as weights; use `plt.step` and pad your `len(x) = len(y) + 1` inputs to the same length; or accept `plt.bar` with its own limitations.

To improve this particular user experience, the `mplhep` authors contributed a new, distinct primitive, [`plt.stairs`](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.stairs.html), which was added in matplotlib 3.4 specifically for pre-binned data. This simplifies the syntax for HEP users significantly, but `plt.stairs` is still just a primitive compared to the rich functionality of `plt.hist`. To mimic and extend that functionality for particle physicists, and for anyone else who handles pre-binned histograms, we present the `mplhep` library (imported as `mh`) with [`mh.histplot`](https://scikit-hep.org/mplhep/latest/guide_basic_plotting/) at its core (see also the full [docs](https://scikit-hep.org/mplhep/latest/)).
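
To make the difference concrete, here is a minimal sketch of the two older workarounds next to `plt.stairs`, using only NumPy and matplotlib. The toy Gaussian sample, the binning and the variable names are assumptions made for illustration; in a real analysis `counts` and `edges` would arrive from upstream tools.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

# Toy pre-binned input (illustrative only).
rng = np.random.default_rng(42)
counts, edges = np.histogram(rng.normal(size=1000), bins=20, range=(-4, 4))

fig, (ax_hist, ax_step, ax_stairs) = plt.subplots(1, 3, figsize=(12, 3), sharey=True)

# Workaround 1: one dummy entry per bin (its left edge), bin contents as weights.
hist_counts, _, _ = ax_hist.hist(
    edges[:-1], bins=edges, weights=counts, histtype="step"
)
ax_hist.set_title("plt.hist with weights")

# Workaround 2: repeat the last count so x and y have the same length.
ax_step.step(edges, np.append(counts, counts[-1]), where="post")
ax_step.set_title("plt.step with padding")

# The primitive: N counts and N + 1 edges are accepted as they are.
ax_stairs.stairs(counts, edges)
ax_stairs.set_title("plt.stairs")
```

All three panels draw the same outline; only the last call takes the histogram in the shape it already has.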

<div style="display: grid; grid-template-columns: 1fr 1fr; column-gap: 1.5rem; row-gap: 0.5rem; margin: 1.5rem 0; align-items: start;">

<div>

**`plt.stairs` (matplotlib primitive)**

</div>
<div>

**`mh.histplot` (mplhep wrapper)**

</div>

<div>

```python
import matplotlib.pyplot as plt
import numpy as np

# ha, hb, hc: per-bin counts; edges: bin edges; labels: legend entries
cumulative = np.zeros_like(ha, dtype=float)
for cnt, lab in zip([ha, hb, hc], labels):
    new = cumulative + cnt
    plt.stairs(new, edges, baseline=cumulative, fill=True, label=lab)
    cumulative = new
plt.legend()
```

</div>
<div>

```python
import matplotlib.pyplot as plt
import mplhep as mh

mh.histplot(
    [ha, hb, hc],
    edges,
    stack=True,
    histtype="fill",
    label=labels,
)
plt.legend()
```

</div>

<div>

![Stacked histogram drawn by calling plt.stairs three times, accumulating a baseline manually so each component sits on top of the previous one.](api-stairs.png)

</div>
<div>

![The same stacked histogram produced by a single mh.histplot call with stack=True; identical output, much less ceremony.](api-histplot.png)

</div>

</div>
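
Both snippets above assume that `ha`, `hb`, `hc`, `edges` and `labels` already exist. A minimal sketch of such inputs, followed by the manual `stairs` stacking, might look like this; the three toy distributions and their sizes are invented for illustration and are not the data behind the figures.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
edges = np.linspace(0.0, 10.0, 21)  # 20 bins need 21 edges

# Hypothetical toy samples standing in for real analysis output.
ha, _ = np.histogram(rng.exponential(3.0, size=5000), bins=edges)
hb, _ = np.histogram(rng.uniform(0.0, 10.0, size=2000), bins=edges)
hc, _ = np.histogram(rng.normal(5.0, 0.7, size=800), bins=edges)
labels = ["Background", "Other bkg.", "Signal"]

fig, ax = plt.subplots()
cumulative = np.zeros_like(ha, dtype=float)
for cnt, lab in zip([ha, hb, hc], labels):
    new = cumulative + cnt  # top of this component
    ax.stairs(new, edges, baseline=cumulative, fill=True, label=lab)
    cumulative = new  # becomes the baseline of the next component
ax.legend()
```

After the loop, `cumulative` holds the per-bin total of the stack, which is the bookkeeping `stack=True` takes off your hands.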

Same output, half the code. And the savings compound once you actually use the keyword arguments. `mh.histplot` accepts a NumPy-style `(counts, edges)` tuple, a `hist.Hist`, a `boost_histogram.Histogram`, or any object implementing the [PlottableProtocol](https://uhi.readthedocs.io/), so the same call works regardless of what your analysis framework hands you. From there, the keywords most analyses lean on:

- `yerr=True` → Poisson intervals for integer counts; pass a 1D array for symmetric errors, a 2D `(2, N)` array for asymmetric ones, or `yerr=False` to suppress them entirely.
- `w2=variances` → sum-of-weights-squared propagation for weighted MC. When combined with `yerr=True`, mplhep picks Poisson intervals for integer-like `w2` and `sqrt(w2)` otherwise; `w2method=` lets you force one or the other.
- `sort="yield"` → auto-sort a stack by total yield (largest at the bottom); `"label"` sorts alphabetically; append `_r` to reverse.
- `histtype=` → `"step"`, `"fill"`, `"errorbar"`, `"bar"`, `"barstep"`, or `"band"` (which spans the `yerr` range, so a systematic uncertainty band needs no second call).
- `density=True` / `binwnorm=1.0` → normalise to unit area or per unit bin width.
- `flow="show"` / `"sum"` / `"hint"` → handle under- and overflow bins explicitly.
- `blind=(lo, hi)` (or `mh.loc[lo:hi]`) → hide bins in a signal region for blind analyses.
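
For readers who have not met sum-of-weights-squared before, here is a minimal NumPy sketch of what an array passed as `w2`, and the two array shapes accepted by `yerr`, could contain. The sample, the weights and the 0.8 / 1.2 scaling of the asymmetric errors are made up for illustration; how mplhep consumes these arrays is described in its API reference.

```python
import numpy as np

rng = np.random.default_rng(1)
edges = np.linspace(0.0, 1.0, 6)  # 5 bins

# Hypothetical weighted MC sample: one value and one weight per event.
x = rng.uniform(0.0, 1.0, size=10_000)
w = rng.uniform(0.5, 1.5, size=10_000)

sumw, _ = np.histogram(x, bins=edges, weights=w)  # bin contents
sumw2, _ = np.histogram(x, bins=edges, weights=w**2)  # per-bin variances

# Symmetric errors: one value per bin, shape (N,).
sym_err = np.sqrt(sumw2)

# Asymmetric errors: (lower, upper) rows, shape (2, N). Scaling is arbitrary.
asym_err = np.stack([0.8 * sym_err, 1.2 * sym_err])

# Sanity check: with unit weights, sum of w^2 is just the event count,
# so sqrt(w2) reduces to the familiar sqrt(N).
counts, _ = np.histogram(x, bins=edges)
unit_w2, _ = np.histogram(x, bins=edges, weights=np.ones_like(x) ** 2)
```

The last two lines show why integer-like `w2` can be treated as plain counts: for unweighted events the variance array and the histogram coincide.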

The full list is in the [`mh.histplot` API reference](https://scikit-hep.org/mplhep/latest/api/#mplhep.histplot). The short example below exercises several of these: sum-of-weights-squared on a weighted MC stack, auto-sorting by yield, an MC statistical-uncertainty band, and Poisson-interval errors on the data overlay.

```python
import mplhep as mh
import numpy as np

# Inputs (mc_components, mc_variances, mc_total, mc_total_var,
# data_counts, edges) are assumed to come from your analysis.
mh.histplot(
    mc_components,
    edges,
    w2=mc_variances,  # propagate Sumw2 for weighted MC
    stack=True,
    sort="yield",  # smallest yield on top of the stack
    histtype="fill",
    label=["Background", "Other bkg.", "Signal"],
)
mh.histplot(
    mc_total,
    edges,
    yerr=np.sqrt(mc_total_var),
    histtype="band",  # filled band spanning ±yerr
    label="MC stat. unc.",
    color="gray",
    alpha=0.4,
)
mh.histplot(
    data_counts,
    edges,
    yerr=True,  # Poisson intervals for integer counts
    histtype="errorbar",
    color="black",
    label="Data",
)
```

<div style="max-width: 60%; margin: 1.5rem auto;">

![Stacked weighted MC with three components auto-sorted by yield, a hatched MC statistical uncertainty band spanning the model total, and data points with Poisson-interval error bars. The full figure is composed by three independent mh.histplot calls onto the same axes.](kwargs-sugar.png)

</div>

## Stacks and comparison panels

A HEP plot rarely stops at a single histogram. The canonical figure has a stacked background model, an unstacked signal or systematic-uncertainty band, data points with errors on top, and a thinner _comparison_ panel underneath: a ratio, a pull, an efficiency. Those panels all share a layout (twinned bins, a reference line at 1 or 0), and they're surprisingly tedious to assemble in matplotlib.

`mh.comp.hists` builds one in a single call for the two-histogram case; `mh.comp.data_model` handles the full data-versus-model figure with stacked and unstacked components, an MC statistical uncertainty band, and any of the same comparison types in the lower panel:

<div style="display: grid; grid-template-columns: 1fr 1fr; column-gap: 1.5rem; row-gap: 0.5rem; margin: 1.5rem 0; align-items: start;">

<div>

**Two histograms with a ratio panel**

</div>
<div>

**Data vs model with a pull panel**

</div>

<div>

```python
fig, ax_main, ax_comp = mh.comp.hists(
    h1,
    h2,
    xlabel="Discriminator",
    h1_label="Sample A",
    h2_label="Sample B",
    comparison="ratio",
)
```

</div>
<div>

```python
fig, ax_main, ax_comp = mh.comp.data_model(
    data_hist=data,
    stacked_components=[bkg_a, bkg_b],
    stacked_labels=["Bkg 1", "Bkg 2"],
    unstacked_components=[signal],
    unstacked_labels=["Signal"],
    comparison="pull",
)
```

</div>

<div>

![Two histograms overlaid in the main panel with their ratio in a thin lower panel; the ratio drops sharply where Sample A's spectrum extends past Sample B's.](ratio.png)

</div>
<div>

![A stacked background model with an unstacked signal component overlaid, data points with error bars and an MC statistical uncertainty band, and a pull panel below showing per-bin (data minus MC) divided by combined uncertainty.](data-model.png)

</div>

</div>

`comparison=` also accepts `"difference"`, `"relative_difference"`, `"asymmetry"` and `"efficiency"`; the MC statistical uncertainty is propagated through all of them. Replacing `"pull"` with `"ratio"` in the second example changes the lower panel with no other code changes. The [comparisons guide](https://scikit-hep.org/mplhep/latest/guide_comparisons/) covers every variant with worked examples; the [gallery](https://scikit-hep.org/mplhep/latest/gallery/) is the fastest way to find a plot that looks like the one you're trying to make.
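
As a rough guide to what these lower panels display, the per-bin arithmetic behind most of the comparison names can be sketched in a few lines of NumPy. The formulas below are the conventional textbook definitions and the numbers are made up; treat them as an assumption about what each name means, and consult the comparisons guide for the exact definitions and uncertainty propagation mplhep applies.

```python
import numpy as np

# Hypothetical per-bin contents for two histograms with identical binning.
h1 = np.array([12.0, 30.0, 55.0, 41.0, 18.0])
h2 = np.array([10.0, 33.0, 50.0, 45.0, 20.0])

# Poisson-like assumption: the variance of each bin equals its content.
var1, var2 = h1.copy(), h2.copy()

ratio = h1 / h2  # reference line at 1
difference = h1 - h2  # reference line at 0
relative_difference = (h1 - h2) / h2  # reference line at 0
asymmetry = (h1 - h2) / (h1 + h2)  # bounded between -1 and 1
pull = (h1 - h2) / np.sqrt(var1 + var2)  # difference in units of combined sigma
```

The pull line matches the description in the figure caption above: data minus model, divided by the combined uncertainty.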

## Experiment styles

The third thing mplhep does is take care of the typography. Every collaboration has a house style: a font, a "CMS" / "ATLAS" / "LHCb" label with a status qualifier, a √s and integrated-luminosity string, specific tick directions and minor-tick behaviour, a colour cycle. `mh.style.use("CMS")` (or `"ATLAS"`, `"LHCb2"`, `"ALICE"`, `"DUNE"`) sets matplotlib's `rcParams` accordingly and bundles the open fonts (TeX Gyre Heros as a Helvetica stand-in, Fira Sans, etc.) so the result is reproducible across operating systems. The collaboration tag is placed by a matching helper (`mh.cms.label`, `mh.atlas.label`, `mh.lhcb.label`, `mh.alice.label`, `mh.dune.label`), which knows where each one is meant to live: CMS above the axes in the figure margin; ATLAS, LHCb and ALICE _inside_ the axes at top-left.

For figures heading somewhere that doesn't fit a single collaboration's house style, `mh.style.use("plothist")` provides a neutral serif look with the same comparison-panel ergonomics and no experiment tag. The [styling guide](https://scikit-hep.org/mplhep/latest/guide_styling/) catalogues every available style and the exact arguments each `.label()` helper accepts.

```python
with plt.style.context(mh.style.CMS):
    fig, ax = plt.subplots()
    mh.histplot(
        [ha, hb, hc],
        edges,
        stack=True,
        histtype="fill",
        label=["Background", "Other bkg.", "Signal"],
        ax=ax,
    )
    mh.histplot(
        ha + hb + hc, edges, histtype="errorbar", color="black", label="Data", ax=ax
    )
    mh.cms.label("Plot Demo", data=True, lumi=138, com=13, ax=ax)
    mh.mpl_magic(ax=ax)
```

The same three-component stack with data points rendered four ways. Each style picks its own colour cycle, font, and label conventions; [`mh.mpl_magic`](https://scikit-hep.org/mplhep/latest/guide_utilities/) auto-grows the y-axis so the experiment tag, legend and data don't fight for the same space, and is one of a small set of layout helpers (`yscale_legend`, `yscale_anchored_text`, `sort_legend`, `append_axes`, …) that the [utilities guide](https://scikit-hep.org/mplhep/latest/guide_utilities/) covers in full.

<div style="display: grid; grid-template-columns: repeat(4, 1fr); gap: 0.75rem; margin: 1.5rem 0; align-items: start;">
<div>

![CMS style: bold 'CMS Plot Demo' in the figure margin, '138 fb⁻¹ (13 TeV)' right-justified; CMS colour cycle.](style-cms.png)

</div>
<div>

![ATLAS style: italic 'ATLAS Plot Demo' inside top-left, '√s = 13 TeV, 140 fb⁻¹' on a second line; ATLAS colour cycle.](style-atlas.png)

</div>
<div>

![LHCb style: bold 'LHCb Plot Demo' inside the axes top-left; '9 fb⁻¹ (13 TeV)' in the margin above; LHCb colour cycle.](style-lhcb2.png)

</div>
<div>

![plothist style: no experiment label, serif typography, neutral colour palette. The same stacked-histogram-with-data plot rendered in mplhep's non-experiment style.](style-plothist.png)

</div>
</div>

The same data, the same `mh.histplot` calls; only the active style context changes.
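
The mechanism underneath is matplotlib's own style context, which temporarily applies a bundle of `rcParams` and restores the previous values on exit. A minimal sketch with plain matplotlib follows; the two style dictionaries and the `draw` helper are hypothetical stand-ins invented for illustration, not mplhep's actual style definitions.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

# Two hypothetical "house styles" expressed as plain rcParams dictionaries.
styles = {
    "ticks-in": {"xtick.direction": "in", "ytick.direction": "in", "font.size": 14},
    "ticks-out": {"xtick.direction": "out", "ytick.direction": "out", "font.size": 10},
}

counts = np.array([3.0, 7.0, 12.0, 9.0, 4.0])
edges = np.linspace(0.0, 5.0, 6)


def draw(ax):
    """Identical plotting code, whatever style happens to be active."""
    ax.stairs(counts, edges, label="Toy histogram")
    ax.legend()


default_size = plt.rcParams["font.size"]
seen_sizes = {}
for name, rc in styles.items():
    with plt.style.context(rc):  # settings apply only inside this block
        fig, ax = plt.subplots()
        draw(ax)
        seen_sizes[name] = plt.rcParams["font.size"]
    plt.close(fig)
```

Once the loop finishes, `plt.rcParams["font.size"]` is back at `default_size`, which is why wrapping a figure in a style context does not leak settings into the rest of a notebook.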

## Where it fits

mplhep is part of [Scikit-HEP](https://scikit-hep.org/), a collection of Python tools for particle physics that also includes [hist](https://hist.readthedocs.io/), [Uproot](https://uproot.readthedocs.io/), [Awkward Array](https://awkward-array.org/), [vector](https://vector.readthedocs.io/) and [pyhf](https://pyhf.readthedocs.io/), among many others. It deliberately stays a thin layer on top of plain matplotlib rather than replacing it: every figure mplhep produces is a regular `Figure`/`Axes` pair you can keep customising with the matplotlib API you already know. The point is to remove the friction of the conventions, not the flexibility underneath them.

If you work in HEP, `pip install mplhep` followed by `mh.style.use(...)` should be the first two lines of any plotting notebook. If you don't, `mh.histplot` for pre-binned data and the comparison-panel machinery are still useful well outside the field, anywhere "two histograms and their ratio" is the natural unit of a figure.

- Docs: [scikit-hep.org/mplhep](https://scikit-hep.org/mplhep/latest/). Start with the [basic plotting](https://scikit-hep.org/mplhep/latest/guide_basic_plotting/), [comparisons](https://scikit-hep.org/mplhep/latest/guide_comparisons/), [styling](https://scikit-hep.org/mplhep/latest/guide_styling/) and [utilities](https://scikit-hep.org/mplhep/latest/guide_utilities/) guides, browse the [gallery](https://scikit-hep.org/mplhep/latest/gallery/) for inspiration, or jump to the full [API reference](https://scikit-hep.org/mplhep/latest/api/).
- Source: [github.com/scikit-hep/mplhep](https://github.com/scikit-hep/mplhep)
- Discussion: [github.com/scikit-hep/mplhep/discussions](https://github.com/scikit-hep/mplhep/discussions)