Commit c1f841b

Implementation of HT and power corrections
1 parent 8289f75 commit c1f841b

8 files changed

Lines changed: 1352 additions & 31 deletions

Lines changed: 238 additions & 0 deletions
@@ -0,0 +1,238 @@
.. _vptheorycov-pc:

Power corrections
=================

Power corrections (also referred to as higher twist corrections for DIS-like
processes) model contributions from non-perturbative effects that scale as
inverse powers of the hard scale. They are implemented in the
``theorycovariance`` module and can be included as a theory covariance matrix in
a fit. Power corrections for jets and higher twists for DIS data have been
determined in :cite:p:`Ball:2025xtj`, based on NNPDF4.0, where the reader can
find further details on the methodology and phenomenological implications.

The implementation is in
`higher_twist_functions.py <https://github.com/NNPDF/nnpdf/tree/master/validphys2/src/validphys/theorycovariance/higher_twist_functions.py>`_
and
`construction.py <https://github.com/NNPDF/nnpdf/tree/master/validphys2/src/validphys/theorycovariance/construction.py>`_.


Overview
--------

In NNPDF, power corrections modify theoretical predictions by introducing multiplicative
shifts. For a generic observable :math:`O`, the corrected prediction is

.. math:: O \to O \times (1 + \mathrm{PC}),

where :math:`\mathrm{PC}` is the power correction. The shift to the prediction
is therefore

.. math:: \Delta O = O \times \mathrm{PC}.

Different functional forms for the power correction are used depending on the
process type:

- **DIS** (neutral current and charged current): the correction depends on
  Bjorken-:math:`x` and :math:`Q^2`, and scales as :math:`1/Q^2`.
- **Single-inclusive jets**: the correction depends on rapidity and transverse
  momentum :math:`p_T`, and scales as :math:`1/p_T`.
- **Dijets**: the correction depends on a rapidity variable and the dijet
  invariant mass :math:`m_{jj}`, and scales as :math:`1/m_{jj}`.


Parametrisation
---------------

Power corrections are parametrised using a piecewise-linear interpolation
between a set of nodes. The node positions (``nodes``) and the function values
at each node (``yshift``) are specified in the runcard.

The interpolation is constructed as a sum of triangular basis functions: each
node :math:`i` is associated with a triangle that peaks at the node position
with value ``yshift[i]`` and drops linearly to zero at the two neighbouring
nodes. The resulting function is continuous and piecewise-linear.

For DIS processes, the nodes are placed in Bjorken-:math:`x` and the power
correction for a data point at :math:`(x, Q^2)` is

.. math:: \mathrm{PC}(x, Q^2) = \frac{h(x)}{Q^2},

where :math:`h(x)` is the piecewise-linear interpolation.

For jet processes, the nodes are placed in rapidity and the correction at
:math:`(y, p_T)` is

.. math:: \mathrm{PC}(y, p_T) = \frac{h(y)}{p_T}.

For dijets, the same functional form is used but the suppression scale is the
dijet invariant mass :math:`m_{jj}`.

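The triangular-basis construction described above can be sketched in a few lines of NumPy. This is only an illustration, not the code in ``higher_twist_functions.py``: the names ``triangular_interpolation`` and ``dis_power_correction`` are hypothetical, the nodes are assumed to be strictly increasing, and the function is assumed to vanish outside the range covered by the nodes.

```python
import numpy as np


def triangular_interpolation(a, yshift, nodes):
    """Evaluate a sum of triangular basis functions, one per node.

    Hypothetical helper: the triangle for node i peaks at nodes[i] with
    height yshift[i] and falls linearly to zero at the neighbouring nodes.
    Assumes strictly increasing nodes; returns zero outside the node range.
    """
    a = np.atleast_1d(np.asarray(a, dtype=float))
    yshift = np.asarray(yshift, dtype=float)
    nodes = np.asarray(nodes, dtype=float)
    result = np.zeros_like(a)
    for i, (node, height) in enumerate(zip(nodes, yshift)):
        hat = np.zeros_like(a)
        if i > 0:
            # Rising edge between the previous node and this one
            left = nodes[i - 1]
            rising = (a >= left) & (a <= node)
            hat[rising] = (a[rising] - left) / (node - left)
        if i < len(nodes) - 1:
            # Falling edge between this node and the next one
            right = nodes[i + 1]
            falling = (a >= node) & (a <= right)
            hat[falling] = (right - a[falling]) / (right - node)
        result += height * hat
    return result


def dis_power_correction(yshift, nodes, x, q2):
    """DIS-like correction h(x) / Q^2 built on the interpolation above."""
    q2 = np.atleast_1d(np.asarray(q2, dtype=float))
    return triangular_interpolation(x, yshift, nodes) / q2
```

Inside the node range the sum of triangles coincides with ordinary linear interpolation between the points ``(nodes[i], yshift[i])``, so ``np.interp`` gives a quick cross-check there. The jet case would follow the same pattern with rapidity in place of :math:`x` and :math:`p_T` (or :math:`m_{jj}`) in place of :math:`Q^2`.
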

Dataset routing
---------------

Each dataset is mapped to one or more power correction parameter keys via the
function ``get_pc_type``. The mapping depends on the process type and dataset
name:

.. list-table::
   :header-rows: 1
   :widths: 30 30 40

   * - Process type
     - PC type key
     - Datasets
   * - DIS NC (proton :math:`F_2`)
     - ``f2p``
     - SLAC, BCDMS proton :math:`F_2`; NMC, HERA :math:`\sigma_{\mathrm{red}}`
   * - DIS NC (deuteron :math:`F_2`)
     - ``f2d``
     - SLAC, BCDMS deuteron :math:`F_2`
   * - DIS NC (NMC ratio :math:`F_2^d / F_2^p`)
     - ``(f2p, f2d)``
     - NMC ratio dataset
   * - DIS CC
     - ``dis_cc``
     - CHORUS, NuTeV, HERA CC
   * - Jets
     - ``Hj``
     - Single-inclusive jet datasets
   * - Dijets (ATLAS)
     - ``H2j_ATLAS``
     - ATLAS dijet datasets (falls back to ``H2j`` if key absent)
   * - Dijets (CMS)
     - ``H2j_CMS``
     - CMS dijet datasets (falls back to ``H2j`` if key absent)


Special case: NMC ratio
~~~~~~~~~~~~~~~~~~~~~~~~

The NMC ratio dataset :math:`F_2^d / F_2^p` receives contributions from both
the proton and deuteron power corrections. The corrected ratio is

.. math::

   \frac{F_2^d}{F_2^p} \to \frac{F_2^d \,(1 + \mathrm{PC}_d)}{F_2^p \,(1 + \mathrm{PC}_p)},

and the shift is

.. math::

   \Delta\!\left(\frac{F_2^d}{F_2^p}\right) = \frac{F_2^d}{F_2^p} \,
   \frac{\mathrm{PC}_d - \mathrm{PC}_p}{1 + \mathrm{PC}_p}.

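The shift formula follows from subtracting the uncorrected ratio from the corrected one. A short numerical check makes this explicit. The helper name ``ratio_shift`` and the input values are hypothetical and chosen only for illustration.

```python
def ratio_shift(ratio, pc_p, pc_d):
    """Shift of the F2d/F2p ratio: R * (PC_d - PC_p) / (1 + PC_p).

    Hypothetical helper mirroring the formula above, not the library code.
    """
    return ratio * (pc_d - pc_p) / (1.0 + pc_p)


# Illustrative numbers only
ratio, pc_p, pc_d = 0.95, 0.02, 0.05

# Direct evaluation: corrected ratio minus uncorrected ratio
corrected = ratio * (1.0 + pc_d) / (1.0 + pc_p)
direct_shift = corrected - ratio

assert abs(direct_shift - ratio_shift(ratio, pc_p, pc_d)) < 1e-12
```

When the proton and deuteron corrections are equal, they cancel in the ratio and the shift vanishes.
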

Covariance matrix construction
------------------------------

The theory covariance matrix is constructed from the shifts :math:`\Delta O` by
taking outer products. For each combination of power correction parameters, a
shift vector is computed per dataset. The sub-matrix between datasets :math:`i`
and :math:`j` is then

.. math::

   S_{ij} = \sum_k \Delta_i^{(k)} \otimes \Delta_j^{(k)},

where :math:`k` runs over all parameter combinations (one non-zero ``yshift``
entry at a time, with all others set to zero). This corresponds to the
``covmat_power_corrections`` function in ``construction.py``.

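The construction above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in ``construction.py``: the function names, the tuple labels for the combinations, and the choice to keep combinations whose ``yshift`` entry is zero are all hypothetical.

```python
import numpy as np


def one_at_a_time_combinations(pc_parameters):
    """Build parameter sets with a single non-zero yshift entry each.

    Hypothetical helper. Returns a dict keyed by (pc_type, node_index);
    each value maps every PC type to an array of node values that is zero
    everywhere except at the selected entry.
    """
    combinations = {}
    for pc_type, par in pc_parameters.items():
        for index, value in enumerate(par["yshift"]):
            shifted = {
                key: np.zeros(len(other["yshift"]))
                for key, other in pc_parameters.items()
            }
            shifted[pc_type][index] = value
            combinations[(pc_type, index)] = shifted
    return combinations


def covmat_from_deltas(deltas_i, deltas_j):
    """Sub-matrix S_ij as the sum over k of outer(delta_i^(k), delta_j^(k)).

    Both arguments map a combination label k to the shift vector of one
    dataset; the two dictionaries must share the same labels.
    """
    if set(deltas_i) != set(deltas_j):
        raise ValueError("Both datasets need shifts for the same combinations")
    return sum(np.outer(deltas_i[k], deltas_j[k]) for k in deltas_i)
```

Because each term is an outer product of shift vectors, the diagonal blocks :math:`S_{ii}` are symmetric and positive semi-definite by construction, and :math:`S_{ji}` is the transpose of :math:`S_{ij}`.
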

Runcard configuration
---------------------

Power corrections are included via the ``theorycovmatconfig`` section of the
runcard. The key ``"power corrections"`` must be added to the
``point_prescriptions`` list, alongside any scale variation prescriptions.

The following keys are used:

- ``pc_parameters``: a dictionary mapping PC type keys to their parametrisation
  (``yshift`` and ``nodes`` arrays). The length of ``yshift`` must match the
  length of ``nodes``.
- ``pc_included_procs``: list of process types to which power corrections
  are applied (e.g. ``["DIS NC", "DIS CC", "JETS", "DIJET"]``).
- ``pc_excluded_datasets``: list of dataset names to exclude from power corrections
  even if their process type is included.
- ``pdf``: the PDF used for computing the theory predictions that enter the
  multiplicative shifts.

Example
~~~~~~~

.. code:: yaml

   theorycovmatconfig:
     point_prescriptions: ["9 point", "power corrections"]
     pc_parameters:
       f2p:
         yshift: [0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.0]
         nodes: [0.0, 0.001, 0.01, 0.1, 0.3, 0.5, 0.7, 0.9, 1.0]
       f2d:
         yshift: [0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.0]
         nodes: [0.0, 0.001, 0.01, 0.1, 0.3, 0.5, 0.7, 0.9, 1.0]
       dis_cc:
         yshift: [0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.0]
         nodes: [0.0, 0.001, 0.01, 0.1, 0.3, 0.5, 0.7, 0.9, 1.0]
       Hj:
         yshift: [2.0, 2.0, 2.0, 2.0, 2.0, 2.0]
         nodes: [0.25, 0.75, 1.25, 1.75, 2.25, 2.75]
       H2j_ATLAS:
         yshift: [2.0, 2.0, 2.0, 2.0, 2.0, 2.0]
         nodes: [0.25, 0.75, 1.25, 1.75, 2.25, 2.75]
       H2j_CMS:
         yshift: [2.0, 2.0, 2.0, 2.0, 2.0]
         nodes: [0.25, 0.75, 1.25, 1.75, 2.25]
     pc_included_procs: ["JETS", "DIJET", "DIS NC", "DIS CC"]
     pc_excluded_datasets:
       - HERA_NC_318GEV_EAVG_CHARM-SIGMARED
       - HERA_NC_318GEV_EAVG_BOTTOM-SIGMARED
     pdf: NNPDF40_nnlo_as_01180
     use_thcovmat_in_fitting: true
     use_thcovmat_in_sampling: true

.. warning::
   The lengths of ``yshift`` and ``nodes`` must be equal for each PC type.
   A mismatch will raise an error at initialisation time.

.. note::
   Power corrections can be combined with scale variation prescriptions.
   Both contributions are summed into a single theory covariance matrix.
   See the tutorial on :ref:`including a theory covmat in a fit <thcov_tutorial>`.


Module reference
----------------

``higher_twist_functions.py`` provides the following public functions:

- ``get_pc_type(exp_name, process_type, experiment, pc_dict)``:
  determines which PC type key(s) apply to a given dataset.
- ``linear_bin_function(a, y_shift, bin_edges)``:
  evaluates the piecewise-linear triangular interpolation at points ``a``.
- ``dis_pc_func(delta_h, nodes, x, Q2)``:
  computes the DIS power correction :math:`h(x)/Q^2`.
- ``jets_pc_func(delta_h, nodes, pT, rap)``:
  computes the jet power correction :math:`h(y)/p_T`.
- ``mult_dis_pc(nodes, x, q2, dataset_sp, pdf)``:
  returns a function that computes the multiplicative DIS shift given node values.
- ``mult_dis_ratio_pc(p_nodes, d_nodes, x, q2, dataset_sp, pdf)``:
  returns a function that computes the shift for the :math:`F_2^d/F_2^p` ratio.
- ``mult_jet_pc(nodes, pT, rap, dataset_sp, pdf)``:
  returns a function that computes the multiplicative jet shift given node values.
- ``construct_pars_combs(parameters_dict)``:
  builds the list of one-at-a-time parameter combinations used to construct
  the covariance matrix.
- ``compute_deltas_pc(dataset_sp, pdf, pc_dict)``:
  computes the full set of shifts for a single dataset.

``construction.py`` provides:

- ``covmat_power_corrections(deltas1, deltas2)``:
  computes the theory covariance sub-matrix between two datasets from their
  shift dictionaries.
- ``covs_pt_prescrip_pc(combine_by_type, point_prescription, pdf, pc_parameters, pc_included_procs, pc_excluded_datasets)``:
  assembles the full power correction covariance matrix across all datasets.
Lines changed: 113 additions & 0 deletions
@@ -0,0 +1,113 @@
#
# Configuration file for n3fit
#
######################################################################################
description: NNPDF4.1 with TCM higher twists and jet power corrections

######################################################################################
dataset_inputs:
- {dataset: NMC_NC_NOTFIXED_EM-F2, variant: legacy_dw}
- {dataset: NMC_NC_NOTFIXED_P_EM-SIGMARED, variant: legacy}
- {dataset: SLAC_NC_NOTFIXED_P_EM-F2, variant: legacy_dw}
- {dataset: SLAC_NC_NOTFIXED_D_EM-F2, variant: legacy_dw}
- {dataset: BCDMS_NC_NOTFIXED_P_EM-F2, variant: legacy_dw}
- {dataset: BCDMS_NC_NOTFIXED_D_EM-F2, variant: legacy_dw}
- {dataset: CHORUS_CC_NOTFIXED_PB_NU-SIGMARED, variant: legacy_dw}
- {dataset: CHORUS_CC_NOTFIXED_PB_NB-SIGMARED, variant: legacy_dw}
- {dataset: NUTEV_CC_NOTFIXED_FE_NU-SIGMARED, cfac: [MAS], variant: legacy_dw}
- {dataset: NUTEV_CC_NOTFIXED_FE_NB-SIGMARED, cfac: [MAS], variant: legacy_dw}
- {dataset: HERA_NC_318GEV_EM-SIGMARED}
- {dataset: ATLAS_1JET_8TEV_R06_PTY, variant: decorrelated}
- {dataset: ATLAS_2JET_7TEV_R06_M12Y}
- {dataset: CMS_1JET_8TEV_PTY}
- {dataset: CMS_2JET_7TEV_M12-Y}
- {dataset: CMS_2JET_13TEV_M12-YSTAR-YB-R08}

################################################################################
diagonal_frac: 0.75

datacuts:
  t0pdfset: 260202-jk-nnpdf41-mhou
  q2min: 2.5
  w2min: 3.24

theory:
  theoryid: 41_000_000

theorycovmatconfig:
  point_prescriptions: ["power corrections"]
  pc_parameters:
    f2p: {yshift: [0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.0], nodes: [0.0, 0.001, 0.01, 0.1, 0.3, 0.5, 0.7, 0.9, 1]}
    f2d: {yshift: [0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.0], nodes: [0.0, 0.001, 0.01, 0.1, 0.3, 0.5, 0.7, 0.9, 1]}
    dis_cc: {yshift: [0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.0], nodes: [0.0, 0.001, 0.01, 0.1, 0.3, 0.5, 0.7, 0.9, 1]}
    Hj: {yshift: [2.0, 2.0, 2.0, 2.0, 2.0, 2.0], nodes: [0.25, 0.75, 1.25, 1.75, 2.25, 2.75]}
    H2j_ystar: {yshift: [2.0, 2.0, 2.0, 2.0, 2.0, 2.0], nodes: [0.25, 0.75, 1.25, 1.75, 2.25, 2.75]}
    H2j_yb: {yshift: [2.0, 2.0, 2.0, 2.0, 2.0], nodes: [0.25, 0.75, 1.25, 1.75, 2.25]}
  pc_included_procs: ["DIS NC", "DIS CC", "JETS", "DIJET"]
  pc_excluded_datasets: []
  pdf: 260202-jk-nnpdf41-mhou
  use_thcovmat_in_fitting: true
  use_thcovmat_in_sampling: true
resample_negative_pseudodata: false

trvlseed: 130582403
nnseed: 953262798
mcseed: 1437981271
genrep: true
parameters: # This defines the parameter dictionary that is passed to the Model Trainer
  nodes_per_layer: [25, 20, 9]
  activation_per_layer: [tanh, tanh, linear]
  initializer: glorot_normal
  optimizer:
    clipnorm: 6.073e-6
    learning_rate: 2.621e-3
    optimizer_name: Nadam
  epochs: 27000
  positivity:
    initial: 184.8
    multiplier:
  integrability:
    initial: 10
    multiplier:
  stopping_patience: 0.1
  layer_type: dense
  dropout: 0.0
  threshold_chi2: 3.5
  feature_scaling_points: 5

fitting:
  fitbasis: CCBAR_ASYMM # EVOL (7), EVOLQED (8), etc.
  savepseudodata: true
  basis:
  - {fl: sng, trainable: false, smallx: [1.058, 1.155]}
  - {fl: g, trainable: false, smallx: [0.9017, 1.084]}
  - {fl: v, trainable: false, smallx: [0.481, 0.6499]}
  - {fl: v3, trainable: false, smallx: [0.08225, 0.502]}
  - {fl: v8, trainable: false, smallx: [0.5823, 0.7928]}
  - {fl: t3, trainable: false, smallx: [-0.3987, 0.9689]}
  - {fl: t8, trainable: false, smallx: [0.6077, 0.9459]}
  - {fl: t15, trainable: false, smallx: [1.023, 1.147]}
  - {fl: v15, trainable: false, smallx: [0.5005, 0.7189]}

################################################################################
positivity:
  posdatasets:
  # Positivity of MSbar PDFs
  - {dataset: NNPDF_POS_100GEV_XUQ, maxlambda: 1e6}
  - {dataset: NNPDF_POS_100GEV_XUB, maxlambda: 1e6}
  - {dataset: NNPDF_POS_100GEV_XDQ, maxlambda: 1e6}
  - {dataset: NNPDF_POS_100GEV_XDB, maxlambda: 1e6}
  - {dataset: NNPDF_POS_100GEV_XSQ, maxlambda: 1e6}
  - {dataset: NNPDF_POS_100GEV_XSB, maxlambda: 1e6}
  - {dataset: NNPDF_POS_100GEV_XCQ, maxlambda: 1e6}
  - {dataset: NNPDF_POS_100GEV_XCB, maxlambda: 1e6}
  - {dataset: NNPDF_POS_100GEV_XGL, maxlambda: 1e6}

integrability:
  integdatasets:
  - {dataset: NNPDF_INTEG_3GEV_XT8, maxlambda: 1e2}
  - {dataset: NNPDF_INTEG_3GEV_XT3, maxlambda: 1e2}

################################################################################
debug: false
maxcores: 16

validphys2/src/validphys/checks.py

Lines changed: 12 additions & 0 deletions
@@ -361,3 +361,15 @@ def check_darwin_single_process(NPROC):
    """
    if platform.system() == "Darwin" and NPROC != 1:
        raise CheckError("NPROC must be set to 1 on OSX, because multithreading is not supported.")


@make_argcheck
def check_pc_parameters(pc_parameters):
    """Check that the parameters for the PC method are set correctly"""
    # Iterate over items so the error message can name the offending PC type
    for pc_type, par in pc_parameters.items():
        # Check that the length of shifts is the same as the length of nodes
        if len(par['yshift']) != len(par['nodes']):
            raise ValueError(
                f"The length of yshift does not match that of nodes in '{pc_type}'. "
                f"Check the runcard. Got {len(par['yshift'])} != {len(par['nodes'])}"
            )
