
Simple counting experiment with bins over 10k can have float underflow in pdf #1075

@nsmith-


For simple counting datacards that do not use shapes FAKE, the resulting PDF is not evaluated in log space. With large counts and/or mild tension between data and simulation, this causes float underflow and subsequent fit failures.

For example, this simplified version of a real case:

Combination of simplecard.txt
imax 2 number of bins
jmax 3 number of processes minus 1
kmax 2 number of nuisance parameters
----------------------------------------------------------------------------------------------------------------------------------
bin          ch1_chC  ch1_chD
observation  59144    57148  
----------------------------------------------------------------------------------------------------------------------------------
bin                             ch1_chC       ch1_chC       ch1_chC       ch1_chC       ch1_chD       ch1_chD       ch1_chD       ch1_chD     
process                         pih           pi0           bkg           bkg2          pih           pi0           bkg           bkg2        
process                         -1            0             1             2             -1            0             1             2           
rate                            323           124           43015.6       16128.4       163           78            51447.2       5700.77     
----------------------------------------------------------------------------------------------------------------------------------
lumi                    lnN     1.025         1.025         -             -             1.025         1.025         -             -           
bkg_uncertainty         lnN     -             -             1.2           -             -             -             1.2           -

This card fails at r=20:

$ combine -M MultiDimFit --setParameters r=20 --freezeParameters r simplecard.txt --saveWorkspace
 <<< Combine >>> 
 <<< v10.1.0 >>>
>>> Random number generator seed is 123456
>>> Method used is MultiDimFit
Set Default Value of Parameter r To : 20
Doing initial fit: 

 

 ---------------------------

 WARNING: MultiDimFit failed

 ---------------------------

 

 --- MultiDimFit ---
best fit parameter values: 
   r :   +20.000
Done in 0.00 min (cpu), 0.00 min (real)

The failure is due to float underflow, which is readily apparent when printing the RooWorkspace saved in higgsCombineTest.MultiDimFit.mH120.root:

p.d.f.s
-------
SimpleGaussianConstraint::bkg_uncertainty_Pdf[ x=bkg_uncertainty mean=bkg_uncertainty_In sigma=1 ] = 1
SimpleGaussianConstraint::lumi_Pdf[ x=lumi mean=lumi_In sigma=1 ] = 1
RooProdPdf::modelObs_b[ pdf_binch1_chC_bonly * pdf_binch1_chD_bonly ] = 2.73756e-06
RooProdPdf::modelObs_s[ pdf_binch1_chC * pdf_binch1_chD ] = 0
RooProdPdf::model_b[ modelObs_b * nuisancePdf ] = 2.73756e-06
RooProdPdf::model_s[ modelObs_s * nuisancePdf ] = 0
RooProdPdf::nuisancePdf[ lumi_Pdf * bkg_uncertainty_Pdf ] = 1
RooPoisson::pdf_binch1_chC[ x=n_obs_binch1_chC mean=n_exp_binch1_chC ] = 2.25661e-270
RooPoisson::pdf_binch1_chC_bonly[ x=n_obs_binch1_chC mean=n_exp_binch1_chC_bonly ] = 0.00164042
RooPoisson::pdf_binch1_chD[ x=n_obs_binch1_chD mean=n_exp_binch1_chD ] = 4.12988e-87

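Note that RooProdPdf::modelObs_s[ pdf_binch1_chC * pdf_binch1_chD ] evaluates to exactly zero: the product 2.25661e-270 * 4.12988e-87 is about 9.3e-357, which is below the smallest positive double (about 4.9e-324).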
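The underflow can be illustrated outside of combine with a few lines of plain Python. This is only a sketch under stated assumptions: it takes both pih and pi0 (process IDs -1 and 0) as signal scaled by r, holds the lnN nuisances at their nominal values, and uses a hypothetical helper log_poisson rather than anything from RooFit. Under those assumptions the per-channel values should come out comparable to the RooPoisson values printed above, the linear-space product collapses to zero, and the log-space sum stays finite.

```python
import math

def log_poisson(n, mu):
    """Natural log of the Poisson probability P(n | mu), using lgamma for log(n!)."""
    return n * math.log(mu) - mu - math.lgamma(n + 1)

# Yields copied from the datacard above.
# Assumptions: pih and pi0 are both signal (scaled by r); nuisances at nominal.
r = 20.0
channels = {
    "ch1_chC": {"obs": 59144, "sig": 323 + 124, "bkg": 43015.6 + 16128.4},
    "ch1_chD": {"obs": 57148, "sig": 163 + 78, "bkg": 51447.2 + 5700.77},
}

# Per-channel log-probabilities for the signal+background hypothesis.
log_terms = {
    name: log_poisson(c["obs"], r * c["sig"] + c["bkg"])
    for name, c in channels.items()
}

# Linear space: multiply the probabilities directly, as a plain product would.
linear_product = 1.0
for lp in log_terms.values():
    linear_product *= math.exp(lp)

# Log space: add the log-probabilities instead.
log_sum = sum(log_terms.values())

for name, lp in log_terms.items():
    print(f"{name}: pdf = {math.exp(lp):.3e}, log(pdf) = {lp:.2f}")
print("product in linear space:", linear_product)
print("sum in log space:", round(log_sum, 2))
```

Each individual factor is still representable as a double; only their product falls out of range, which is why working with the sum of logs avoids the problem.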
