SLAC Workflow
In the past, our analysis was done through the kickstart repository ldmx-analysis. Frequent software changes made this difficult, so our group developed a Python-based analysis framework that is more resilient to periods of rapid development. This part of the tutorial will walk you through analyzing a collection of ROOT files using pyEcalVeto.
NOTE: There are plans to refactor pyEcalVeto in the interest of being more user-friendly and enhancing readability. This page will be updated once this is done.
- The ROOT files for this tutorial are provided under inputs in the TutorialFiles folder. Start by navigating to pyEcalVeto and examining treeMaker.py. This script processes each event and calculates a slew of kinematic variables, some of which we feed to a machine learning model called a boosted decision tree (BDT). For now, we'll use this script to analyze the tutorial files. Let's also make a directory to hold whatever gets output later.
cd /nfs/slac/g/ldmx/users/<USER>/ldmx-sw-v3.0.0/LDMX-scripts/pyEcalVeto
ldmx python3 treeMaker.py --help
mkdir outputs
The second command should have brought up some useful information on how to use the script. A quick rundown: Tell the script to run in batch mode with the --batch flag, specify your inputs either as a list of files with the -i flag or as a list of directories with the --indirs flag, label each group of files with the -g flag, specify your outputs with the -o flag, and tell the script how many events to process for each file group with the -m flag.
- Now we'll have the analysis script run over each file and output the results from the first 500 events of each file. We'll also label each output file by its process. The following command does all of this.
ldmx python3 treeMaker.py -i $PWD/../TutorialFiles/inputs/0.001_input.root $PWD/../TutorialFiles/inputs/0.01_input.root $PWD/../TutorialFiles/inputs/0.1_input.root $PWD/../TutorialFiles/inputs/1.0_input.root -g 0.001 0.01 0.1 1.0 -o $PWD/outputs $PWD/outputs $PWD/outputs $PWD/outputs -m 500
Once the script finishes processing the files, you can go ahead and delete the newly created scratch directory. Sometimes it isn't able to do this on its own, but this will hopefully be fixed in the future.
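The long command above pairs each input file with a group label and an output directory, in order, which is easy to get wrong by hand. As a rough sketch (not part of pyEcalVeto), the same command line can be assembled in Python. The function name and its input_dir and output_dir parameters are made up for illustration; only the -i, -g, -o, and -m flags and the file naming come from the tutorial. The sketch prints the command instead of running it, on the assumption that ldmx may be a shell function or alias that Python cannot call directly.

```python
# Mass-point labels of the tutorial files, as used in the command above.
MASS_POINTS = ["0.001", "0.01", "0.1", "1.0"]


def build_treemaker_command(mass_points, input_dir, output_dir, max_events=None):
    """Return the treeMaker.py command line as a list of arguments.

    input_dir and output_dir are hypothetical parameters for this sketch.
    """
    command = ["ldmx", "python3", "treeMaker.py"]
    # One input file per mass point, named <mass>_input.root as in the tutorial.
    command += ["-i"] + [f"{input_dir}/{m}_input.root" for m in mass_points]
    # One group label per input file, in the same order.
    command += ["-g"] + list(mass_points)
    # One output directory per group (the tutorial uses the same one each time).
    command += ["-o"] + [output_dir] * len(mass_points)
    # Optional cap on the number of events processed per group.
    if max_events is not None:
        command += ["-m", str(max_events)]
    return command


if __name__ == "__main__":
    cmd = build_treemaker_command(
        MASS_POINTS, "$PWD/../TutorialFiles/inputs", "$PWD/outputs", max_events=500
    )
    # $PWD is left literal so the shell expands it when the line is pasted.
    print(" ".join(cmd))
```

Pasting the printed line into the terminal should be equivalent to typing the command by hand. Adding or removing a mass point only requires editing the MASS_POINTS list.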
- Navigate to the output directory and open up the 0.001 GeV signal file in ROOT. Let's examine the number of reconstructed hits read out from the ECal.
cd outputs
root 0.001_unsorted.root
new TBrowser()
Browse through the file and select the nReadoutHits leaf under the EcalVeto branch. It should be the first leaf. If all goes as expected, you should see the following histogram.

- Oftentimes you'll need to run over a large number of files. This is where batch submission comes into play. Start by setting LSB_JOB_REPORT_MAIL=Y to receive email updates about your jobs' progress. You'll need to set this variable every time you open up a new terminal if you want to receive updates.
- Navigate to your workspace and submit some batch jobs. This is done through the bsub command. You can set which queue to submit a job to with the -q flag (available options are short, medium, and long), specify how long a job is expected to run in minutes with the -W flag, and set how many cores you want to use with the -n flag. Let's have treeMaker.py run over each file as before and process all of the events this time. We'll submit the jobs to the short queue, running each on a single core with an expected run time of 5 minutes.
cd /nfs/slac/g/ldmx/users/<USER>/ldmx-sw-v3.0.0
bsub -q short -W 5 -n 1 -R "select[centos7] span[hosts=1]" singularity run --home $PWD $PWD/ldmx_dev_latest.sif . python3 $PWD/LDMX-scripts/pyEcalVeto/treeMaker.py --batch -i $PWD/LDMX-scripts/TutorialFiles/inputs/0.001_input.root -g 0.001 -o $PWD/LDMX-scripts/pyEcalVeto/outputs
bsub -q short -W 5 -n 1 -R "select[centos7] span[hosts=1]" singularity run --home $PWD $PWD/ldmx_dev_latest.sif . python3 $PWD/LDMX-scripts/pyEcalVeto/treeMaker.py --batch -i $PWD/LDMX-scripts/TutorialFiles/inputs/0.01_input.root -g 0.01 -o $PWD/LDMX-scripts/pyEcalVeto/outputs
bsub -q short -W 5 -n 1 -R "select[centos7] span[hosts=1]" singularity run --home $PWD $PWD/ldmx_dev_latest.sif . python3 $PWD/LDMX-scripts/pyEcalVeto/treeMaker.py --batch -i $PWD/LDMX-scripts/TutorialFiles/inputs/0.1_input.root -g 0.1 -o $PWD/LDMX-scripts/pyEcalVeto/outputs
bsub -q short -W 5 -n 1 -R "select[centos7] span[hosts=1]" singularity run --home $PWD $PWD/ldmx_dev_latest.sif . python3 $PWD/LDMX-scripts/pyEcalVeto/treeMaker.py --batch -i $PWD/LDMX-scripts/TutorialFiles/inputs/1.0_input.root -g 1.0 -o $PWD/LDMX-scripts/pyEcalVeto/outputs
It's crucial that you run bsub from the directory where your singularity image file (.sif) is located. If your image has a different name than the one shown here, make sure to point the command to the correct file. Note the judicious application of absolute file paths. This is good practice when working from inside the container, as it can be finicky about the locations of files sometimes.
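The four bsub commands above differ only in the mass point, so they can be generated in a loop. The following is a minimal sketch under stated assumptions, not an official submission script. The helper names build_bsub_command and submit_all, and the dry_run switch, are invented here; the bsub and singularity arguments are copied from the commands above. It assumes it is run from the directory containing the .sif image, and that passing LSB_JOB_REPORT_MAIL=Y in the environment of bsub has the same effect as setting it in the terminal.

```python
import os
import shlex
import subprocess

MASS_POINTS = ["0.001", "0.01", "0.1", "1.0"]


def build_bsub_command(mass, workdir, image="ldmx_dev_latest.sif",
                       queue="short", minutes=5, cores=1):
    """Return one bsub command (as an argument list) for a single mass point."""
    scripts = f"{workdir}/LDMX-scripts"
    return [
        "bsub", "-q", queue, "-W", str(minutes), "-n", str(cores),
        # The resource string is a single argument, so it needs no extra quoting here.
        "-R", "select[centos7] span[hosts=1]",
        "singularity", "run", "--home", workdir, f"{workdir}/{image}", ".",
        "python3", f"{scripts}/pyEcalVeto/treeMaker.py", "--batch",
        "-i", f"{scripts}/TutorialFiles/inputs/{mass}_input.root",
        "-g", mass,
        "-o", f"{scripts}/pyEcalVeto/outputs",
    ]


def submit_all(mass_points, workdir, dry_run=True):
    """Print (dry_run=True) or submit (dry_run=False) one job per mass point."""
    # Assumption: bsub reads LSB_JOB_REPORT_MAIL from its environment.
    env = dict(os.environ, LSB_JOB_REPORT_MAIL="Y")
    commands = [build_bsub_command(m, workdir) for m in mass_points]
    for command in commands:
        if dry_run:
            # shlex.join adds the shell quoting needed to paste the line as-is.
            print(shlex.join(command))
        else:
            subprocess.run(command, env=env, check=True)
    return commands


if __name__ == "__main__":
    # Run from the directory that holds the .sif image; absolute paths follow from it.
    submit_all(MASS_POINTS, os.getcwd())
```

By default the script only prints the commands so they can be checked against the ones above. Calling submit_all with dry_run=False would actually submit the jobs, which only works on a machine where bsub is available.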
- Navigate to the output directory and open up the 1.0 GeV signal file in ROOT. Let's examine the transverse RMS deviation of ECal hits.
cd LDMX-scripts/pyEcalVeto/outputs
root 1.0_unsorted.root
new TBrowser()
Browse through the file and select the showerRMS leaf under the EcalVeto branch. It should be the fourth leaf down from nReadoutHits. If all goes as expected, you should see the following histogram.