Interactively exploring files
Learning Objectives
- Understand how RawEvents and RawBanks save event information
- Use GaudiPython to interactively explore the contents of a file
- Be able to traverse candidate decay chains (just like DaVinci does) manually
Within files, events are stored as RawEvents. Each RawEvent is made up of RawBanks. The RawEvent is a single DataObject, appearing at /Event/<stream>/RawEvent, that needs to be unpacked into its (potentially many) different RawBanks. This can be done using the do_unpacking function in GaudiConf.reading.
Relevant RawBanks for this tutorial are:
- DstData - where your physics objects are stored, such as your signal candidates and any reconstruction objects you persist
- HltDecReports - where the HLT{1,2} (and Sprucing) trigger decisions are stored
There are also detector RawBanks, such as Muon and Rich, that can be persisted if required.
Using GaudiPython
To explore the contents of a DST file we can use GaudiPython. This is a Python interface to Gaudi, the LHCb event processing framework used by the Moore and DaVinci applications.
When running over an event, Gaudi needs to place the event data into a Transient Event Store (TES). Elements in the store are accessed by a string that looks like a simple directory path, and each access returns a DataObject.
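The path-based lookup can be pictured with a toy model. This is only a sketch of the idea in plain Python, not Gaudi's implementation: the class name ToyEventStore, its methods, and the choice to return None for a missing location are all assumptions made for illustration.

```python
class ToyEventStore:
    """Toy stand-in for a transient event store: path string -> object."""

    def __init__(self):
        self._objects = {}

    def register(self, path, obj):
        # Store an object under a directory-like path string.
        self._objects[path] = obj

    def __getitem__(self, path):
        # Assumption of this toy: a missing location yields None
        # instead of raising an exception.
        return self._objects.get(path)

    def dump(self):
        # List every registered location in sorted order.
        return sorted(self._objects)


store = ToyEventStore()
store.register("/Event/Rec/Summary", {"nPVs": 1})
store.register("/Event/Hlt2/DecReports", {"SomeLineDecision": True})

summary = store["/Event/Rec/Summary"]
missing = store["/Event/Does/Not/Exist"]
locations = store.dump()
```

The real store is refilled for every event, which is why it is called transient; the toy above only mirrors the lookup-by-path behaviour.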
Exploring the file contents
Let's look at a \(B_s^0 \to D_s^- \pi^+\) MC file that has been processed by HLT2. The file has been copied to the DPA EOS space at /eos/lhcb/wg/dpa/wp7/Run3SK/exampleDST/00257716_00000011_1.hlt2.dst.
Copy the exploring script locally [1], which invokes the do_unpacking function, and call it explore.py. Run the command:
lb-run Moore/v56r2 python -i explore.py --input_file /eos/lhcb/wg/dpa/wp7/Run3SK/exampleDST/00257716_00000011_1.hlt2.dst --input_process Hlt2 --simulation True
You should now see a long list of locations that look like paths.
Exploring HltDecReports decision information
The HltDecReports RawBanks have been unpacked and decoded and can be accessed at the TES locations /Event/<source>/DecReports, where <source> is Hlt1, Hlt2 or Spruce.
Enter evt['/Event/Hlt2/DecReports'] to see all the HLT2 lines that were run on this data and their corresponding "decision", a boolean value indicating whether this event fired each line.
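Reading the full printout by eye gets tedious when many lines were run. The sketch below filters the report down to the lines that fired. It assumes the DecReports object offers decisionNames() and decReport(name).decision(); check these against your session before relying on them. FakeReport and FakeReports are hypothetical stand-ins so the block runs outside an LHCb environment.

```python
def fired_lines(reports):
    """Return the sorted names of all lines with a positive decision."""
    return sorted(
        str(name)
        for name in reports.decisionNames()
        if reports.decReport(name).decision()
    )


# Hypothetical stand-ins, only here to exercise the function.
class FakeReport:
    def __init__(self, passed):
        self._passed = passed

    def decision(self):
        return self._passed


class FakeReports:
    def __init__(self, decisions):
        self._decisions = decisions

    def decisionNames(self):
        return list(self._decisions)

    def decReport(self, name):
        return FakeReport(self._decisions[name])


fake = FakeReports({"Hlt2LineADecision": True, "Hlt2LineBDecision": False})
passed = fired_lines(fake)
```

In the interactive session the equivalent call would be fired_lines(evt['/Event/Hlt2/DecReports']), provided the assumed methods are available there.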
Advancing to a positive decision
All the events in this file passed at least one HLT2 trigger line. Although we have \(B_s^0 \to D_s^- \pi^+\) MC events here, not all events will actually pass the Hlt2B2OC_BdToDsmPi_DsmToKpKmPim HLT2 line due to inefficiencies. Use the helper function advance_decision to move to the first event that passes the Hlt2B2OC_BdToDsmPi_DsmToKpKmPim line:
advance_decision("Hlt2B2OC_BdToDsmPi_DsmToKpKmPim")
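To make clear what a helper of this kind does, here is a minimal sketch of an "advance until the line fires" loop. It is not the code of advance_decision in the script. The function name advance_until, its parameters, and the use of hasDecisionName() and decReport(name).decision() on the reports object are assumptions. Stepping the event loop and reading the TES are passed in as callables so the logic can be tried with the hypothetical stand-ins below.

```python
def advance_until(run_one_event, get_reports, decision_name, max_events=1000):
    """Step through events until `decision_name` fires.

    Returns the number of events advanced, or None if the input ran out
    or the line did not fire within `max_events`.
    """
    for n_advanced in range(1, max_events + 1):
        run_one_event()
        reports = get_reports()
        if not reports:
            # Assumption: no reports object means the input is exhausted.
            return None
        if (reports.hasDecisionName(decision_name)
                and reports.decReport(decision_name).decision()):
            return n_advanced
    return None


# Hypothetical stand-ins, only here to exercise the loop.
class FakeReport:
    def __init__(self, passed):
        self._passed = passed

    def decision(self):
        return self._passed


class FakeReports:
    def __init__(self, decisions):
        self._decisions = decisions

    def hasDecisionName(self, name):
        return name in self._decisions

    def decReport(self, name):
        return FakeReport(self._decisions[name])


def make_stream(events):
    """Build the two callables for a fixed list of fake events."""
    state = {"index": -1}

    def run_one_event():
        state["index"] += 1

    def get_reports():
        i = state["index"]
        return events[i] if i < len(events) else None

    return run_one_event, get_reports


events = [
    FakeReports({"LineDecision": False}),  # line ran but did not fire
    FakeReports({}),                       # line not present at all
    FakeReports({"LineDecision": True}),   # line fired
]
n_steps = advance_until(*make_stream(events), "LineDecision")
never = advance_until(*make_stream(events[:2]), "LineDecision")
```

The max_events cap is a design choice of this sketch: it keeps a mistyped line name from looping over the whole file.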
Exploring DstData candidate information
The DstData RawBank has been unpacked, and you should see all the accessible TES locations printed as a result of the Python script running evt.dump(). Events always persist Primary Vertices (PVs) at the TES location /Event/Rec/Vertex/Primary and the reconstruction summary at the TES location /Event/Rec/Summary.
In the event dump you will see the TES location /Event/HLT2/Hlt2B2OC_BdToDsmPi_DsmToKpKmPim/Particles. These are the candidates that pass the HLT2 line selecting a \(B_s^0\) meson decaying to a \(D_s^-\) meson and a \(\pi^+\). This channel is used for one of the LHCb flagship measurements of \(B_s^0 - \bar{B}_s^0\) mixing. The decay descriptor used for this line is
\[ B_s^0 \to D_s^- \pi^+ \]
where
\[ D_s^- \to K^- K^+ \pi^- \]
The decay chain has a mother particle \(B_s^0\), an intermediate \(D_s^-\) and 4 final state particles: the associated \(\pi^+\) and the 3 \(D_s^-\) daughters \(K^-\), \(K^+\) and \(\pi^-\). Note that the charge conjugate of this decay is also included.
We can take a closer look at these candidates. By running
evt['/Event/HLT2/Hlt2B2OC_BdToDsmPi_DsmToKpKmPim/Particles'].size()
we can see that there is one decay candidate for \(B_s^0\to D_s^- \pi^+\) in this event. An event can have multiple candidates if more than one combination of tracks in the event passes all the selections of the HLT2 line.
We can access this candidate using indexing. By running
evt['/Event/HLT2/Hlt2B2OC_BdToDsmPi_DsmToKpKmPim/Particles'][0]
we can see properties of the candidate \(B_s^0\) particle such as measuredMass, momentum, etc. We can also see that this particle has the particleID number -511, assigned following the Monte Carlo particle numbering scheme. Note that 511 is the code for a \(B^0\) rather than a \(B_s^0\) (which would be 531), consistent with the Bd in the line name. Remember that this does not mean there is something intrinsic about this particle that tells us what it is; it has this particleID based on the decay descriptor. The particleID numbers are defined by the PDG.
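When reading particleID numbers in the session, a small lookup table helps. The sketch below covers only the particles in this decay chain; the codes are from the Monte Carlo particle numbering scheme, where a negative sign denotes the antiparticle. The names PDG_NAMES, ANTI_NAMES and pdg_name are made up for this example.

```python
# Particle codes for the species that appear in this tutorial.
PDG_NAMES = {
    211: "pi+",
    321: "K+",
    431: "D_s+",
    511: "B0",
    531: "B_s0",
}

# Names of the corresponding antiparticles (negative codes).
ANTI_NAMES = {
    "pi+": "pi-",
    "K+": "K-",
    "D_s+": "D_s-",
    "B0": "anti-B0",
    "B_s0": "anti-B_s0",
}


def pdg_name(pid):
    """Translate a signed particle code into a readable name."""
    name = PDG_NAMES.get(abs(pid))
    if name is None:
        return f"unknown({pid})"
    return name if pid > 0 else ANTI_NAMES[name]


examples = [pdg_name(code) for code in (-511, 531, -431, 321, -211)]
```

For anything beyond a handful of particles, the full PDG tables are the better reference.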
So far we have only looked at the top level \(B_s^0\) candidate, but we can explore the whole decay tree using this script. To see the decay products of our signal candidate we can use daughtersVector(). Try running
evt['/Event/HLT2/Hlt2B2OC_BdToDsmPi_DsmToKpKmPim/Particles'][0].daughtersVector().size()
We see that there are 2 daughter particles of the \(B_s^0\), as expected: the intermediate \(D_s^-\) is at index 0 and the final state \(\pi^+\) at index 1.
Traversing the decay tree
Confirm that the daughters of the \(D_s^-\) with indices 0, 1 and 2 are assigned to be \(K^-\), \(K^+\) and \(\pi^-\) respectively, by using chained daughtersVector() calls and the Monte Carlo particle numbering scheme.
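Chaining daughtersVector() by hand works for one level, and a short recursive helper generalises it to the whole tree. The sketch below relies on daughtersVector() from the text and on particleID().pid() to read the numeric code; the latter is an assumption to verify in your session. FakeID and FakeParticle are hypothetical stand-ins, and the codes in the toy tree simply follow the decay descriptor quoted earlier, not the output from the real file.

```python
def decay_tree(particle, depth=0):
    """Return (depth, pid) pairs for a particle and all its descendants."""
    rows = [(depth, particle.particleID().pid())]
    for daughter in particle.daughtersVector():
        rows.extend(decay_tree(daughter, depth + 1))
    return rows


# Hypothetical stand-ins, only here to exercise the function.
class FakeID:
    def __init__(self, pid):
        self._pid = pid

    def pid(self):
        return self._pid


class FakeParticle:
    def __init__(self, pid, daughters=()):
        self._id = FakeID(pid)
        self._daughters = list(daughters)

    def particleID(self):
        return self._id

    def daughtersVector(self):
        return self._daughters


# Toy B_s0 -> (D_s- -> K- K+ pi-) pi+ tree.
ds = FakeParticle(-431, [FakeParticle(-321), FakeParticle(321), FakeParticle(-211)])
bs = FakeParticle(531, [ds, FakeParticle(211)])

tree = decay_tree(bs)
for depth, pid in tree:
    print("  " * depth + str(pid))
```

In the session, the same function could be called on evt['/Event/HLT2/Hlt2B2OC_BdToDsmPi_DsmToKpKmPim/Particles'][0].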
Congratulations! You can now interactively explore LHCb data files. This is a great debugging tool for your future data analyses!
[1] This can be done with the command:
wget -O explore.py 'https://gitlab.cern.ch/lhcb/Moore/-/raw/master/Hlt/Moore/tests/options/starterkit/first-analysis-steps/interactive-dst.py?inline=false'