Finding data in the Bookkeeping
Understanding how data flows through the various Gaudi applications is crucial for knowing where to look for your data.
Data are catalogued in ‘the bookkeeping’, and are initially sorted in broad groups such as ‘real data for physics analysis’, ‘simulated data’, and ‘data for validation studies’. After this, a tree of various application and processing versions will eventually lead to the data you need.
So, before we can run our first DaVinci job we need to locate some events. In this tutorial we will use the decay \( B_{s}^0 \to D_{s}^-\pi^+ \) as an example, where the \( D_{s}^- \) decays to \( K^+ K^- \pi^- \).
Learning Objectives
- Find MC in the bookkeeping
- Find data in the bookkeeping
- Find the decay you want
Navigate to the bookkeeping under Data / Bookkeeping Browser, which lets you find both simulated and real data.
At the bottom of the "Bookkeeping tree" tab there is a drop-down menu
labelled "Simulation Condition". Open it and change it to "Event type".
This changes the way the bookkeeping tree is sorted, making it easier for us to locate files by event.
We will analyse 2024 data, and correspondingly use simulation for 2024
data. To navigate to the simulation, expand the folder icon in the
"Bookkeeping tree" window. Navigate to the MC/2024 folder. This will
give you a very long list of all possible decay types for which there
is simulated data. We are looking for a folder which is named:
13264021 (Bs_Dspi,KKpi=DecProdCut)
This sample of simulated events will only contain events where a \( B_s^{0} \to (D_{s}^- \to K^+ K^- \pi^-)\pi^+ \) decay was generated within the LHCb acceptance, although the decay might not have been fully reconstructed. (Not all simulated samples place the same requirements on the signal decay.)
If you expand the 13264021 (Bs_Dspi,KKpi=DecProdCut) folder you
will find several subfolders to choose from. The names of these
subfolders correspond to different data-taking conditions, such as magnet
polarity (MagDown and MagUp), the simulated pile-up (nu), and the specific
time period we are simulating (e.g. 2024.W37.39), as well as the different
software versions used to create the samples that are available. We will use
Beam6800GeV-2024.W37.39-MagDown-Nu6.3-25ns-Pythia8.
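The structure of such a subfolder name can be made explicit with a small sketch. It assumes the hyphen-separated form that the bookkeeping uses in its paths, and the field names (`beam`, `period`, `polarity`, `nu`, `spacing`, `generator`) are labels chosen here for illustration, not official bookkeeping terminology.

```python
from dataclasses import dataclass


@dataclass
class SimCondition:
    beam: str       # beam energy label, e.g. "Beam6800GeV"
    period: str     # simulated time period, e.g. "2024.W37.39"
    polarity: str   # magnet polarity, "MagDown" or "MagUp"
    nu: float       # simulated pile-up
    spacing: str    # bunch spacing label, e.g. "25ns"
    generator: str  # event generator label, e.g. "Pythia8"


def parse_sim_condition(name: str) -> SimCondition:
    """Split a hyphen-separated condition name into six fields (assumed layout)."""
    parts = name.split("-")
    if len(parts) != 6:
        raise ValueError(f"expected 6 hyphen-separated fields, got {len(parts)}: {name!r}")
    beam, period, polarity, nu, spacing, generator = parts
    if not nu.startswith("Nu"):
        raise ValueError(f"expected a pile-up field like 'Nu6.3', got {nu!r}")
    return SimCondition(beam, period, polarity, float(nu[2:]), spacing, generator)


cond = parse_sim_condition("Beam6800GeV-2024.W37.39-MagDown-Nu6.3-25ns-Pythia8")
print(cond.polarity, cond.period, cond.nu)
```

Names that do not follow this six-field layout raise a `ValueError`, so the sketch fails loudly instead of guessing.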
So much choice!
Often there are only one or two combinations of data-taking conditions and software versions to choose from, but sometimes (including in this case!) there can be very many. Generally, newer versions are the best bet, but you should always ask the Monte Carlo liaison of your working group for advice on what to use if you're not sure.
Next we need to choose which version of the simulation to use.
There is only one available in our case, Sim10d.
Usually the latest available version is the best choice when there is more than one.
In general, this folder may contain several subdirectories called 'AnaProd', referring to the output of analysis productions, alongside the actual data, stored in 'HLT2.DST'.
Flagged and filtered samples
In the usual data-taking flow, the trigger and sprucing are run in filtering mode, whereby events that don't pass any trigger line or any sprucing line are thrown away. In the simulation, it's often useful to keep such events so that the properties of the rejected events can be studied. The trigger and sprucing are then run in flagging mode, such that the decisions are only recorded for later inspection. Filtered Monte Carlo can be produced for analyses that need lots of events. In this tutorial, we will be using flagged MC.
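The difference between the two modes can be illustrated with a toy sketch. The events, the line names `LineA` and `LineB`, and the helper functions are all made up for illustration and have nothing to do with the real LHCb software.

```python
# Toy events: each one records a pass/fail decision per (hypothetical) selection line.
events = [
    {"id": 1, "decisions": {"LineA": True, "LineB": False}},
    {"id": 2, "decisions": {"LineA": False, "LineB": False}},
    {"id": 3, "decisions": {"LineA": False, "LineB": True}},
]


def passes_any(event):
    """True if at least one line accepted the event."""
    return any(event["decisions"].values())


def run_filtering(events):
    """Filtering mode: events failing every line are thrown away."""
    return [e for e in events if passes_any(e)]


def run_flagging(events):
    """Flagging mode: every event is kept and the overall decision is only recorded."""
    return [{**e, "accepted": passes_any(e)} for e in events]


filtered = run_filtering(events)
flagged = run_flagging(events)
print([e["id"] for e in filtered])
print([(e["id"], e["accepted"]) for e in flagged])
```

In the flagged output, event 2 is still present with `accepted` set to `False`, which is what makes it possible to study the rejected events later.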
By clicking on the HLT2.DST entry we finally see a list of files that we can process. At the bottom right of the page there is a “Save” button which will let us download a file specifying the inputs that we'll use for running our DaVinci job. Click it and select “Save as a python file”. Clicking “Save” once again in the pop-up menu will start the download. Save this file somewhere you can find it again.
A copy of the file we just downloaded is available here:
/eos/lhcb/wg/dpa/wp7/Run3SK/exampleDST/MC_2024_13264021_Beam6800GeV2024.W37.39MagDownNu6.325nsPythia8_Sim10d_HLT22024.W35.39_HLT2.DST.py
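To get a quick look at which inputs such a file specifies without running any LHCb software, one option is to pull the file names out with a regular expression. The sketch below assumes the file lists its inputs as quoted strings beginning with `LFN:`. The sample text and the file names in it are placeholders, and the real file may be laid out differently.

```python
import re

# Placeholder stand-in for the downloaded file's content (assumed layout, made-up names).
sample = """
from GaudiConf import IOHelper
IOHelper('ROOT').inputFiles([
    'LFN:/lhcb/MC/2024/HLT2.DST/00000000/0000/00000000_00000001_1.hlt2.dst',
    'LFN:/lhcb/MC/2024/HLT2.DST/00000000/0000/00000000_00000002_1.hlt2.dst',
], clear=True)
"""


def extract_lfns(text):
    """Return every quoted string in text that starts with 'LFN:'."""
    return re.findall(r"""['"](LFN:[^'"]+)['"]""", text)


lfns = extract_lfns(sample)
print(len(lfns), "input files")
for lfn in lfns:
    print(lfn)
```

For a real file, replace `sample` with the text read from the downloaded `.py` file, for example via `open(path).read()`.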
Shortcut
Once you get a bit of experience with navigating the bookkeeping you
can take a shortcut! At the bottom of your browser window there is a
text field next to a green "plus" symbol. You can enter a
path here to navigate there directly. For example you could go
straight to:
evt+std://MC/2024/13264021/Beam6800GeV-2024.W35.37-MagUp-Nu6.3-25ns-Pythia8/Sim10d/HLT2-2024.W35.39
by typing this path and pressing the "Go" button.
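Such a path simply follows the levels of the bookkeeping tree, so it can be assembled from its pieces. The sketch below is one way to do that. The function and argument names are hypothetical, and the `evt+std://` prefix is copied from the example path rather than taken from any documented API.

```python
def bookkeeping_path(config, year, event_type, condition, sim_version, processing):
    """Join the tree levels into a path for the bookkeeping browser's text field."""
    levels = [config, str(year), str(event_type), condition, sim_version, processing]
    return "evt+std://" + "/".join(levels)


def split_bookkeeping_path(path):
    """Inverse operation: return the prefix and the list of tree levels."""
    prefix, _, rest = path.partition("://")
    return prefix, rest.split("/")


path = bookkeeping_path(
    "MC",
    2024,
    13264021,
    "Beam6800GeV-2024.W35.37-MagUp-Nu6.3-25ns-Pythia8",
    "Sim10d",
    "HLT2-2024.W35.39",
)
print(path)
print(split_bookkeeping_path(path))
```

Changing a single argument, such as the event type, then gives the path for a different sample, provided that combination exists in the bookkeeping.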
Find your own decay!
Think of a decay and try to find a Monte Carlo sample for it. You could use the decay that your analysis is about, or if you don't have any ideas you could look for the semileptonic decay \( B^{0} \to D^{+}\mu^{-}\bar{\nu}_{\mu} \), where the \( D^{+} \) decays to \( K^{-}\pi^{+}\pi^{+} \).
If you would like to find out more about how the event types define the signal decay, you can look at the documentation for the DecFiles package.