Finding data in the Bookkeeping

Knowing how data flows through the various Gaudi applications is crucial for knowing where to look for your data.

Data are catalogued in ‘the bookkeeping’, and are initially sorted in broad groups such as ‘real data for physics analysis’, ‘simulated data’, and ‘data for validation studies’. After this, a tree of various application and processing versions will eventually lead to the data you need.

So, before we can run our first DaVinci job we need to locate some events. In this tutorial we will use the decay \( B_{s}^0 \to D_{s}^-\pi^+ \) as an example, where the \( D_{s}^- \) decays to \( K^+ K^- \pi^- \).

Learning Objectives

  • Find MC in the bookkeeping
  • Find data in the bookkeeping
  • Find the decay you want

Navigate to the bookkeeping under Data / Bookkeeping Browser, which lets you find both simulated and real data.

At the bottom of the "Bookkeeping tree" tab there is a drop-down menu labelled Simulation Condition. Open it and change it to Event type. This changes the way the bookkeeping tree is sorted, making it easier for us to locate files by event type.

We will analyse 2024 data, so we need simulation that corresponds to 2024 data. To find it, expand the folder icon in the "Bookkeeping tree" window and navigate to the MC/2024 folder. This will give you a very long list of all the decay types for which there is simulated data. We are looking for a folder which is named:

13264021 (Bs_Dspi,KKpi=DecProdCut)
Here, the number is a numerical representation of the event type, and the text in parentheses is its human-readable name.
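If you ever need to handle these folder labels in a script, the two parts can be separated with a short regular expression. The following is only a minimal sketch: the function name `parse_event_type_label` is made up for this illustration and is not part of any LHCb software.

```python
import re


def parse_event_type_label(label: str) -> tuple[int, str]:
    """Split a folder label into (event type number, human-readable name).

    Hypothetical helper for illustration; it assumes the label has the
    form '<digits> (<name>)' as shown in the bookkeeping tree.
    """
    match = re.fullmatch(r"\s*(\d+)\s*\((.+)\)\s*", label)
    if match is None:
        raise ValueError(f"Unrecognised event type label: {label!r}")
    return int(match.group(1)), match.group(2)


number, name = parse_event_type_label("13264021 (Bs_Dspi,KKpi=DecProdCut)")
print(number)  # 13264021
print(name)    # Bs_Dspi,KKpi=DecProdCut
```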

This sample of simulated events will only contain events where a \( B_s^{0} \to (D_{s}^- \to K^+ K^- \pi^-)\pi^+ \) decay was generated within the LHCb acceptance, although the decay might not have been fully reconstructed. (Not all simulated samples have the same requirements made on the signal decay.)

If you expand the 13264021 (Bs_Dspi,KKpi=DecProdCut) folder you will find several subfolders to choose from. The names of these subfolders correspond to different data-taking conditions, such as the magnet polarity (MagDown or MagUp), the simulated pile-up (Nu) and the specific time period being simulated (e.g. 2024.W37.39), as well as the different software versions used to create the available samples. We will use Beam6800GeV-2024.W37.39-MagDown-Nu6.3-25ns-Pythia8.
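To see how such a name breaks down, here is a minimal sketch that splits one into its fields, using the hyphen-separated form in which these names appear in bookkeeping paths. The class and field names are invented for this illustration. Reading the first field as the beam energy, `25ns` as the bunch spacing and `Pythia8` as the generator is an assumption based on the name itself, not something the bookkeeping states.

```python
from typing import NamedTuple


class SimConditions(NamedTuple):
    """Hypothetical container; field meanings are partly assumed (see text)."""
    beam: str        # assumed: beam energy, e.g. 'Beam6800GeV'
    period: str      # time period being simulated, e.g. '2024.W35.37'
    polarity: str    # magnet polarity, 'MagUp' or 'MagDown'
    nu: float        # simulated pile-up
    spacing: str     # assumed: bunch spacing, e.g. '25ns'
    generator: str   # assumed: event generator, e.g. 'Pythia8'


def parse_conditions(name: str) -> SimConditions:
    """Split a hyphen-separated simulation-conditions name into its fields."""
    parts = name.split("-")
    if len(parts) != 6:
        raise ValueError(f"Expected 6 hyphen-separated fields in {name!r}")
    beam, period, polarity, nu, spacing, generator = parts
    if not nu.startswith("Nu"):
        raise ValueError(f"Expected a 'Nu<value>' field, got {nu!r}")
    return SimConditions(beam, period, polarity, float(nu[2:]), spacing, generator)


cond = parse_conditions("Beam6800GeV-2024.W35.37-MagUp-Nu6.3-25ns-Pythia8")
print(cond.polarity, cond.nu, cond.period)
```

A helper like this makes it easy to check, for example, that you have picked up both MagUp and MagDown samples for the same time period.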

So much choice!

Often there are only one or two combinations of data-taking conditions and software versions to choose from, but sometimes (including in this case!) there can be many. Generally newer versions are the best bet, but you should always ask the Monte Carlo liaison of your working group for advice on what to use if you're not sure.

Next we need to choose which version of the simulation to use. There is only one available in our case, Sim10d. When more than one is available, the latest version is usually the best choice. In this folder there may, in general, be several subdirectories called 'AnaProd', referring to the output of analysis productions, alongside the actual data, stored in 'HLT2.DST'.

Flagged and filtered samples

In the usual data-taking flow, the trigger and sprucing are run in filtering mode, whereby events that don't pass any trigger line or any sprucing line are thrown away. In simulation, it's often useful to keep such events so that the properties of the rejected events can be studied. The trigger and sprucing are then run in flagging mode, in which the decisions are only recorded for later inspection. Filtered Monte Carlo can be produced for analyses that need lots of events. In this tutorial, we will be using flagged MC.
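The difference between the two modes can be illustrated with a toy example. This is plain Python and not LHCb software: the events, the `HighPtLine` selection and the function names are all invented for the illustration.

```python
def run_filtering(events, lines):
    """Filtering mode: keep only events accepted by at least one line."""
    return [e for e in events if any(line(e) for line in lines.values())]


def run_flagging(events, lines):
    """Flagging mode: keep every event and record each line's decision."""
    return [
        {**e, "decisions": {name: line(e) for name, line in lines.items()}}
        for e in events
    ]


# Toy inputs: two events and one made-up selection line.
events = [{"id": 1, "pt": 3.2}, {"id": 2, "pt": 0.4}]
lines = {"HighPtLine": lambda e: e["pt"] > 1.0}

filtered = run_filtering(events, lines)
flagged = run_flagging(events, lines)

print(len(filtered))  # 1: the event failing every line is gone
print(len(flagged))   # 2: both events kept, decisions attached
```

With the flagged output you can still study the rejected event, for example to measure the efficiency of a line, at the cost of storing more events.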

By clicking on the HLT2.DST entry we finally see a list of files that we can process. At the bottom right of the page there is a “Save” button which will let us download a file specifying the inputs that we'll use for running our DaVinci job. Click it and select “Save as a python file”. Clicking “Save” once again in the pop-up menu will start the download. Save this file somewhere you can find it again.

A copy of the file we just downloaded is available here:

/eos/lhcb/wg/dpa/wp7/Run3SK/exampleDST/MC_2024_13264021_Beam6800GeV2024.W37.39MagDownNu6.325nsPythia8_Sim10d_HLT22024.W35.39_HLT2.DST.py

Shortcut

Once you get a bit of experience with navigating the bookkeeping you can take a shortcut! At the bottom of your browser window there is a text field next to a green "plus" symbol. You can enter a path here to jump straight to that location. For example, you could go to evt+std://MC/2024/13264021/Beam6800GeV-2024.W35.37-MagUp-Nu6.3-25ns-Pythia8/Sim10d/HLT2-2024.W35.39 by typing this path and pressing the Go button.
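The path simply mirrors the folders you clicked through in the tree, so it can be assembled from those pieces. The sketch below reproduces the example path. The function and its parameter names are hypothetical labels for the tree levels, chosen for this illustration only.

```python
def bookkeeping_path(config, year, event_type, conditions, sim_version, processing):
    """Join the tree levels into an event-type-sorted bookkeeping path.

    Hypothetical helper: the parameter names are descriptive labels for
    the levels seen in the example path, not official terminology.
    """
    parts = [config, str(year), str(event_type), conditions, sim_version, processing]
    return "evt+std://" + "/".join(parts)


path = bookkeeping_path(
    config="MC",
    year=2024,
    event_type=13264021,
    conditions="Beam6800GeV-2024.W35.37-MagUp-Nu6.3-25ns-Pythia8",
    sim_version="Sim10d",
    processing="HLT2-2024.W35.39",
)
print(path)
```

Changing a single argument, such as the event type, gives the path of the corresponding sample, provided that combination actually exists in the bookkeeping.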

Find your own decay!

Think of a decay and try to find a Monte Carlo sample for it. You could use the decay that your analysis is about, or if you don't have any ideas you could look for the semileptonic decay \( B^{0} \to D^{-}\mu^{+}\nu_{\mu} \), where the \( D^{-} \) decays to \( K^{+}\pi^{-}\pi^{-} \).

If you would like to find out more about how the event types define the signal decay, you can look at the documentation for the DecFiles package.