
The LHCb data flow

Learning Objectives

  • Understand the changes to LHCb's data flow for Run 3
  • Learn the key concepts of the HLT and the Sprucing

LHCb Upgrade

In Runs 1 and 2 LHCb proved itself not only to be a high-precision heavy-flavour physics experiment, but it also extended the core physics programme to many different areas such as electroweak physics and fixed-target measurements. This incredible precision led to over 500 papers, including breakthroughs such as the first observation of CP violation in charm decays and the first observation of the decay \(B_s^0\to \mu^+\mu^-\), among many others.

In order to reach even higher precision, the experiment aims to have collected \(50~\mathrm{fb}^{-1}\) of data by the end of the LHCb Upgrade (including Runs 1, 2, 3 and 4), achieved by increasing the instantaneous luminosity by a factor of five. To cope with the higher detector occupancy, the experiment has been equipped with an entirely new set of tracking detectors with higher granularity and improved radiation tolerance.

The data flow of the experiment can be broadly split into "online" and "offline" processing. The former, "online", includes the detector response and readout, and the two High Level Trigger (HLT) stages, before the data is permanently stored on tape. This is then followed by "offline" processing, which accesses the stored data via the Sprucing application, the output of which is then stored on disk and made available to analysts.

One of the most important upgrades in Run 3 has been the removal of LHCb's L0 hardware trigger. As described in the following, this brought significant changes to the data flow of the experiment, both for online and offline processing. It also meant that the front-end and readout electronics of all sub-detectors, as well as the photodetectors of the RICH1 detector, had to be replaced in order to operate at the bunch crossing rate of \(40~\mathrm{MHz}\).

Upgrade of LHCb's Trigger system

The current trigger layout, as designed in the Computing Model of the Upgrade LHCb experiment technical design report, looks like this: online_dataflow.png

LHCb's trigger system has been fully redesigned by removing the L0 hardware trigger and moving to a fully software-based trigger, a novel choice at the LHC (see: CERN news article). The hardware trigger had a rate limit of 1 MHz, which would have become a limitation with the increase in luminosity. Such a low rate could only be achieved with tight hardware trigger thresholds on relatively crude signatures, which is especially inefficient for fully hadronic decay modes.

Due to the type of signatures LHCb selects (decays with distinct topologies, soft (low-\(p_\mathrm{T}\)) hadrons, etc.), such a rate reduction is best done with a holistic understanding of the event, i.e. reconstructing tracks and vertices and building composite particles within the event. This calls for a software trigger. However, this is no easy task: removing the L0 promised to be a large boon for physics, but it requires the full detector readout to operate at the bunch crossing rate of \(40~\mathrm{MHz}\), while the first stage of the software trigger (HLT1) must run at the average non-empty bunch crossing rate of \(30~\mathrm{MHz}\), a significant computing challenge!
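
To get a feel for the numbers, here is a back-of-the-envelope calculation of the HLT1 input rate. The filled-bunch fraction used below is only a rough, illustrative assumption; the exact value depends on the LHC filling scheme.

```python
# Back-of-the-envelope illustration of the HLT1 input rate (approximate
# numbers, for illustration only; the exact values depend on the LHC fill).
BUNCH_CROSSING_RATE = 40e6   # Hz, LHC bunch-crossing clock
FILLED_FRACTION = 0.75       # assumption: roughly 3/4 of bunch slots collide at LHCb

hlt1_input_rate = BUNCH_CROSSING_RATE * FILLED_FRACTION
print(f"HLT1 must process about {hlt1_input_rate / 1e6:.0f} MHz of non-empty crossings")
```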

The software trigger is implemented in two steps: HLT1, which performs a partial event reconstruction and simple trigger decisions to reduce the data rate, and HLT2, which performs the more computationally expensive full reconstruction and the complete trigger selection. In Run 3, HLT1 runs on GPUs, gaining large throughput improvements from the highly parallelisable nature of the work required, such as track reconstruction, one of the most computationally expensive steps.

The HLT architecture

Much more detail about the current HLT architecture is available in chapters 10 and 11 of the LHCb Upgrade I paper. To perform the event selection, the fully software-based LHCb trigger requires the complete event information from all the subdetectors. Consequently, event building, the assembly of all pieces of data belonging to the same bunch crossing, is done for every collision of non-empty bunches, at a rate of up to 40 MHz. Completed events are handed over to the HLT1 selection process, which uses GPUs to process the events.

The resultant HLT1-filtered events are then propagated to a data storage buffer to be accessed by the later stages. Running HLT1 on GPUs imposes different requirements on code development. The Allen framework is based on the CUDA GPU programming platform, so HLT1 algorithms need to be implemented in a way that maximises parallelism and is thread-safe, i.e. memory must be accessed safely from concurrent threads. Documentation on how to develop reconstruction algorithms and selection lines for HLT1 can be found in the Allen documentation as well as in the README of the Allen GitLab project. The raw data of events selected by HLT1 is stored in the buffer until the alignment and calibration, and later HLT2, have processed it. Even with GPUs, HLT1 only performs a partial reconstruction in order to reach the required throughput: it does not include hadron identification, and it uses a simplified description of the magnetic field and detector material.
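
The key idea behind running HLT1 on GPUs is that every event can be processed independently by the same, side-effect-free code. The following is a minimal, framework-free Python sketch of that event-level parallelism; it is not Allen or CUDA code, and the `reconstruct` toy function is purely hypothetical.

```python
# Toy illustration of event-level parallelism (not Allen code): a batch of
# events is processed by the same function, with no shared mutable state,
# so the work can be spread safely over many parallel workers.
from concurrent.futures import ProcessPoolExecutor

def reconstruct(event):
    """Stand-in for a per-event reconstruction step: a pure function of its input."""
    hits = event["hits"]
    return {"event_id": event["event_id"], "n_tracks": len(hits) // 4}

events = [{"event_id": i, "hits": list(range(i % 40))} for i in range(1000)]

if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:
        reconstructed = list(pool.map(reconstruct, events))
    print(reconstructed[:3])
```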

As the majority of LHCb Upgrade's physics programme does not retain the information necessary for offline reconstruction, it is imperative that the online reconstruction is of offline quality. This offline-quality reconstruction is only possible thanks to the real-time alignment and calibration of the detector, performed using the HLT1-filtered events sitting in the buffer before HLT2 processes them.
To determine the alignment and calibration, HLT1 selects dedicated calibration samples. Without this step, the persistency model described below would not be possible and far fewer events (and thus less physics) could be saved per LHC second. More information about the real-time alignment and calibration can be found in the Upgrade Alignment TWiki.

The second and final stage of the high level trigger (HLT2) is, in contrast to HLT1, not as throughput-limited and runs on CPUs. It includes detailed models of the material interactions, performs hadron identification, and carries out other reconstruction steps that are not possible at HLT1. Because alignment and calibration are performed in real time, this reconstruction reaches offline quality and thus removes the need to repeat the reconstruction offline.

A reconstructed event is then evaluated by thousands of different physics selection lines, where each line is a sequence of algorithms that decides whether an event contains an object of interest and should be kept for offline processing and analysis. While HLT1 is mostly dominated by throughput concerns, needing to reach the \(30~\mathrm{MHz}\) processing rate, HLT2's performance is more constrained by bandwidth, i.e. the output data volume saved per LHC second. The bandwidth translates directly into permanent storage on tape, and therefore into financial constraints, which in turn limit the physics that is possible at LHCb. To counteract this, LHCb Upgrade leverages a persistency model to reduce the size of an event, thereby saving more events and gaining more physics reach.
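
As a rough illustration of why bandwidth is the limiting quantity: it is simply the product of the output event rate and the average persisted event size. The numbers in this sketch are hypothetical, not official LHCb figures.

```python
# Bandwidth is the product of the output event rate and the average event
# size. The numbers below are purely illustrative, not official LHCb figures.
output_rate_hz = 100e3     # hypothetical HLT2 output rate (events per second)
avg_event_size_kb = 100    # hypothetical average persisted event size

bandwidth_gb_per_s = output_rate_hz * avg_event_size_kb * 1e3 / 1e9
print(f"Bandwidth: {bandwidth_gb_per_s:.1f} GB/s to tape")

# Halving the average event size (e.g. via the persistency model) lets twice
# as many events be kept for the same bandwidth.
```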

The lines used to evaluate events are streamed, i.e. placed in different output files based on the intended physics or technical purpose of the selection. This minimises the overall data volume, with all selections belonging to a stream sharing an underlying logic and recording similar sets of event information. For physics analyses there are broadly two streams, Turbo and Full, both of which have the raw detector information (or "rawbanks") removed. For a pictorial representation of the persistency model see the cartoon below, adapted from the Computing Model of the Upgrade LHCb experiment technical design report: persistency_cartoon.png

Removing the rawbanks after HLT2 reduces the average event size, and thus the contribution to the total HLT2 bandwidth, but it also removes the ability to perform the reconstruction offline. Every such reduction is a compromise between offline-analysis flexibility and online physics reach. The Full stream retains all reconstructed objects, whilst the "pure" Turbo stream retains only the reconstructed objects related to the candidate of interest, reducing the average event size even further compared to the Full stream. "Pure" Turbo is the extreme case; as a compromise, a selection can retain some extra information (reconstructed or raw) alongside its candidate information. This compromise is referred to as Turbo "Selective Persistence". Because of the bandwidth reduction, the Turbo stream is the baseline wherever possible and is intended to account for two thirds of the data; any extra persistence must be justified by physics.
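
The persistency options can be summarised with a toy sketch like the one below; the event-size numbers and the `persisted_size` helper are purely illustrative assumptions, not actual LHCb event sizes.

```python
# Toy sketch of the persistency options (illustrative event sizes only).
full_event = {
    "candidate": 15,       # kB, the signal candidate selected by the line
    "rest_of_event": 55,   # kB, all other reconstructed objects
    "extra_raw": 30,       # kB, selected rawbanks kept alongside the candidate
}

def persisted_size(event, keep_rest=False, keep_raw=False):
    """Size written out for one event under a given persistence choice."""
    size = event["candidate"]
    if keep_rest:
        size += event["rest_of_event"]
    if keep_raw:
        size += event["extra_raw"]
    return size

print("pure Turbo:           ", persisted_size(full_event), "kB")
print("selective persistence:", persisted_size(full_event, keep_raw=True), "kB")
print("Full stream:          ", persisted_size(full_event, keep_rest=True), "kB")
```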

When designing or evaluating a line, the decision to place it in the Turbo or the Full stream is specific to the individual line and use case. One piece of information to keep in mind is the different handling of the two streams by the Sprucing application during offline processing (see below). For details and tutorials on how to develop HLT2 selection lines, how to stream them, and how to check their efficiencies and data rates, follow the Moore documentation.
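
Conceptually, a line is a named sequence of selection algorithms together with a choice of how much of the event to persist. The sketch below is plain Python with hypothetical names (`Line`, `make_detached_dimuons`); it is not the Moore API, which is covered in the Moore documentation.

```python
# Conceptual sketch only: `Line` and `make_detached_dimuons` are hypothetical
# stand-ins, not the Moore API. See the Moore documentation for real line code.
from dataclasses import dataclass, field

@dataclass
class Line:
    name: str
    algorithms: list = field(default_factory=list)  # selection steps, run in order
    persist_reco: bool = False                      # False ~ pure Turbo, True ~ Full-like

def make_detached_dimuons():
    """Placeholder for the combinatorics/selections that build the candidates."""
    return ["make_muons", "combine_dimuon", "require_displaced_vertex"]

line = Line(
    name="Hlt2_ToyDetachedDimuon",
    algorithms=make_detached_dimuons(),
    persist_reco=False,  # pure Turbo: keep only the candidate information
)
```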

Offline data processing

Offline data processing, shown in the diagram below, is made up of two steps.
First, the Sprucing (described in this section) is an offline selection stage somewhat similar to HLT2.
Then, Analysis Productions (relevant section of the starterkit) provide a centralised way to produce tuples via DaVinci (relevant section of the starterkit), ready for physics analysis. offline_dataflow.png

The information saved by HLT2 and migrated offline is permanently stored on tape, and then further processed by the Sprucing application. The output of the Sprucing is saved to disk on the Worldwide LHC Computing Grid, accessible by analysts. Data saved to disk is provided in streams: first split between Full, Turbo and the other calibration/technical streams, as done at HLT2, then further streamed by physics interest (broadly, by physics working group).
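
A toy sketch of this two-level streaming, with made-up stream and working-group names, might look as follows.

```python
# Toy sketch of the two-level streaming of data on disk (all names made up):
# events are first split by HLT2 stream, then the Sprucing output is further
# split by physics interest, broadly the physics working groups.
def disk_location(hlt2_stream, working_group):
    """Build a label for the disk stream an event ends up in."""
    return f"{hlt2_stream}/{working_group}"

print(disk_location("turbo", "charm"))  # e.g. Turbo events of interest to the charm WG
print(disk_location("full", "b2oc"))    # e.g. Full-stream events for a beauty WG
```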

For events from the Turbo stream, the default model is a passthrough Sprucing, which reorganises the rawbanks and applies some compression, giving an overall reduction in stored file size but performing no filtering of events.

For Full stream events, where all reconstructed objects are persisted, the events are further processed by a Sprucing line that must be written, applying filters, combinations and selections to those persisted reconstructed objects. HLT2 and Sprucing selections share the same code base, so selection lines can be flexibly interchanged between the two. The information stored per event in the Full stream is also reduced at this stage, by default following the "pure" Turbo schema, thus retaining only the signal candidate information and reducing the data saved to disk. Like at HLT2 this is a compromise, but here between storage on the WLCG and flexibility in offline analysis (i.e. DaVinci and later). This is necessary because the total bandwidth that can be saved on disk is constrained and smaller than what HLT2 can store to tape.
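
A minimal sketch of what such an offline reduction does, under the assumptions above (filter the events, then keep only the candidate information), could look like this; the data layout and the `sprucing_line` helper are purely illustrative.

```python
# Toy sketch of a Sprucing step over Full-stream events (illustrative only):
# an exclusive selection filters the events, and the survivors are reduced to
# candidate-only ("pure Turbo"-like) content before being written to disk.
def sprucing_line(events, selection):
    for event in events:
        if selection(event):                         # exclusive offline selection
            yield {"candidate": event["candidate"]}  # drop the rest of the event

full_stream = [
    {"candidate": {"mass": 5279.0}, "rest_of_event": {"n_tracks": 42}},
    {"candidate": {"mass": 4950.0}, "rest_of_event": {"n_tracks": 17}},
]

selected = list(sprucing_line(full_stream, lambda e: e["candidate"]["mass"] > 5000))
print(selected)  # only the first event survives, with candidate info only
```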

The benefit of having this extra step for Full stream events is twofold. Firstly, it allows inclusive selections at HLT2 (like the topological triggers, which are used for many distinct physics analyses) to feed several exclusive selections at the Sprucing. It also provides the ability to run even more complicated selections that may be too slow for real-time data-taking at HLT2. Secondly, it adds a level of security: the HLT2 Full stream output can be re-processed in Re-Sprucing campaigns, making selections at the Sprucing physics-lossless in the long term, as the data can always be re-examined and processed by a newly written or newly optimised Sprucing line. As the Full stream after HLT2 must be further processed by the Sprucing while the Turbo stream does not, Turbo stream data becomes accessible to analysts more quickly than Full stream data. However, in 2024 the Sprucing campaigns were run effectively synchronously with data taking, so the delay was relatively short.