Akademie věd České republiky, 28.12.2017.
The method which...
The first long data-taking period of the Large Hadron Collider ended in February 2013 after almost three years of running. The increase in instantaneous luminosity by more than six orders of magnitude impressively documents the extraordinary success of this running period, which enabled the ATLAS experiment to collect data of very high quality. However, to ensure constant and reliable monitoring and data quality assessment from the trigger's point of view, a highly flexible and powerful software framework covering many different aspects is essential. Aside from coping with drastically changing beam conditions, such as increasing pile-up, the monitoring framework has to follow all developments of the TDAQ system immediately and flexibly. The TDAQ system used to date is organised in a three-level selection scheme, comprising a hardware-based first-level trigger and second- and third-level triggers implemented as separate software systems distributed over commodity hardware nodes. The second-level trigger operates on limited regions of the detector, the so-called Regions-of-Interest (RoI). The third-level trigger instead deals with complete events. While this architecture was operated successfully well beyond the original design goals, the accumulated experience stimulated interest in exploring possible evolutions.
The TDAQ monitoring system of ATLAS covers very different aspects, such as rate measurements, trigger configuration and software tests, data quality assessment, and the handling of events for which the trigger decision has failed. The data quality assessment in particular must be coherent between the online and offline sides. To provide high-quality data, any processing failures, misbehaviour of selection algorithms and data defects must be discovered immediately and resolved accordingly. I will report on the successful monitoring and data quality assessment of the ATLAS trigger and detector system. All problems were seen redundantly by the system, which ensured the correct treatment of every problem that occurred during data taking. In 2011 only 3% of all data were declared bad due to an issue in the trigger system, whereas in 2012 this was reduced to 1%.