High energy physics (HEP) experiments are controlled and monitored through a diverse set of tools and infrastructures that, in turn, produce large amounts of data. These data are characterized mainly by their large volume and variety: they are generated by multiple heterogeneous data sources (sensors, meters, service logs, alarm states) and sometimes show fragmentary or irregular patterns.
As the experiments' complexity increases, a wide set of tools is required to streamline their operation and data collection.
This includes real-time dynamic decision-making and the prompt identification of system faults, anomalies, and inefficiencies in order to sustain high performance. It also includes maintaining experimental and computing infrastructures sustainably and increasing their reliability.
In the context of this discussion, we define Operational Intelligence (OI) as the collection of activities that involve extracting information from a large number of monitored data sources and actively applying analytical insights obtained with machine learning methods.
This article collection focuses on contributions from the HEP community that explore this area of research. We welcome contributions on a wide range of technical subjects and applications, which can be classified into the following main areas:
• Machine learning-based optimization pipelines, including performance benchmarking against traditional tools, and infrastructure sustainability.
• Advances in reducing the tuning time of HEP experiment infrastructures.
• Developments in anomaly detection and failure prediction.
• Applications of machine learning in monitoring and controlling experiments and computing infrastructures.