AUTHOR=Rovere Marco, Chen Ziheng, Di Pilato Antonio, Pantaleo Felice, Seez Chris TITLE=CLUE: A Fast Parallel Clustering Algorithm for High Granularity Calorimeters in High-Energy Physics JOURNAL=Frontiers in Big Data VOLUME=3 YEAR=2020 URL=https://www.frontiersin.org/journals/big-data/articles/10.3389/fdata.2020.591315 DOI=10.3389/fdata.2020.591315 ISSN=2624-909X ABSTRACT=

One of the challenges of high granularity calorimeters, such as the one to be built to cover the endcap region in the CMS Phase-2 Upgrade for the HL-LHC, is that the large number of channels causes a surge in the computing load when clustering the numerous digitized energy deposits (hits) in the reconstruction stage. In this article, we propose a fast and fully parallelizable density-based clustering algorithm, optimized for high-occupancy scenarios, where the number of clusters is much larger than the average number of hits in a cluster. The algorithm uses a grid spatial index for fast querying of neighbors, and its execution time scales linearly with the number of hits within the range considered. We also compare the performance of the CPU and GPU implementations, demonstrating the power of algorithmic parallelization in the coming era of heterogeneous computing in high-energy physics.
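
To make the idea of a grid spatial index concrete, the following is a minimal Python sketch of fixed-size binning with a radius-limited neighbor query, plus a simple energy-weighted local density built on top of it. It is an illustration under stated assumptions, not the authors' implementation: the two-dimensional hit coordinates, the function names `build_grid`, `neighbors`, and `local_density`, and the parameters `cell_size` and `radius` are all hypothetical. Because each query only inspects the few cells overlapping the search window, the total work grows roughly linearly with the number of hits as long as the occupancy per cell stays bounded.

```python
from collections import defaultdict
from math import floor


def build_grid(points, cell_size):
    """Bin each point index into a square cell keyed by integer (ix, iy)."""
    grid = defaultdict(list)
    for i, (x, y) in enumerate(points):
        grid[(floor(x / cell_size), floor(y / cell_size))].append(i)
    return grid


def neighbors(points, grid, cell_size, i, radius):
    """Return indices of points within `radius` of point i, excluding i."""
    x, y = points[i]
    # Only cells overlapping the square search window need to be visited.
    lo_x, hi_x = floor((x - radius) / cell_size), floor((x + radius) / cell_size)
    lo_y, hi_y = floor((y - radius) / cell_size), floor((y + radius) / cell_size)
    found = []
    for ix in range(lo_x, hi_x + 1):
        for iy in range(lo_y, hi_y + 1):
            # .get avoids creating empty cells in the defaultdict.
            for j in grid.get((ix, iy), ()):
                dx, dy = points[j][0] - x, points[j][1] - y
                if j != i and dx * dx + dy * dy <= radius * radius:
                    found.append(j)
    return found


def local_density(points, weights, grid, cell_size, radius):
    """Assumed density definition: own weight plus weights of all neighbors."""
    return [
        weights[i] + sum(weights[j] for j in neighbors(points, grid, cell_size, i, radius))
        for i in range(len(points))
    ]


# Hypothetical hits: two close pairs and one isolated hit, unit weights.
hits = [(0.0, 0.0), (0.5, 0.0), (3.0, 3.0), (3.2, 3.1), (10.0, 10.0)]
energies = [1.0] * len(hits)
grid = build_grid(hits, cell_size=1.0)
densities = local_density(hits, energies, grid, cell_size=1.0, radius=1.0)
```

Each per-hit query in this sketch is independent of the others, which is the property that makes this kind of index attractive for parallel execution; a GPU version would need a fixed-capacity cell layout rather than Python lists, which is outside the scope of this example.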