ALICE HLT High Speed Tracking on GPU

S. Gorbunov, D. Rohr, K. Aamodt, T. Alt, H. Appelshauser, A. Arend, M. Bach, B. Becker, S. Bottger, T. Breitner, H. Busching, S. Chattopadhyay, J. Cleymans, I. Das, O. Djuvsland, H. Erdal, R. Fearick, O. S. Haaland, P. T. Hille, S. Kalcher, K. Kanaki, U. Kebschull, I. Kisel, M. Kretz, C. Lara, S. Lindal, V. Lindenstruth, A. A. Masoodi, G. Ovrebekk, R. Panse, J. Peschek, M. Ploskon, T. Pocheptsov, T. Rascanu, M. Richter, D. Rohrich, B. Skaali, T. Steinbeck, A. Szostak, J. Thader, T. Tveter, K. Ullaland, Z. Vilakazi, R. Weis, P. Zelnicek
Frankfurt Inst. fur Inf. Ifl, Frankfurt Inst. fur Adv. Studies FIAS, Frankfurt, Germany
IEEE Transactions on Nuclear Science, Volume 58, Issue 4, p.1845-1851, 2011











The on-line event reconstruction in ALICE is performed by the High Level Trigger (HLT), which must process up to 2000 events per second in proton-proton collisions and up to 300 central events per second in heavy-ion collisions, corresponding to an input data stream of 30 GB/s. To meet these timing requirements, a fast on-line tracker has been developed. The algorithm combines a Cellular Automaton method for fast pattern recognition with the Kalman Filter method for fitting the found trajectories and for the final track selection. The tracker was adapted to run on Graphics Processing Units (GPUs) using the NVIDIA Compute Unified Device Architecture (CUDA) framework. The implementation had to be adjusted at many points to use the graphics cards efficiently. In particular, three aspects turned out to be critical: balancing the workload across the many processor cores, transferring data efficiently to and from the GPU, and making optimal use of the different memories the GPU offers. To address these problems, a dynamic scheduler was introduced that redistributes the workload among the processor cores. Additionally, a pipeline was implemented so that the tracking on the GPU, the initialization and output processing on the CPU, and the DMA transfers can overlap. The GPU tracking algorithm significantly outperforms the CPU version for large events while fully maintaining its efficiency.
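The benefit of redistributing work dynamically can be illustrated with a small CPU-side simulation. The sketch below is not the authors' CUDA scheduler. It is a simplified Python model, and the function names, the per-task costs, and the worker count are all hypothetical. It compares a fixed assignment of tasks to workers with a scheme in which each task goes to whichever worker becomes idle first, and reports the time at which the last worker finishes.

```python
import heapq


def static_makespan(costs, n_workers):
    """Fixed assignment: task i always goes to worker i % n_workers."""
    loads = [0] * n_workers
    for i, cost in enumerate(costs):
        loads[i % n_workers] += cost
    # The slowest worker determines the total processing time.
    return max(loads)


def dynamic_makespan(costs, n_workers):
    """Dynamic assignment: each task goes to the worker that is free first."""
    # Min-heap of the times at which each worker becomes idle.
    free_at = [0] * n_workers
    heapq.heapify(free_at)
    for cost in costs:
        earliest = heapq.heappop(free_at)
        heapq.heappush(free_at, earliest + cost)
    return max(free_at)


# Hypothetical workload: a few expensive tasks (long tracks) among cheap ones.
costs = [8, 1, 1, 1, 8, 1, 1, 1, 8, 1, 1, 1]

print("static :", static_makespan(costs, 4))
print("dynamic:", dynamic_makespan(costs, 4))
```

With this made-up workload and four workers, the fixed assignment places all three expensive tasks on the same worker and finishes at time 24, while the dynamic assignment finishes at time 10, close to the ideal of 9 (total work of 36 divided by 4 workers). A real GPU scheduler has to handle further constraints, such as threads executing in lockstep and memory access patterns, that this model ignores.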
