Solutions For Optimizing The Radix Sort Algorithmic Function Using The Compute Unified Device Architecture
Faculty of Computer Science for Business Management, Romanian-American University, 1B, Expozitiei Blvd., district 1, code 012101, Bucharest, Romania
The Proceedings of Journal ISOM Vol. 6 No. 2, 2012
In this paper, we have researched and developed solutions for optimizing the radix sort algorithmic function using the Compute Unified Device Architecture (CUDA). Radix sort is a common parallel primitive and an essential building block for many data processing algorithms, so optimizing it improves the performance of a wide class of parallel algorithms used in data processing. A particular focus of our research was to develop optimization solutions for the radix sort algorithmic function that remain optimal across the entire range of CUDA-enabled GPUs: Tesla GT200, Fermi GF100 and the latest Kepler GK104 architecture, released in March 2012. To confirm the utility of the developed optimization solutions, we have extensively benchmarked and evaluated the performance of the radix sort algorithmic function in CUDA.
March 7, 2013 by hgpu