GRS – GPU radix sort for multifield records
Department of Computer and Information Science and Engineering, University of Florida, Gainesville, FL 32611, USA
International Conference on High Performance Computing (HiPC), 2010
@conference{bandyopadhyay2011grs,
title={GRS-GPU radix sort for multifield records},
author={Bandyopadhyay, S. and Sahni, S.},
booktitle={High Performance Computing (HiPC), 2010 International Conference on},
pages={1–10},
year={2011},
organization={IEEE}
}
We develop a radix sort algorithm, GRS, suitable to sort multifield records on a graphics processing unit (GPU). We assume the ByField layout for records to be sorted. GRS is benchmarked against the radix sort algorithm, SDK, in NVIDIA’s CUDA SDK 3.0 as well as the radix sort algorithm, SRTS, of Merrill and Grimshaw. Although SRTS is faster than both GRS and SDK when sorting numbers as well as records that have a key and an additional 32-bit field, both GRS and SDK outperform SRTS on records with 2 or more fields (in addition to the key). GRS is consistently faster than SDK on numbers as well as records with 1 or more fields. When sorting records with 9 32-bit fields, GRS is up to 74% faster than SRTS and up to 55% faster than SDK. Thus, GRS is the fastest way to radix sort records with more than 1 32-bit field on a GPU.
April 27, 2011 by hgpu