Can GPUs Sort Strings Efficiently?
Center for Visual Information Technology, International Institute of Information Technology, Hyderabad, India
IEEE High Performance Computing (HiPC), 2013
@article{deshpande2013can,
title={Can GPUs Sort Strings Efficiently?},
author={Deshpande, Aditya and Narayanan, PJ},
year={2013}
}
String sorting, or variable-length key sorting, has lagged in performance on the GPU even as fixed-length key sorting has improved dramatically. Radix sort is the fastest sorting method on GPUs. In this paper, we present a fast and efficient string sort on the GPU that is built on the available radix sort. Our method sorts strings from left to right in steps, moving only indexes and small prefixes for efficiency. We reduce the number of sort steps by adaptively consuming the maximum number of string bytes based on the number of segments in each step. Performance is improved by using Thrust primitives for most steps and by removing singleton segments from consideration. Over 70% of the string sort time is spent on Thrust primitives. This provides high performance along with high adaptability to future GPUs. We achieve a speedup of up to 10x over current GPU methods, especially on large datasets. We also scale to much larger input sizes. We present results on easy and difficult strings, defined using their after-sort tie lengths.
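The left-to-right, prefix-at-a-time idea in the abstract can be illustrated with a small CPU sketch. This is not the authors' GPU implementation: the function name `prefix_string_sort` and the fixed `chunk` size are hypothetical, Python's built-in sort stands in for the GPU radix sort, each tied segment is sorted separately rather than in one segmented pass, and the adaptive choice of prefix length is not modeled. It only shows how indexes (not strings) are moved, how ties form segments, and how singleton segments drop out of later steps.

```python
def prefix_string_sort(strings, chunk=4):
    """Return the indexes of `strings` in sorted (byte-wise) order.

    Hypothetical CPU sketch: sorts on `chunk` bytes per step and only
    revisits segments whose strings are still tied.
    """
    data = [s.encode("utf-8") for s in strings]
    order = list(range(len(data)))  # only indexes are moved, never strings
    # Each active segment is a half-open range of `order` that still has ties.
    active = [(0, len(order))] if len(order) > 1 else []
    offset = 0
    while active:
        # Small prefix of string i for this step (may be shorter near its end).
        key = lambda i: data[i][offset:offset + chunk]
        next_active = []
        for start, end in active:
            block = order[start:end]
            block.sort(key=key)  # stand-in for the GPU radix sort
            order[start:end] = block
            # Split the segment into runs of equal prefixes.
            run = start
            for pos in range(start + 1, end + 1):
                if pos == end or key(order[pos]) != key(order[run]):
                    tied = pos - run > 1
                    # A short prefix means the tied strings have ended,
                    # so they are identical and need no further steps.
                    unfinished = len(key(order[run])) == chunk
                    if tied and unfinished:
                        next_active.append((run, pos))
                    run = pos  # singleton runs are simply dropped
        active = next_active
        offset += chunk
    return order


words = ["banana", "band", "ban", "apple", "bandana", "ban", ""]
order = prefix_string_sort(words, chunk=2)
print([words[i] for i in order])
```

In this sketch, strings with long shared prefixes stay in active segments for more steps, which matches the abstract's notion of "difficult" inputs having long after-sort ties.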
September 20, 2013 by hgpu