Sorting On A Graphics Processing Unit (GPU)

Shibdas Bandyopadhyay, Sartaj Sahni
University of Florida
Chapter in the book "Multi- and Many-Core Technologies: Architectures, Programming, Algorithms, and Applications", Chapman-Hall/CRC Press, 2013

   title={Sorting On A Graphics Processing Unit (GPU)},

   author={Bandyopadhyay, Shibdas and Sahni, Sartaj},



Download Download (PDF)   View View   Source Source   



One of the very first GPU sorting algorithms, an adaptation of bitonic sort, was developed by Govindraju et al. [12]. Since this algorithm was developed before the advent of CUDA, the algorithm was implemented using GPU pixel shaders. Zachmann et al. [13] improved on this sort algorithm by using BitonicT rees to reduce the number of comparisons while merging the bitonic sequences. Cederman et al. [7] have adapted quick sort for GPUs. Their adaptation first partitions the sequence to be sorted into subsequences, sorts these subsequences in parallel, and then merges the sorted subsequences in parallel. A hybrid sort algorithm that splits the data using bucket sort and then merges the data using a vectorized version of merge sort is proposed by Sintron et al. [28]. Satish et al. [26] have developed an even faster merge sort. In this merge sort, two sorted sequences A and B are merged by a thread block to produce the sequence C when A and B have less than 256 elements each. Each thread reads an element of A and then does a binary search on the sequence B with that element to determine where it should be placed in the merged sequence C. When the number of elements in a sequence is more than 256, A and B are divided into a set of subsequences by using a set of splitters. The splitters are chosen from the two sequences in such a way that the interval between successive splitters is small enough to be merged by a thread block. The fastest GPU merge sort algorithm known at this time is Warpsort [31]. Warpsort first creates sorted sequences using bitonic sort; each sorted sequence being created by a thread warp. The sorted sequences are merged in pairs until only a small number of sequences remain. The remaining sequences are partitioned into subsequences that can be pairwise merged independently and finally this pairwise merging is done with each warp merging a pair of subsequences. Experimental results reported in [31] indicate that Warpsort is about 30% faster than the merge sort algorithm of [26]. Another comparison-based sort for GPUs-GPU sample sort-was developed by Leischner et al. [20]. Sample sort is reported to be about 30% faster than the merge sort of [26], on average, when the keys are 32-bit integers. This would make sample sort competitive with Warpsort for 32-bit keys. For 64-bit keys, sample sort is twice as fast, on average, as the merge sort of [26].
VN:F [1.9.22_1171]
Rating: 0.0/5 (0 votes cast)

* * *

* * *

TwitterAPIExchange Object
    [oauth_access_token:TwitterAPIExchange:private] => 301967669-yDz6MrfyJFFsH1DVvrw5Xb9phx2d0DSOFuLehBGh
    [oauth_access_token_secret:TwitterAPIExchange:private] => o29ji3VLVmB6jASMqY8G7QZDCrdFmoTvCDNNUlb7s
    [consumer_key:TwitterAPIExchange:private] => TdQb63pho0ak9VevwMWpEgXAE
    [consumer_secret:TwitterAPIExchange:private] => Uq4rWz7nUnH1y6ab6uQ9xMk0KLcDrmckneEMdlq6G5E0jlQCFx
    [postfields:TwitterAPIExchange:private] => 
    [getfield:TwitterAPIExchange:private] => ?cursor=-1&screen_name=hgpu&skip_status=true&include_user_entities=false
    [oauth:protected] => Array
            [oauth_consumer_key] => TdQb63pho0ak9VevwMWpEgXAE
            [oauth_nonce] => 1480711012
            [oauth_signature_method] => HMAC-SHA1
            [oauth_token] => 301967669-yDz6MrfyJFFsH1DVvrw5Xb9phx2d0DSOFuLehBGh
            [oauth_timestamp] => 1480711012
            [oauth_version] => 1.0
            [cursor] => -1
            [screen_name] => hgpu
            [skip_status] => true
            [include_user_entities] => false
            [oauth_signature] => jFpHjH8SlXh1oQDqLPpvg7RvnlA=

    [url] => https://api.twitter.com/1.1/users/show.json
Follow us on Facebook
Follow us on Twitter

HGPU group

2081 peoples are following HGPU @twitter

HGPU group © 2010-2016 hgpu.org

All rights belong to the respective authors

Contact us: