high performance computing on graphics processing units: hgpu.org

Parallel external sorting for CUDA-enabled GPUs with load balancing and low transfer overhead

Hagen Peters, Ole Schulz-Hildebrandt, Norbert Luttenberger

Research Group for Communication Systems, Department of Computer Science, Christian-Albrechts-University Kiel, Germany

In 2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW) (April 2010), pp. 1-8.

DOI:10.1109/IPDPSW.2010.5470833

@conference{peters2010parallel,
  title={Parallel external sorting for CUDA-enabled GPUs with load balancing and low transfer overhead},
  author={Peters, H. and Schulz-Hildebrandt, O. and Luttenberger, N.},
  booktitle={Parallel \& Distributed Processing, Workshops and Phd Forum (IPDPSW), 2010 IEEE International Symposium on},
  pages={1--8},
  year={2010},
  organization={IEEE}
}

Sorting is a well-investigated topic in computer science, and many efficient sorting algorithms for CPUs and GPUs have been developed. GPUs offer no swapping or paging to provide more virtual memory than is physically available, so sorting sequences that exceed GPU memory on the GPU becomes a problem of external sorting. In this contribution we present a novel merge-based external sorting algorithm for one or more CUDA-enabled GPUs. We reduce the performance impact of memory transfers to and from the GPU by using an approach similar to regular samplesort and by overlapping memory transfers with GPU computation. We achieve good GPU utilization and load balancing among GPUs by carefully choosing the samples and the amount of GPU memory used for computation. We demonstrate the performance of our algorithm through extensive testing. Using two GTX 280 cards, the implementation outperforms the fastest CPU sorting algorithms known to the authors.
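The regular-sampling idea mentioned in the abstract can be illustrated with a small CPU-only sketch. This is not the authors' implementation: the function name `external_sort_sketch`, the `capacity` parameter (a stand-in for the amount of GPU memory), and `samples_per_chunk` are assumptions made for illustration, and the sketch models neither CUDA kernels nor the overlap of transfers with computation.

```python
import bisect
import heapq


def external_sort_sketch(data, capacity, samples_per_chunk=4):
    """Sort `data` while sorting at most `capacity` elements at a time.

    `capacity` is a hypothetical stand-in for the GPU memory budget.
    """
    if not data:
        return []

    # Phase 1: sort each chunk that fits into "device memory".
    chunks = [sorted(data[i:i + capacity])
              for i in range(0, len(data), capacity)]

    # Phase 2: take regularly spaced samples from every sorted chunk.
    samples = []
    for chunk in chunks:
        step = max(1, len(chunk) // samples_per_chunk)
        samples.extend(chunk[step - 1::step])
    samples.sort()

    # Phase 3: pick k - 1 splitters from the sorted samples so that the
    # k resulting buckets come out roughly equal in size.
    k = len(chunks)
    splitters = [samples[(j * len(samples)) // k] for j in range(1, k)]

    # Phase 4: cut every sorted chunk at the splitters, then merge the
    # pieces that belong to the same bucket. Buckets are emitted in
    # splitter order, so their concatenation is fully sorted.
    bounds = [[0] + [bisect.bisect_right(c, s) for s in splitters] + [len(c)]
              for c in chunks]
    result = []
    for b in range(k):
        pieces = [c[cut[b]:cut[b + 1]] for c, cut in zip(chunks, bounds)]
        result.extend(heapq.merge(*pieces))
    return result
```

In a GPU setting, the merge for each bucket would run on the device, and the splitters would have to keep every bucket within the memory budget; the sketch only balances bucket sizes approximately and does not enforce that limit.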

Tags: Algorithms, Computer science, CUDA, nVidia, nVidia GeForce GTX 260, nVidia GeForce GTX 280, Sorting

November 2, 2010 by hgpu
