Provably Efficient GPU Algorithms
Karlsruhe Institute of Technology, Karlsruhe, Germany
arXiv:1306.5076 [cs.DS], (21 Jun 2013)
@article{2013arXiv1306.5076S,
author={Sitchinava}, N. and {Weichert}, V.},
title={"{Provably Efficient GPU Algorithms}"},
journal={ArXiv e-prints},
archivePrefix={"arXiv"},
eprint={1306.5076},
primaryClass={"cs.DS"},
keywords={Computer Science – Data Structures and Algorithms, Computer Science – Distributed, Parallel, and Cluster Computing},
year={2013},
month={jun},
adsurl={http://adsabs.harvard.edu/abs/2013arXiv1306.5076S},
adsnote={Provided by the SAO/NASA Astrophysics Data System}
}
In this paper we present an abstract model for algorithm design on GPUs by extending the parallel external memory (PEM) model with computations in internal memory (commonly known as shared memory in GPU literature) defined in the presence of memory banks and bank conflicts. We also present a framework for designing bank conflict free algorithms on GPUs. Using our framework we develop the first shared memory sorting algorithm that incurs no bank conflicts. Our sorting algorithm can be used as a subroutine for comparison-based GPU sorting algorithms to replace current use of sorting networks in shared memory. We show experimentally that such substitution improves the runtime of the mergesort implementation of the THRUST library.
June 24, 2013 by hgpu