GPU-RMAP: Accelerating Short-Read Mapping on Graphics Processors

Ashwin M. Aji, Liqing Zhang, Wu C. Feng
Department of Computer Science, Virginia Tech, Blacksburg, VA, USA
Computational Science and Engineering, IEEE International Conference on, Vol. 0 (2010), pp. 168-175.


   title={GPU-RMAP: Accelerating Short-Read Mapping on Graphics Processors},

   author={Aji, A.M. and Zhang, L. and Feng, W.},

   booktitle={2010 13th IEEE International Conference on Computational Science and Engineering},





Source Source   



Next-generation, high-throughput sequencers are now capable of producing hundreds of billions of short sequences (reads) in a single day. The task of accurately mapping the reads back to a reference genome is of particular importance because it is used in several other biological applications, e.g., genome re-sequencing, DNA methylation, and ChiP sequencing. On a personal computer (PC), the computationally intensive short-read mapping task currently requires several hours to execute while working on very large sets of reads and genomes. Accelerating this task requires parallel computing. Among the current parallel computing platforms, the graphics processing unit (GPU) provides massively parallel computational prowess that holds the promise of accelerating scientific applications at low cost. In this paper, we propose GPU-RMAP, a massively parallel version of the RMAP short-read mapping tool that is highly optimized for the NVIDIA family of GPUs. We then evaluate GPU-RMAP by mapping millions of synthetic and real reads of varying widths on the mosquito (Aedes aegypti) and human genomes. We also discuss the effects of various input parameters, such as read width, number of reads, and chromosome size, on the performance of GPU-RMAP. We then show that despite using the conventionally “slower” but GPU-compatible binary search algorithm, GPU-RMAP outperforms the sequential RMAP implementation, which uses the “faster” hashing technique on a PC. Our data-parallel GPU implementation results in impressive speedups of up to 14:5-times for the mapping kernel and up to 9:6-times for the overall program execution time over the sequential RMAP implementation on a traditional PC.
No votes yet.
Please wait...

* * *

* * *

HGPU group © 2010-2017 hgpu.org

All rights belong to the respective authors

Contact us: