Fast Burrows Wheeler Compression Using CPU and GPU
Center for Visual Information Technology, International Institute of Information Technology, Hyderabad – 500 032, India
International Institute of Information Technology, Technical report IIIT/TR/2014/xx, 2014
@article{deshpande2014fast,
title={Fast Burrows Wheeler Compression Using CPU and GPU},
author={Deshpande, Aditya and Narayanan, PJ},
year={2014}
}
In this paper, we present an all-core implementation of Burrows Wheeler Compression algorithm that exploits all computing resources on a system. Our focus is to provide significant benefit to everyday users on common end-to-end applications by exploiting the parallelism of multiple CPU cores and many-core GPU on their machines. The all-core framework is suitable for problems that process large files or buffers in blocks. We consider a system to be made up of compute stations and use a work-queue to dynamically divide the tasks among them. Each compute station uses an implementation that optimally exploits its architecture. We develop a fast GPU BWC algorithm by extending the state-of-the-art GPU string sort to efficiently perform BWT step of BWC. Our hybrid BWC implementation achieves a 2.9x speedup over the best CPU implementation. Our all-core framework allows concurrent processing of blocks by both GPU and all available CPU cores. We achieve a 3.06x speedup by using all CPU cores and a 4.87x speedup using the GPU also in the all-core framework. Our approach will scale to the number and type of computing resources on a system.
April 14, 2014 by hgpu