HCudaBLAST: an implementation of BLAST on Hadoop and Cuda
Maulana Azad National Institute of Technology, Bhopal, Madhya Pradesh 462003, India
Journal of Big Data, 4:41, 2017
@article{khare2017hcudablast,
title={HCudaBLAST: an implementation of BLAST on Hadoop and Cuda},
author={Khare, Nilay and Khare, Alind and Khan, Farhan},
journal={Journal of Big Data},
volume={4},
number={1},
pages={41},
year={2017},
publisher={Springer}
}
DNA sequencing has been a difficult field since its inception, and it is growing at an exponential rate. The amount of data involved in DNA searching is huge, so conventional tools and algorithms are not suited to processing at this scale. BLAST is a tool provided by the National Center for Biotechnology Information (NCBI) to compare nucleotide or protein sequences against sequence databases and calculate the statistical significance of matches. Variants of BLAST such as blastn, blastp, and blastx are used to search nucleotide, protein, and nucleotide-to-protein sequences, respectively. GPU-BLAST and HBLAST have already been proposed to handle the vast amount of data involved in searching DNA sequences, and they also speed up the search. In this article, we propose a new model for searching DNA sequences, HCudaBLAST, which combines CUDA processing with Hadoop for efficient searching. We report the results obtained with our implementation of HCudaBLAST. This solution combines the multi-core parallelism of GPGPUs with the scalability provided by the Hadoop framework.
December 3, 2017 by hgpu