Improving CUDASW++, a Parallelization of Smith-Waterman for CUDA Enabled Devices

Doug Hains, Zach Cashero, Mark Ottenberg, Wim Bohm, Sanjay Rajopadhye
Colorado State University, Department of Computer Science, Fort Collins, CO 80523
IEEE International Symposium on Parallel and Distributed Processing Workshops and Phd Forum (IPDPSW), 2011








CUDASW++ is a parallelization of the Smith-Waterman algorithm for CUDA graphics processing units that computes the similarity score of a query sequence paired with each sequence in a database. The algorithm uses one of two kernel functions to compute the score for a given pair of sequences: the inter-task kernel or the intra-task kernel. We have identified the intra-task kernel as a major bottleneck in CUDASW++ and have developed a new intra-task kernel that is faster than the original. We describe the development of our kernel as a series of incremental changes, which provides insight into a number of issues that must be considered when developing any algorithm for the CUDA architecture. We analyze the performance of our kernel against the original and show that using our intra-task kernel improves the overall performance of CUDASW++ by on the order of three to four giga-cell updates per second (GCUPS) on various benchmark databases.
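To make the quantities in the abstract concrete, the following is a minimal sequential sketch of a Smith-Waterman similarity score and of the giga-cell-updates-per-second metric. It is not the CUDASW++ code and is neither of its two kernels: the function names, the simple match/mismatch scores, and the linear gap penalty are all assumptions chosen for illustration. A production implementation would typically use a substitution matrix and a different gap model.

```python
def sw_score(query, subject, match=2, mismatch=-1, gap=-1):
    """Best local-alignment score of two sequences (Smith-Waterman).

    Illustrative assumptions: fixed match/mismatch scores and a
    linear gap penalty. Only two rows of the table are kept in memory.
    """
    prev = [0] * (len(subject) + 1)   # row i-1 of the score table
    best = 0
    for q in query:
        curr = [0]                    # column 0 is always 0
        for j, s in enumerate(subject, start=1):
            diag = prev[j - 1] + (match if q == s else mismatch)
            up = prev[j] + gap        # gap in the subject
            left = curr[j - 1] + gap  # gap in the query
            h = max(0, diag, up, left)  # 0 restarts the local alignment
            curr.append(h)
            if h > best:
                best = h
        prev = curr
    return best


def cell_updates(query, database):
    """Number of table cells filled when scoring query against every entry."""
    return sum(len(query) * len(s) for s in database)


def gcups(cells, seconds):
    """Throughput in giga-cell updates per second."""
    return cells / seconds / 1e9


if __name__ == "__main__":
    db = ["AGCACACA", "TTTT", "ACAC"]   # hypothetical toy database
    q = "ACACACTA"
    print([sw_score(q, s) for s in db])
    print(cell_updates(q, db))
```

Each cell depends only on its left, upper, and upper-left neighbours, so cells on the same anti-diagonal are independent of one another. That dependency pattern is what a parallel kernel working on a single sequence pair can exploit.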