{"id":8214,"date":"2012-09-16T23:02:43","date_gmt":"2012-09-16T20:02:43","guid":{"rendered":"http:\/\/hgpu.org\/?p=8214"},"modified":"2012-09-16T23:02:43","modified_gmt":"2012-09-16T20:02:43","slug":"accelerating-the-smith-waterman-algorithm-for-bio-sequence-matching-on-gpu","status":"publish","type":"post","link":"https:\/\/hgpu.org\/?p=8214","title":{"rendered":"Accelerating the Smith-Waterman Algorithm for Bio-sequence Matching on GPU"},"content":{"rendered":"<p>Nowadays, the GPU has emerged as a promising computing platform to accelerate bio-sequence analysis applications by exploiting all kinds of parallel optimization strategies. In this paper, we take a well-known algorithm in the field of pair-wise sequence alignment and database searching, the Smith-Waterman (S-W) algorithm as an example, and demonstrate approaches that fully exploit its performance potential on the GPU platform. We propose a combination of coalesced global memory accesses, shared memory tiles, and loop unfolding, achieving a 50X speedup over initial S-W versions on an NVIDIA GeForce GTX 470 card. Experimental results also show that the GTX 470 GPU gains a 12X speedup, rather than the 100X reported by some studies, over an Intel quad-core Q9400 CPU, with both built on the same manufacturing technology and both running fully optimized schemes.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Nowadays, the GPU has emerged as a promising computing platform to accelerate bio-sequence analysis applications by exploiting all kinds of parallel optimization strategies. 
In this paper, we take a well-known algorithm in the field of pair-wise sequence alignment and database searching, the Smith-Waterman (S-W) algorithm as an example, and demonstrate approaches that fully exploit its performance [&hellip;]<\/p>\n","protected":false},"author":351,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[36,10,89,3],"tags":[1787,123,1781,14,667,20,953,209,717,284],"class_list":["post-8214","post","type-post","status-publish","format-standard","hentry","category-algorithms","category-biology","category-nvidia-cuda","category-paper","tag-algorithms","tag-bioinformatics","tag-biology","tag-cuda","tag-databases","tag-nvidia","tag-nvidia-geforce-gtx-470","tag-sequence-alignment","tag-sequence-matching","tag-smith-waterman-algorithm"],"views":2578,"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/posts\/8214","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/users\/351"}],"replies":[{"embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=8214"}],"version-history":[{"count":0,"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/posts\/8214\/revisions"}],"wp:attachment":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=8214"}],"
wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=8214"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=8214"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}