SWPS3 – fast multi-threaded vectorized Smith-Waterman for IBM Cell/B.E. and x86/SSE2

Adam Szalkowski, Christian Ledergerber, Philipp Kraehenbuehl, Christophe Dessimoz
Department of Computer Science, ETH Zurich, Zurich, Switzerland
BMC Research Notes, Vol. 1, No. 1. (2008)


   title={SWPS3 -- fast multi-threaded vectorized Smith-Waterman for IBM Cell/B.E. and x86/SSE2},
   author={Szalkowski, A. and Ledergerber, C. and Kr{\"a}henb{\"u}hl, P. and Dessimoz, C.},
   journal={BMC Research Notes},
   volume={1},
   number={1},
   year={2008},
   publisher={BioMed Central Ltd}






BACKGROUND: We present SWPS3, a vectorized implementation of the Smith-Waterman local alignment algorithm optimized for both the Cell/B.E. and x86 architectures. The paper describes SWPS3 and compares its performance with that of several other implementations.

FINDINGS: Our benchmarking results show that SWPS3 is currently the fastest implementation of a vectorized Smith-Waterman on the Cell/B.E., outperforming the only other known implementation by a factor of at least 4: on a PlayStation 3, it achieves up to 8.0 billion cell updates per second (GCUPS). Using the SSE2 instruction set, a quad-core Intel Pentium can reach 15.7 GCUPS. We also show that SWPS3 on this CPU is faster than a recent GPU implementation. Finally, we note that under some circumstances, alignments are computed at roughly the same speed as BLAST, a heuristic method.

CONCLUSIONS: The Cell/B.E. can be a powerful platform for aligning biological sequences. Moreover, the performance gap between exact and heuristic methods has almost disappeared, especially for long protein sequences.
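
To make the GCUPS figures above concrete, here is a minimal scalar sketch in Python of the Smith-Waterman score recurrence together with a cell-update counter. It is not the SWPS3 implementation, which is vectorized and multi-threaded; the function names, the match/mismatch scores, and the linear gap penalty are illustrative assumptions and are not taken from the paper.

```python
import time


def smith_waterman_score(query, target, match=2, mismatch=-1, gap=-2):
    """Return (best local alignment score, number of DP cells updated).

    Scoring values are illustrative assumptions, not those used by SWPS3.
    """
    prev = [0] * (len(target) + 1)  # previous row of the DP matrix
    best = 0
    for q in query:
        curr = [0]  # column 0 is always zero for local alignment
        for j, t in enumerate(target, start=1):
            diag = prev[j - 1] + (match if q == t else mismatch)
            # Local alignment: a cell never drops below zero.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            if score > best:
                best = score
        prev = curr
    # One cell update per (query residue, target residue) pair.
    return best, len(query) * len(target)


def gcups(cells, seconds):
    """Billions of DP cell updates per second."""
    return cells / seconds / 1e9


if __name__ == "__main__":
    start = time.perf_counter()
    score, cells = smith_waterman_score("GGACGTGG" * 50, "TTACGTTT" * 50)
    elapsed = time.perf_counter() - start
    print(f"score={score} cells={cells} GCUPS={gcups(cells, elapsed):.6f}")
```

By this definition, aligning a 400-residue query against a 400-residue target costs 160,000 cell updates, so a rate of 8.0 GCUPS corresponds to 8.0 billion such updates every second.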