Quality-score guided error correction for short-read sequencing data using CUDA
School of Computer Engineering, N40-2-32a Nanyang Ave., Nanyang Technological University, Singapore 639798
Procedia Computer Science, Vol. 1, No. 1. (May 2010), pp. 1129-1138.
Recently introduced sequencing technologies can produce massive amounts of short-read data. Detecting and correcting sequencing errors in this data is an important but time-consuming pre-processing step for de novo genome assembly. In this paper, we demonstrate how the quality-score value associated with each base call can be integrated into a CUDA-based parallel error correction algorithm. We show that quality-score guided error correction can improve the assembly of several datasets from the NCBI SRA (Short-Read Archive), both in terms of N50 values and in terms of runtime. We further propose a number of improvements to our previously published CUDA-EC algorithm that speed it up by a factor of up to 1.88.
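The two quantities the abstract relies on, per-base quality scores and the N50 value, can be illustrated with a minimal Python sketch. It is not the CUDA-EC implementation. It assumes Phred-scaled quality scores stored as ASCII characters with an offset of 33 (the common Sanger FASTQ convention; the abstract does not state which encoding the datasets use), and the function names and the quality cutoff of 20 are hypothetical choices made for illustration only.

```python
def decode_quality(qual_string, offset=33):
    """Convert an ASCII quality string to integer Phred scores.

    Assumption: Sanger-style encoding (ASCII code minus 33).
    """
    return [ord(c) - offset for c in qual_string]


def phred_to_error_prob(q):
    """Phred definition: Q = -10 * log10(P), so P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def low_confidence_positions(qual_string, min_quality=20, offset=33):
    """Return indices of base calls whose score is below min_quality.

    A quality-guided corrector could restrict candidate corrections to
    such positions; the cutoff here is an illustrative assumption.
    """
    scores = decode_quality(qual_string, offset)
    return [i for i, q in enumerate(scores) if q < min_quality]


def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the largest length L such that contigs of length >= L
    together cover at least half of the total assembled length.
    """
    if not contig_lengths:
        raise ValueError("n50 requires at least one contig length")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


if __name__ == "__main__":
    quals = "I5!"  # decodes to Phred scores 40, 20, 0 under offset 33
    print(decode_quality(quals))
    print(low_confidence_positions(quals))
    print(n50([2, 3, 4, 5, 6]))
```

A higher N50 after error correction indicates that the assembler produced longer contigs from the same reads, which is the sense in which the abstract reports improved assembly.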
January 11, 2011 by hgpu