Quality-score guided error correction for short-read sequencing data using CUDA
School of Computer Engineering, N40-2-32a Nanyang Ave., Nanyang Technological University, Singapore 639798
Procedia Computer Science, Vol. 1, No. 1. (May 2010), pp. 1129-1138.
@article{shi2010quality,
title={Quality-score guided error correction for short-read sequencing data using CUDA},
author={Shi, H. and Schmidt, B. and Liu, W. and M{\"u}ller-Wittig, W.},
journal={Procedia Computer Science},
volume={1},
number={1},
pages={1129--1138},
issn={1877-0509},
year={2010},
publisher={Elsevier}
}
Recently introduced sequencing technologies can produce massive amounts of short-read data. Detection and correction of sequencing errors in this data is an important but time-consuming pre-processing step for de-novo genome assembly. In this paper, we demonstrate how the quality-score value associated with each base-call can be integrated in a CUDA-based parallel error correction algorithm. We show that quality-score guided error correction can improve the assembly accuracy of several datasets from the NCBI SRA (Short-Read Archive) in terms of N50-values as well as runtime. We further propose a number of improvements to our previously published CUDA-EC algorithm to improve its runtime by a factor of up to 1.88.
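For readers unfamiliar with the N50 metric mentioned in the abstract, the following is a minimal sketch of how it is commonly computed from a list of contig lengths. It is not taken from the paper; the function name n50 and the example lengths are hypothetical and chosen only for illustration.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly (illustrative helper, not from the paper).

    N50 is the largest length L such that contigs of length >= L
    together cover at least half of the total assembled bases.
    """
    lengths = sorted(contig_lengths, reverse=True)
    total = sum(lengths)
    if total <= 0:
        raise ValueError("need at least one contig with positive length")

    running = 0
    for length in lengths:
        running += length
        # Stop at the first contig that pushes coverage to half the total.
        if 2 * running >= total:
            return length


# Hypothetical contig lengths (not data from the paper).
example = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(example))
```

A higher N50 means that more of the assembly sits in long contigs, which is why it is often used as a summary of assembly contiguity.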
January 11, 2011 by hgpu