Faster Algorithms for RNA-folding using the Four-Russians method
UC Davis
UC Davis, Technical report CSE-2013-70, 2013
@article{venkatachalam2013faster,
title={Faster Algorithms for RNA-folding using the Four-Russians method},
author={Venkatachalam, B. and Frid, Y. and Gusfield, D.},
year={2013}
}
The secondary structure that maximizes the number of non-crossing matchings between complimentary bases of an RNA sequence of length n can be computed in O(n^3) time using dynamic programming. Four-Russians is a technique that will reduce the running time for certain dynamic programming algorithms by a factor after a preprocessing step where solutions to all smaller subproblems of a fixed size are exhaustively enumerated. Frid and Gusfield designed an O(n^3/log n) algorithm for RNA folding using the Four-Russians technique. However, in their algorithm the preprocessing is interleaved with the algorithm computation. We simplify the algorithm and the analysis by doing the preprocessing once prior to the algorithm computation. We call this the two-vector method. We also show variants where instead of exhaustive preprocessing, we only solve the subproblems encountered in the main algorithm once and memoize the results. We give a proof of correctness and explore the practical advantages over the earlier method. The Nussinov algorithm admits an O(n^2) parallel algorithm. We show an parallel algorithm using the two-vector idea that improves the time bound to O(n^2/log n). We have implemented the parallel algorithm on Graphical processing units using CUDA platform. We discuss the organization of the data structures to exploit coalesced memory access for fast running time. These ideas also help in improving the running time of the serial algorithms. For sequences of up to 6000 bases the parallel algorithm takes only about 2 secs, the two-vector and memoized versions are faster than the Frid-Gusfield algorithm by a factor of 3, and faster than Nussinov by a factor of 20.
January 30, 2013 by hgpu