On Longest Repeat Queries Using GPU
Department of Computer Science, Eastern Washington University, Cheney, WA 99004, USA
arXiv:1501.06663 [cs.DC], (27 Jan 2015)
@article{tian2015longest,
title={On Longest Repeat Queries Using GPU},
author={Tian, Yun and Xu, Bojian},
year={2015},
month={jan},
archivePrefix={arXiv},
eprint={1501.06663},
primaryClass={cs.DC}
}
Repeat finding in strings has important applications in subfields such as computational biology. The problem of finding the longest repeats covering particular string positions was recently proposed and solved by Ileri et al. in optimal O(n) total time and space, where n is the string size. However, their solution can find only the leftmost longest repeat for each of the n string positions, and it is not known how to parallelize it. In this paper, we propose a new solution for longest repeat finding which, although theoretically suboptimal in time, is conceptually simpler, runs faster, and uses less memory in practice than the optimal solution. Further, our solution can find all longest repeats covering every string position while still running faster and using less memory. Moreover, our solution is parallelizable in the shared memory architecture (SMA), enabling it to take advantage of modern multi-processor computing platforms such as general-purpose graphics processing units (GPUs). We have implemented both the sequential and parallel versions of our solution. Experiments with both biological and non-biological data show that our sequential and parallel solutions are faster than the optimal solution by factors of 2–3.5 and 6–14, respectively, and use less memory.
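To make the problem statement concrete, here is a brute-force Python sketch of the query itself. It is not the authors' algorithm, nor the O(n) solution of Ileri et al. It assumes the usual definition, which the abstract does not spell out: a repeat is a substring that occurs at least twice in the string (occurrences may overlap), and a repeat covers position k if one of its occurrences spans k. The function name `longest_repeats_covering` and the 0-based inclusive spans are illustrative choices.

```python
def longest_repeats_covering(s: str, k: int) -> list[tuple[int, int]]:
    """Brute-force illustration (assumed definition, not the paper's method).

    Returns every span (i, j), 0-based and inclusive, with i <= k <= j such
    that s[i..j] occurs at least twice in s and has maximum length among
    such spans. Spans are listed left to right, so the first entry is the
    leftmost longest repeat. Returns [] if no repeat covers position k.
    """
    n = len(s)
    best_len = 0
    best: list[tuple[int, int]] = []
    for i in range(k + 1):          # candidate start positions at or before k
        for j in range(k, n):       # candidate end positions at or after k
            length = j - i + 1
            if length < best_len:
                continue            # too short to matter; try a longer span
            sub = s[i:j + 1]
            first = s.find(sub)
            if s.find(sub, first + 1) == -1:
                # sub occurs only once, so no extension of it can repeat
                break
            if length > best_len:
                best_len, best = length, [(i, j)]
            else:
                best.append((i, j))
    return best
```

For example, in "aaa" position 1 is covered by two longest repeats, the spans (0, 1) and (1, 2). A leftmost-only query would report just (0, 1), whereas an all-repeats query reports both. This sketch runs in polynomial time well above linear and is only meant to pin down what is being computed.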
January 28, 2015 by hgpu