Achieving TeraCUPS on Longest Common Subsequence Problem using GPGPUs
School of Informatics and Computing, Indiana University, Bloomington
19th IEEE International Conference on Parallel and Distributed Systems (ICPADS’13), 2013
In this paper, we describe a novel technique to optimize longest common subsequence (LCS) algorithm for one-to-many matching problem on GPUs by transforming the computation into bit-wise operations and a post-processing step. The former can be highly optimized and achieves more than a trillion operations (cell updates) per second (CUPS)-a first for LCS algorithms. The latter is more efficiently done on CPUs, in a fraction of the bit-wise computation time. The bit-wise step promises to be a foundational step and a fundamentally new approach to developing algorithms for increasingly popular heterogeneous environments that could dramatically increase the applicability of hybrid CPU-GPU environments.
January 2, 2014 by hgpu