Fast Exact String Matching on the GPU

Michael C. Schatz, Cole Trapnell
Center for Bioinformatics and Computational Biology, University of Maryland
CMSC 740 Computer Graphics, 2007


   title={Fast exact string matching on the gpu},

   author={Schatz, M.C. and Trapnell, C.},

   journal={Center for Bioinformatics and Computational Biology},



We present a string-matching program that runs on the GPU. Our program, Cmatch, achieves a speedup of as much as 35x on a recent GPU over the equivalent CPU-bound version. String matching has a long history in computational biology with roots in finding similar proteins and gene sequences in a database of known sequences. The explosion in sequence data available in the 80s and 90s motivated the development of ever faster techniques for searching for similar sequences, and ultimately lead the use of parallelized execution of string matching algorithms using sophisticated data structures called suffix trees. Suffix trees can be constructed time proportional to the length of the corpus, and provide exact matching of a query in time proportional to the length of the query, independent of the size of the corpus. Here, we present our string-matching kernel for use in the Compute Unified Device Architecture, which executes parallelized searching of a suffix tree for finding exact matches for a set of query strings. We compare our GPGPU suffix tree search to a serial CPU version of the algorithm, and analogous components of the widely used CPU program MUMmer, and explore issues associated with storing a suffix tree in a graphics card’s memory, and data distribution among the GPU’s processing units.
Rating: 0.5. From 1 vote.
Please wait...

* * *

* * *

HGPU group © 2010-2017 hgpu.org

All rights belong to the respective authors

Contact us: