Design of high-performance parallelized gene predictors in MATLAB

Sylvain Robert Rivard, Jean-Gabriel Mailloux, Rachid Beguenane, Hung Tien Bui

Departement des sciences appliquees, Universite du Quebec a Chicoutimi, 555, blvd de l’Universite, Chicoutimi, QC, G7H 2B1, Canada

BMC Research Notes, 5:183, 2012

DOI:10.1186/1756-0500-5-183

BACKGROUND: This paper proposes a method of implementing parallel gene prediction algorithms in MATLAB. The proposed designs are based on either Goertzel’s algorithm or FFTs and have been implemented with varying degrees of parallelism on a central processing unit (CPU) and on a graphics processing unit (GPU).

FINDINGS: Results show that a straightforward implementation can require over 4.5 h to process 15 million base pairs (bp), whereas a properly designed one can perform the same task in less than five minutes. In the best case, a GPU implementation yields these results in 57 s.

CONCLUSIONS: The present work shows how parallelism can be used in MATLAB for gene prediction in very large DNA sequences to produce results over 270 times faster than a conventional approach. This is significant because MATLAB is typically overlooked for its apparently slow processing, even though it offers a convenient environment for bioinformatics. From a practical standpoint, this work proposes two strategies for accelerating genome data processing, each relying on a different parallelization mechanism. On a CPU, the work shows that direct access to the MEX function increases execution speed and that the PARFOR construct should be used to take full advantage of the parallelizable Goertzel implementation. On a GPU, the work shows that the data must be segmented into manageable sizes within the GFOR construct before processing in order to minimize execution time.
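To make the Goertzel-based approach concrete, here is a minimal sketch in Python rather than MATLAB. It is not the authors' implementation. It assumes the period-3 measure commonly used in DSP-based gene prediction, which the abstract does not spell out: each base is mapped to a binary indicator sequence, and the spectral power at frequency 1/3 is summed over the four bases in a sliding window. The names `goertzel_power` and `period3_scores`, and the window and step sizes, are hypothetical.

```python
import math


def goertzel_power(samples, k):
    """Return |X[k]|^2 for the DFT of `samples`, via Goertzel's recurrence."""
    n = len(samples)
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s1 = 0.0  # state s[n-1]
    s2 = 0.0  # state s[n-2]
    for x in samples:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    # Squared magnitude of the single DFT bin from the last two states.
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def period3_scores(dna, window=351, step=3):
    """Sliding-window period-3 power (assumed measure, see the lead-in).

    `window` must be a multiple of 3 so that bin k = window / 3
    corresponds exactly to frequency 1/3.
    """
    if window % 3 != 0:
        raise ValueError("window must be a multiple of 3")
    k = window // 3
    scores = []
    for start in range(0, len(dna) - window + 1, step):
        segment = dna[start:start + window]
        total = 0.0
        for base in "ACGT":
            # Binary indicator sequence for this base.
            indicator = [1.0 if c == base else 0.0 for c in segment]
            total += goertzel_power(indicator, k)
        scores.append(total)
    return scores


if __name__ == "__main__":
    # A toy sequence with perfect period 3 versus one with period 4.
    print(period3_scores("ATG" * 40, window=36)[:3])
    print(period3_scores("ACGT" * 30, window=36)[:3])
```

Goertzel's recurrence evaluates a single DFT bin in O(N) per window without computing the full spectrum, which suits a measure that needs only one frequency. Each window's score is also independent of the others, which is the kind of property a construct such as PARFOR can exploit by distributing windows or sequence segments across workers.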

Tags: Algorithms, Bioinformatics, Biology, CUDA, FFT, Genetics, nVidia, nVidia GeForce GTX 560

April 13, 2012 by hgpu
