GPU-accelerated HMM for Speech Recognition


Leiming Yu, Yash Ukidave, David Kaeli

Department of Electrical and Computer Engineering, Northeastern University, Boston, MA, USA

Workshop Series on Heterogeneous and Unconventional Cluster Architectures and Applications (HUCAA), 2014


Source code package: pariir


Speech recognition is used in a wide range of applications and devices, such as mobile phones, in-car entertainment systems and web-based services. Hidden Markov Models (HMMs) are one of the most popular algorithmic approaches applied in speech recognition. Training and testing an HMM is computationally intensive and time-consuming. Running multiple applications concurrently with speech recognition can overwhelm the compute resources and introduce unwanted delays in speech processing, eventually dropping words due to buffer overruns. Graphics processing units (GPUs) have become widely accepted as accelerators that offer massive amounts of parallelism. The host processor (the CPU) can offload compute-intensive portions of an application to the GPU, leaving the CPU to focus on serial tasks and scheduling operations. In this paper, we provide a parallelized Hidden Markov Model to accelerate isolated-word speech recognition. We experiment with different optimization schemes and make use of optimized GPU computing libraries to speed up the computation on GPUs. We also explore the performance benefits of using advanced GPU features for concurrent execution of multiple compute kernels. The algorithms are evaluated on multiple Nvidia GPUs using CUDA as the programming framework. Our GPU implementation achieves better performance than traditional serial and multithreaded implementations. When considering the end-to-end performance of the application, which includes both data transfer and computation, we achieve a 9x speedup for training on a GPU over a multithreaded version optimized for a multi-core CPU.
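To illustrate why HMM evaluation maps well onto parallel hardware, the following is a minimal serial sketch of the standard forward algorithm in plain Python. It is not the authors' implementation: the function name `forward_likelihood`, the argument layout, and the use of discrete emission probabilities are assumptions made for illustration (speech systems commonly use continuous emission densities instead). The point is that, within each time step, every state's new value depends only on the previous step, so the per-state computations are independent of one another.

```python
def forward_likelihood(pi, A, B, obs):
    """Return P(obs | model) for a discrete-observation HMM (illustrative sketch).

    pi[i]   : initial probability of state i
    A[i][j] : transition probability from state i to state j
    B[j][k] : probability that state j emits symbol k
    obs     : non-empty list of observed symbol indices
    """
    n = len(pi)
    # Initialisation: alpha_0(j) = pi_j * b_j(o_0)
    alpha = [pi[j] * B[j][obs[0]] for j in range(n)]
    # Induction over time: the time steps must run in order ...
    for o in obs[1:]:
        # ... but the n entries computed here read only the previous alpha,
        # so each j could be handled by a separate parallel thread.
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o]
            for j in range(n)
        ]
    # Termination: sum over all final states
    return sum(alpha)
```

The inner update is a vector-matrix product followed by an element-wise multiply, which is the kind of operation that optimized GPU linear algebra libraries accelerate. This sketch omits the scaling or log-space arithmetic that long observation sequences need to avoid numerical underflow.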

Tags: Algorithms, Computer science, CUDA, nVidia, nVidia GeForce GTX 680, nVidia GeForce GTX Titan, Package, Speech recognition

February 10, 2015 by hgpu
