high performance computing on graphics processing units: hgpu.org


Characterization of Speech Recognition Systems on GPU Architectures

Albert Segura Salvador

Departament d’Arquitectura de Computadors, Universitat Politecnica de Catalunya

Universitat Politecnica de Catalunya, 2016

@article{segura2016characterization,

title={Characterization of Speech Recognition Systems on GPU Architectures},

author={Segura Salvador, Albert},

year={2016},

publisher={Universitat Polit{\`e}cnica de Catalunya}

}


Automatic speech recognition is one of the most important applications in the area of cognitive computing. Mobile devices, such as smartphones, have incorporated speech recognition as one of the main interfaces for user interaction, and this trend towards voice-based user interfaces is likely to continue in the coming years. Effective speech recognition systems require real-time recognition, which is very demanding for CPU architectures. GPU architectures offer parallelization capabilities that can be exploited to increase the performance of speech recognition systems. However, efficiently utilizing GPU resources for speech recognition is challenging, as the software implementations exhibit irregular and unpredictable memory accesses and poor temporal locality.

Our key ambition is to characterize the performance and energy bottlenecks of speech recognition systems when running on a modern GPU, with the aim of providing useful information for designing future GPU architectures. First, we develop a GPU version of the Viterbi search algorithm, which is known to be by far the main bottleneck in speech recognition systems. Second, we analyse the GPU architecture to find the main sources of stalls in the pipeline and the energy bottlenecks. We show that memory stalls are the main reason for the low utilization of GPU resources.

We then explore a number of architectural modifications to state-of-the-art GPU architectures in order to deal with the performance-limiting factors, i.e. the memory bottlenecks, and propose a GPU configuration highly tuned for speech recognition. The exploration evaluates different parameters for the memory hierarchy, including the L1 data cache, the L2 cache and the memory controller. We also consider modifications to the core resources and frequency scaling, in order to significantly reduce the number of idle cycles spent waiting for memory and the underutilization of functional units.

Our proposed GPU configuration is able to achieve real-time performance for large-vocabulary speech recognition, while increasing the issue rate from 5.1% to 18.1% and achieving a power reduction of 31.6%, an energy reduction of 24% and an area reduction of 17.96%.
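For readers unfamiliar with the Viterbi search named in the abstract, the following is a minimal sequential sketch in Python on a toy hidden Markov model. It is not the thesis's GPU implementation: the function name `viterbi`, the dictionary-based model layout and the two-state example data are all assumptions made purely for illustration, and a production speech decoder would search a far larger graph with pruning.

```python
import math


def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return (best_log_prob, best_path) for an observation sequence.

    All probabilities are given as natural logarithms so that products
    become sums and long sequences do not underflow.
    """
    # score[s]: best log-probability of any state path that ends in s
    score = {s: log_start[s] + log_emit[s][obs[0]] for s in states}
    backptrs = []  # one {state: best predecessor} dict per later time step

    for o in obs[1:]:
        new_score, ptr = {}, {}
        for s in states:
            # Choose the predecessor that maximizes the path score into s.
            prev = max(states, key=lambda p: score[p] + log_trans[p][s])
            new_score[s] = score[prev] + log_trans[prev][s] + log_emit[s][o]
            ptr[s] = prev
        backptrs.append(ptr)
        score = new_score

    # Trace back from the best final state to recover the full path.
    last = max(states, key=lambda s: score[s])
    path = [last]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    path.reverse()
    return score[last], path


def to_log(table):
    """Convert a nested dict of probabilities to log-probabilities."""
    return {k: {j: math.log(p) for j, p in row.items()} for k, row in table.items()}


# Hypothetical toy model (illustrative values, not data from the thesis).
states = ["Healthy", "Fever"]
log_start = {"Healthy": math.log(0.6), "Fever": math.log(0.4)}
log_trans = to_log({
    "Healthy": {"Healthy": 0.7, "Fever": 0.3},
    "Fever": {"Healthy": 0.4, "Fever": 0.6},
})
log_emit = to_log({
    "Healthy": {"normal": 0.5, "cold": 0.4, "dizzy": 0.1},
    "Fever": {"normal": 0.1, "cold": 0.3, "dizzy": 0.6},
})

best_log_prob, best_path = viterbi(
    ["normal", "cold", "dizzy"], states, log_start, log_trans, log_emit
)
print(best_path)  # ['Healthy', 'Healthy', 'Fever']
```

In this sketch, the per-state maximization within one time step is independent across states, which is the kind of work a GPU version could spread over threads. Which predecessors and scores are read depends on the data, which is consistent with the irregular memory accesses the abstract describes.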

Tags: Algorithms, Computer science, CUDA, Deep learning, GPGPU-sim, nVidia, nVidia GeForce GTX 980, Speech recognition, Thesis

September 22, 2016 by hgpu

