high performance computing on graphics processing units: hgpu.org

Speech Recognition on Multi-Core Processors and GPUs

Patrick Cardinal

Ecole De Technologie Superieure, Universite Du Quebec

Universite Du Quebec, 2013

Processor clock speeds have remained stable over the past few years. The trend may even be towards slower speeds in order to satisfy the ever-increasing demand for energy efficiency, a tendency already apparent in mobile devices. To take full advantage of the processing power offered by modern and future processors, applications must integrate parallelism, and speech recognition is no exception.

The classic Viterbi decoding algorithm, a dynamic programming approach for searching the recognition network, does not make full use of this power. The main reason is that the algorithm searches through a knowledge graph containing millions of nodes and transitions. In practice, a thorough search through such an enormous network is infeasible, so the graph is pruned to retain only the most promising hypotheses. The pruning process, however, makes poor use of the memory architecture of Intel-based computers.

To overcome this problem, another search algorithm is proposed: the A* search. This type of search uses a heuristic that provides an approximation of the distance remaining to the final node. With a good heuristic, only a negligible number of nodes have to be explored, which shifts the computational load from the network search to the computation of the heuristic, which is in turn designed to make optimal use of modern processor architectures. The heuristic is based on a much smaller knowledge graph for speech recognition. Because of its small size, this graph can be explored exhaustively, which eliminates the memory architecture problems described above.

Acoustic model computations represent an important component of speech recognition. For this task, a 3.6x speedup was achieved on a quad-core processor with respect to the single-core version. On a GPU, the speedup is 24.8x with respect to the sequential version.

With regard to the recognition network search, the A* algorithm is shown to explore 28 times fewer nodes than the sequential version of the original algorithm. In addition, the heuristic computation is 4.1 times faster on a quad-core processor and 10.1 times faster on a GPU than the sequential version. Overall, the new parallelized version offers a 4% absolute increase in real-time recognition accuracy compared to the classic version.
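The idea of steering A* with a heuristic computed exhaustively on a much smaller graph can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the toy graph, the `cluster_of` mapping that collapses detailed nodes into a small abstract graph, and all function names are hypothetical. The abstract graph keeps the cheapest edge between each pair of clusters, so the costs computed on it never overestimate the true remaining cost.

```python
import heapq

INF = float("inf")

def a_star(graph, start, goal, heuristic):
    """Return (cost, path, expanded) for the cheapest start -> goal path."""
    # Queue entries are ordered by f = g (cost so far) + h (estimated remainder).
    frontier = [(heuristic(start), 0.0, start, [start])]
    best_g = {start: 0.0}
    expanded = 0
    while frontier:
        _, g, node, path = heapq.heappop(frontier)
        if g > best_g.get(node, INF):
            continue  # stale entry: a cheaper route to this node was found later
        expanded += 1
        if node == goal:
            return g, path, expanded
        for nxt, cost in graph.get(node, []):
            new_g = g + cost
            if new_g < best_g.get(nxt, INF):
                best_g[nxt] = new_g
                heapq.heappush(
                    frontier, (new_g + heuristic(nxt), new_g, nxt, path + [nxt])
                )
    return INF, [], expanded

def build_abstract_graph(graph, cluster_of):
    """Collapse nodes into clusters, keeping the cheapest edge between clusters."""
    abstract = {}
    for node, edges in graph.items():
        for nxt, cost in edges:
            a, b = cluster_of[node], cluster_of[nxt]
            if a == b:
                continue  # moves inside a cluster are treated as free
            nbrs = abstract.setdefault(a, {})
            nbrs[b] = min(cost, nbrs.get(b, INF))
    return {a: list(nbrs.items()) for a, nbrs in abstract.items()}

def exhaustive_costs_to_goal(graph, goal):
    """Exact cost from every node to goal (Dijkstra over reversed edges)."""
    reverse = {}
    for node, edges in graph.items():
        for nxt, cost in edges:
            reverse.setdefault(nxt, []).append((node, cost))
    dist = {goal: 0.0}
    queue = [(0.0, goal)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist[node]:
            continue
        for prev, cost in reverse.get(node, []):
            if d + cost < dist.get(prev, INF):
                dist[prev] = d + cost
                heapq.heappush(queue, (d + cost, prev))
    return dist

# Hypothetical toy network: a short route via a1/b1 and a long detour via c1..c4.
graph = {
    "s": [("a1", 1), ("a2", 4), ("c1", 1)],
    "a1": [("b1", 2)],
    "a2": [("b1", 1)],
    "b1": [("g", 3)],
    "c1": [("c2", 1)],
    "c2": [("c3", 1)],
    "c3": [("c4", 1)],
    "c4": [("g", 10)],
}
cluster_of = {"s": "S", "a1": "A", "a2": "A", "b1": "B",
              "c1": "C", "c2": "C", "c3": "C", "c4": "C", "g": "G"}

# The small graph is searched exhaustively once; its costs become the heuristic.
coarse = exhaustive_costs_to_goal(build_abstract_graph(graph, cluster_of), "G")

def guided(node):
    return coarse.get(cluster_of[node], INF)

cost, path, expanded_guided = a_star(graph, "s", "g", guided)
_, _, expanded_blind = a_star(graph, "s", "g", lambda node: 0.0)
print(cost, path, expanded_guided, expanded_blind)
```

With a zero heuristic the search degenerates into a uniform-cost search and wanders into the detour before reaching the goal, whereas the guided search expands fewer nodes and finds the same path. In this sketch the work moves into `exhaustive_costs_to_goal`, a regular pass over a small graph, which is the kind of computation that lends itself to parallel hardware.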

Tags: Algorithms, CUDA, nVidia, nVidia GeForce GTX 295, Signal processing, Speech recognition, Thesis

August 13, 2013 by hgpu
