high performance computing on graphics processing units: hgpu.org

Accelerating recurrent neural network language model based online speech recognition system

Kyungmin Lee, Chiyoun Park, Namhoon Kim, Jaewon Lee

DMC R&D Center, Samsung Electronics, Seoul, Korea

arXiv:1801.09866 [cs.CL], (30 Jan 2018)

@article{lee2018accelerating,
  title={Accelerating recurrent neural network language model based online speech recognition system},
  author={Lee, Kyungmin and Park, Chiyoun and Kim, Namhoon and Lee, Jaewon},
  year={2018},
  month={jan},
  archivePrefix={"arXiv"},
  primaryClass={cs.CL}
}

This paper presents methods to accelerate recurrent neural network language models (RNNLMs) for online speech recognition systems. First, lossy compression of the past hidden layer outputs (the history vector), combined with caching, is introduced to reduce the number of LM queries. Next, RNNLM computations are deployed in a CPU-GPU hybrid manner, in which each layer of the model is computed on the platform better suited to it. The overhead added by data exchanges between the CPU and GPU is offset by a frame-wise batching strategy. Evaluation on the LibriSpeech test sets indicates that reducing the precision of the history vector improves the average recognition speed by a factor of 1.23 with minimal degradation in accuracy. The CPU-GPU hybrid parallelization, in turn, enables RNNLM-based real-time recognition with a fourfold improvement in speed.
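The two ideas in the abstract, caching on a reduced-precision history vector and batching the LM queries of one frame, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the history vector is compressed, so uniform rounding with a step size `step` is an assumption, and the names `quantize`, `CachedBatchedLM`, and `batch_score_fn` are hypothetical. `batch_score_fn` stands in for whatever batched RNNLM forward pass the system would run (for example, on the GPU).

```python
from typing import Callable, Dict, List, Sequence, Tuple

Request = Tuple[Sequence[float], int]   # (history vector, word id)
Key = Tuple[Tuple[int, ...], int]       # (quantized history, word id)


def quantize(history: Sequence[float], step: float = 0.1) -> Tuple[int, ...]:
    """Lossy compression (assumed scheme): round each component to a multiple of `step`."""
    return tuple(round(h / step) for h in history)


class CachedBatchedLM:
    """Caches LM scores by quantized history and scores each frame's misses in one batch."""

    def __init__(self, batch_score_fn: Callable[[List[Request]], List[float]],
                 step: float = 0.1) -> None:
        self.batch_score_fn = batch_score_fn  # hypothetical batched RNNLM call
        self.step = step
        self.cache: Dict[Key, float] = {}
        self.batch_calls = 0   # number of batched model invocations
        self.scored = 0        # number of requests that reached the model

    def score_frame(self, requests: List[Request]) -> List[float]:
        """Return one score per request, querying the model at most once per frame."""
        keys = [(quantize(h, self.step), w) for h, w in requests]

        # Collect cache misses, deduplicated by quantized key.
        pending: Dict[Key, Request] = {}
        for key, req in zip(keys, requests):
            if key not in self.cache and key not in pending:
                pending[key] = req

        # A single batched call covers every miss in this frame.
        if pending:
            self.batch_calls += 1
            self.scored += len(pending)
            scores = self.batch_score_fn(list(pending.values()))
            self.cache.update(zip(pending.keys(), scores))

        return [self.cache[k] for k in keys]
```

In this sketch, two hypotheses whose history vectors differ only below the rounding step share one cache entry, so the second one never reaches the model. A coarser `step` raises the hit rate at the cost of returning scores computed for a slightly different history, which mirrors the speed versus accuracy trade-off the abstract reports.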

Tags: Computer science, CUDA, Deep learning, Neural networks, nVidia, RNN, Speech recognition, Tesla K80

February 3, 2018 by hgpu
