Recurrent Neural Networks Hardware Implementation on FPGA

Andre Xian Ming Chang, Berin Martini, Eugenio Culurciello

Department of Electrical and Computer Engineering, Purdue University, West Lafayette, IN 47907, USA

arXiv:1511.05552 [cs.NE], (17 Nov 2015)

@article{chang2015recurrent,
   title={Recurrent Neural Networks Hardware Implementation on FPGA},
   author={Chang, Andre Xian Ming and Martini, Berin and Culurciello, Eugenio},
   year={2015},
   month={nov},
   archivePrefix={"arXiv"},
   primaryClass={cs.NE}
}

Recurrent Neural Networks (RNNs) can retain memory and learn data sequences, and are a recent breakthrough in machine learning. Because of their recurrent nature, it is sometimes hard to parallelize all of their computations on conventional hardware. CPUs do not currently offer large parallelism, while GPUs offer limited parallelism due to branching in RNN models. In this paper we present a hardware implementation of a Long Short-Term Memory (LSTM) recurrent network on the programmable logic of the Zynq 7020 FPGA from Xilinx. We implemented an RNN with 2 layers and 128 hidden units in hardware and tested it using a character-level language model. The implementation is more than $21\times$ faster than the ARM CPU embedded on the Zynq 7020 FPGA. This work can potentially evolve into an RNN co-processor for future mobile devices.
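To make the recurrence mentioned in the abstract concrete, here is a minimal software sketch of a 2-layer, 128-unit LSTM stepping over one-hot encoded characters. It uses the standard LSTM gate equations without peephole connections; the abstract does not say which variant the hardware implements, so treat this as an illustration rather than the authors' design. The vocabulary size `VOCAB`, the random weight initialisation, and all function names are assumptions made for the example.

```python
import math
import random

HIDDEN = 128  # hidden units per layer, as stated in the abstract
LAYERS = 2    # stacked LSTM layers, as stated in the abstract
VOCAB = 65    # ASSUMPTION: character vocabulary size, not given in the source


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_layer(n_in, n_hidden, rng):
    """Random weights for the input, forget, output and candidate gates."""
    def mat():
        return [[rng.uniform(-0.1, 0.1) for _ in range(n_in + n_hidden)]
                for _ in range(n_hidden)]
    return {g: {"W": mat(), "b": [0.0] * n_hidden} for g in "ifoc"}


def affine(gate, xh):
    """Compute W @ [x; h] + b for one gate."""
    return [sum(w * v for w, v in zip(row, xh)) + b
            for row, b in zip(gate["W"], gate["b"])]


def lstm_step(params, x, h_prev, c_prev):
    """One time step of a standard LSTM cell (no peepholes)."""
    xh = x + h_prev  # concatenate the input with the previous hidden state
    i = [sigmoid(v) for v in affine(params["i"], xh)]    # input gate
    f = [sigmoid(v) for v in affine(params["f"], xh)]    # forget gate
    o = [sigmoid(v) for v in affine(params["o"], xh)]    # output gate
    g = [math.tanh(v) for v in affine(params["c"], xh)]  # candidate memory
    c = [fv * cp + iv * gv for fv, cp, iv, gv in zip(f, c_prev, i, g)]
    h = [ov * math.tanh(cv) for ov, cv in zip(o, c)]
    return h, c


def run(char_ids, seed=0):
    """Feed a sequence of character indices through the stacked LSTM."""
    rng = random.Random(seed)
    layers = [init_layer(VOCAB if k == 0 else HIDDEN, HIDDEN, rng)
              for k in range(LAYERS)]
    state = [([0.0] * HIDDEN, [0.0] * HIDDEN) for _ in range(LAYERS)]
    for idx in char_ids:
        x = [0.0] * VOCAB
        x[idx] = 1.0  # one-hot encoding of the current character
        for k, params in enumerate(layers):
            h, c = lstm_step(params, x, *state[k])
            state[k] = (h, c)
            x = h  # the hidden state of layer k is the input to layer k + 1
    return state


final_state = run([1, 2, 3])
```

Each call to `lstm_step` needs the `h` and `c` produced by the previous step, so time steps cannot be computed in parallel. Within a single step, the four gate products are independent of one another, which is the kind of parallelism a hardware implementation can exploit.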

Tags: ARM, Computer science, FPGA, Machine learning, Neural networks, nVidia, nVidia Jetson TK1

November 20, 2015 by hgpu
