high performance computing on graphics processing units: hgpu.org

PipeCNN: An OpenCL-Based FPGA Accelerator for Large-Scale Convolution Neuron Networks

Dong Wang, Jianjing An, Ke Xu

Institute of Information Science, Beijing Jiaotong University, Beijing 100044, China

arXiv:1611.02450 [cs.AR], (8 Nov 2016)

@article{wang2016pipecnn,

title={PipeCNN: An OpenCL-Based FPGA Accelerator for Large-Scale Convolution Neuron Networks},

author={Wang, Dong and An, Jianjing and Xu, Ke},

year={2016},

month={nov},

archivePrefix={arXiv},

primaryClass={cs.AR}

}

Source codes

Package:

PipeCNN: An OpenCL-based FPGA Accelerator for Convolutional Neural Networks

Convolutional neural networks (CNNs) are widely employed in applications such as image classification, video analysis and speech recognition. Because CNNs are compute-intensive, they are mainly accelerated by GPUs, which dissipate considerable power. Recent studies have explored FPGAs as CNN accelerators because of their reconfigurability and their energy-efficiency advantage over GPUs, especially now that OpenCL-based high-level synthesis tools provide fast verification and implementation flows. Previous OpenCL-based designs focused only on creating a generic framework to identify performance-related hardware parameters, without exploiting the FPGA's ability to pipeline kernel functions to minimize memory bandwidth requirements. In this work, we propose an FPGA accelerator with a new architecture of deeply pipelined OpenCL kernels. Data reuse and task mapping techniques are also presented to improve design efficiency. The proposed schemes are verified by implementing two representative large-scale CNNs, AlexNet and VGG, on an Altera Stratix-V A7 FPGA. We achieve a similar peak performance of 33.9 GOPS with a 34% reduction in DSP block usage compared to previous work. Our design is openly accessible and can thus be reused to explore new architectures for neural network accelerators.
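
To make the bandwidth argument in the abstract concrete, the following is a minimal sketch, not the PipeCNN kernels themselves. It uses Python generators as a stand-in for on-chip channels between kernels and counts how many elements cross a toy "global memory" when stages run separately versus when they are chained. The stage sequence (1-D convolution, ReLU, max pooling), the `GlobalMemory` class and all function names are assumptions made for illustration; weight fetches are not counted.

```python
class GlobalMemory:
    """Toy stand-in for off-chip memory that counts element reads and writes."""

    def __init__(self):
        self.reads = 0
        self.writes = 0

    def load(self, data):
        # Stream elements out of "global memory", one read per element.
        for x in data:
            self.reads += 1
            yield x

    def store(self, stream):
        # Drain a stream into "global memory", one write per element.
        out = []
        for x in stream:
            self.writes += 1
            out.append(x)
        return out


def conv1d(stream, weights):
    """Sliding-window dot product (no padding) over a stream of values."""
    window = []
    for x in stream:
        window.append(x)
        if len(window) == len(weights):
            yield sum(w * v for w, v in zip(weights, window))
            window.pop(0)


def relu(stream):
    for x in stream:
        yield max(0, x)


def max_pool(stream, size=2):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    buf = []
    for x in stream:
        buf.append(x)
        if len(buf) == size:
            yield max(buf)
            buf = []


def run_separate(data, weights):
    """Each stage is its own pass: results go to memory and are read back."""
    mem = GlobalMemory()
    a = mem.store(conv1d(mem.load(data), weights))
    b = mem.store(relu(mem.load(a)))
    c = mem.store(max_pool(mem.load(b)))
    return c, mem.reads + mem.writes


def run_pipelined(data, weights):
    """Stages are chained: only the input read and final write touch memory."""
    mem = GlobalMemory()
    c = mem.store(max_pool(relu(conv1d(mem.load(data), weights))))
    return c, mem.reads + mem.writes
```

Calling `run_separate` and `run_pipelined` on the same input returns the same result, but the chained version touches the toy memory only for the initial read and the final write. On a real FPGA the savings depend on the actual layer shapes, buffering and data reuse, which this model does not capture.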

Tags: CNN, Computer science, Deep learning, DSP, FPGA, Hardware Architecture, Neural networks, OpenCL, Package

November 10, 2016 by hgpu
