A High-efficiency FPGA-based Accelerator for Convolutional Neural Networks using Winograd Algorithm
Science and Technology on Parallel and Distributed Laboratory, National University of Defence Technology, Changsha 410073, China
Journal of Physics: Conference Series, Volume 1026, 012019, 2018
@inproceedings{huang2018high,
title={A High-efficiency FPGA-based Accelerator for Convolutional Neural Networks using Winograd Algorithm},
author={Huang, Y and Shen, J and Wang, Z and Wen, M and Zhang, C},
booktitle={Journal of Physics: Conference Series},
volume={1026},
number={1},
pages={012019},
year={2018},
organization={IOP Publishing}
}
Convolutional neural networks (CNNs) are widely used in many computer vision applications. Previous FPGA implementations of CNNs are mainly based on the conventional convolution algorithm. However, the high arithmetic complexity of conventional convolution restricts accelerator performance and makes the design significantly more challenging. It has been shown that the Winograd algorithm can effectively reduce the computational complexity of CNNs. Although a few FPGA approaches based on the Winograd algorithm have been implemented, these works lack an evaluation of performance across different Winograd tile sizes. In this work, we focus on exploring the possibility of using the Winograd algorithm to accelerate CNNs on FPGAs. First, we propose an accelerator architecture that applies to both convolutional layers and fully connected layers. Second, we use a high-level synthesis tool to implement our design quickly. Finally, we evaluate our accelerator with different tile sizes in terms of resource utilization, performance and efficiency. On the VUS440 platform, we achieve an average of 943 GOPS over the whole of VGG16 at low resource utilization, which is a higher efficiency than state-of-the-art works on FPGAs.
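To illustrate the kind of saving the abstract refers to, here is a minimal Python sketch of the smallest standard one-dimensional Winograd case, F(2,3): two outputs of a 3-tap filter computed with 4 multiplications instead of 6. The function names `direct_f23` and `winograd_f23` are hypothetical, and the abstract does not say which tile sizes the paper evaluates, so this should be read as a generic example of the technique and not as the authors' design.

```python
def direct_f23(d, g):
    """Conventional sliding-window filter: 2 outputs, 3 taps, 6 multiplications."""
    return [
        d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
        d[1] * g[0] + d[2] * g[1] + d[3] * g[2],
    ]


def winograd_f23(d, g):
    """Winograd F(2,3): the same 2 outputs using 4 data-by-filter multiplications."""
    # Filter transform. It depends only on the weights, so in a CNN it can be
    # computed once per filter and reused for every input tile.
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) / 2
    u2 = (g[0] - g[1] + g[2]) / 2
    u3 = g[2]

    # Input transform (additions only) followed by 4 element-wise multiplications.
    m1 = (d[0] - d[2]) * u0
    m2 = (d[1] + d[2]) * u1
    m3 = (d[2] - d[1]) * u2
    m4 = (d[1] - d[3]) * u3

    # Output transform (additions only).
    return [m1 + m2 + m3, m2 - m3 - m4]


if __name__ == "__main__":
    tile = [1.0, 2.0, 3.0, 4.0]    # 4 input samples produce 2 outputs
    weights = [0.5, -1.0, 2.0]     # 3-tap filter
    print(direct_f23(tile, weights))
    print(winograd_f23(tile, weights))
```

Larger tiles reduce the multiplication count per output further but require more additions and larger transform constants, which is one reason a comparison across tile sizes, as the abstract describes, is informative for hardware.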
June 9, 2018 by hgpu