Fast convolutional neural networks on FPGAs with hls4ml

Thea Aarrestad, Vladimir Loncar, Maurizio Pierini, Sioni Summers, Jennifer Ngadiuba, Christoffer Petersson, Hampus Linander, Yutaro Iiyama, Giuseppe Di Guglielmo, Javier Duarte, Philip Harris, Dylan Rankin, Sergo Jindariani, Kevin Pedro, Nhan Tran, Mia Liu, Edward Kreinar, Zhenbin Wu, Duc Hoang
European Organization for Nuclear Research (CERN), CH-1211 Geneva 23, Switzerland
arXiv:2101.05108 [cs.LG], (13 Jan 2021)

@misc{aarrestad2021fast,
   title={Fast convolutional neural networks on FPGAs with hls4ml},
   author={Thea Aarrestad and Vladimir Loncar and Maurizio Pierini and Sioni Summers and Jennifer Ngadiuba and Christoffer Petersson and Hampus Linander and Yutaro Iiyama and Giuseppe Di Guglielmo and Javier Duarte and Philip Harris and Dylan Rankin and Sergo Jindariani and Kevin Pedro and Nhan Tran and Mia Liu and Edward Kreinar and Zhenbin Wu and Duc Hoang},
   year={2021},
   eprint={2101.05108},
   archivePrefix={arXiv},
   primaryClass={cs.LG}
}

We introduce an automated tool for deploying ultra-low-latency, low-power deep neural networks with large convolutional layers on FPGAs. By extending the hls4ml library, we demonstrate how to achieve an inference latency of 5 μs using convolutional architectures, while preserving state-of-the-art model performance. Considering benchmark models trained on the Street View House Numbers dataset, we demonstrate various methods of model compression to fit the computational constraints of a typical FPGA device. In particular, we discuss pruning and quantization-aware training, and demonstrate how resource utilization can be reduced by over 90% while maintaining the original model accuracy.
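The two compression techniques the abstract names can be illustrated in isolation. Below is a minimal NumPy sketch (not the hls4ml API; function names and parameters are illustrative) of magnitude pruning, which zeroes the smallest-magnitude weights so FPGA multipliers for them can be elided, and of rounding to a signed fixed-point representation in the spirit of the `ap_fixed<W, I>` types that hls4ml targets in high-level synthesis.

```python
import numpy as np

def magnitude_prune(weights, sparsity=0.9):
    """Illustrative magnitude pruning: zero out the smallest-magnitude
    fraction of weights. A 0.9 sparsity mirrors the >90% resource
    reduction discussed in the paper (assumption: simple one-shot
    pruning, not the iterative schedule a real training loop would use)."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    # k-th smallest magnitude becomes the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    return np.where(np.abs(weights) <= threshold, 0.0, weights)

def quantize_fixed(x, total_bits=8, int_bits=1):
    """Round to a signed fixed-point grid resembling ap_fixed<W, I>,
    where I (including the sign bit) integer bits leave W - I
    fractional bits of resolution."""
    frac_bits = total_bits - int_bits
    scale = 2.0 ** frac_bits
    lo = -(2.0 ** (int_bits - 1))               # most negative value
    hi = 2.0 ** (int_bits - 1) - 1.0 / scale    # most positive value
    return np.clip(np.round(np.asarray(x, dtype=float) * scale) / scale, lo, hi)

# Example: prune a random weight matrix and quantize the survivors
rng = np.random.default_rng(0)
w = rng.normal(size=(32, 32))
w_pruned = magnitude_prune(w, sparsity=0.9)
w_q = quantize_fixed(w_pruned, total_bits=8, int_bits=1)
print("sparsity:", np.mean(w_pruned == 0.0))
```

In quantization-aware training (e.g. with QKeras, as used with hls4ml) the quantizer is applied inside the forward pass during training so the network adapts to the reduced precision, rather than being quantized once after the fact as in this sketch.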

HGPU group © 2010-2021 hgpu.org

All rights belong to the respective authors
