Optimising Convolutional Neural Networks Inference on Low-Powered GPUs

Simon Rovder, Jose Cano, Michael O’Boyle
School of Informatics, University of Edinburgh, UK
12th International Workshop on Programmability and Architectures for Heterogeneous Multicores (MULTIPROG), 2019








In this paper we present effective optimisation techniques for accelerating convolutional neural network inference on low-powered heterogeneous devices with OpenCL. Using LeNet and VGG-16 as test networks, we implement a custom neural network system in OpenCL and optimise it to minimise the inference time of both networks. Our baseline system shows a speedup of 17x for LeNet. We also outline two methods for fast convolution: an iterative vectorised approach and a Morton GEMM-based approach. The two approaches achieve VGG-16 inference up to 3x faster than current state-of-the-art systems and outperform other custom neural network systems by speedup factors of up to 1.82x.
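The abstract does not show how the Morton GEMM-based approach is implemented, so the following is only a minimal sketch of the general idea behind it: storing matrices in Morton (Z-order) layout, where the bits of the row and column indices are interleaved so that elements close together in 2D are also close together in memory. It is written in plain Python rather than OpenCL for readability. The function names (`part1by1`, `morton_index`, `to_morton`, `morton_gemm`) are hypothetical, and the restriction to square matrices whose size is a power of two is an assumption made to keep the example short.

```python
def part1by1(x: int) -> int:
    """Spread the low 16 bits of x so a zero bit sits between each original bit."""
    x &= 0x0000FFFF
    x = (x | (x << 8)) & 0x00FF00FF
    x = (x | (x << 4)) & 0x0F0F0F0F
    x = (x | (x << 2)) & 0x33333333
    x = (x | (x << 1)) & 0x55555555
    return x


def morton_index(row: int, col: int) -> int:
    """Interleave bits: column bits land in even positions, row bits in odd ones."""
    return (part1by1(row) << 1) | part1by1(col)


def to_morton(matrix):
    """Copy a square row-major matrix (size a power of two, assumed) into a
    flat list stored in Morton order."""
    n = len(matrix)
    if n == 0 or n & (n - 1) != 0 or any(len(r) != n for r in matrix):
        raise ValueError("expected a square matrix whose size is a power of two")
    flat = [0] * (n * n)
    for r in range(n):
        for c in range(n):
            flat[morton_index(r, c)] = matrix[r][c]
    return flat


def morton_gemm(a_flat, b_flat, n):
    """Naive C = A * B where A, B and C are all n x n and Morton-ordered.
    This only illustrates the addressing; it is not a tuned kernel."""
    c_flat = [0] * (n * n)
    for i in range(n):
        for j in range(n):
            acc = 0
            for k in range(n):
                acc += a_flat[morton_index(i, k)] * b_flat[morton_index(k, j)]
            c_flat[morton_index(i, j)] = acc
    return c_flat
```

In a real GPU kernel the benefit would come from tiles of the operands being contiguous in memory, which this scalar sketch does not exploit; it only demonstrates the index mapping and that multiplication over the Morton layout gives the same result as over a row-major layout.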
