Acceleration of spiking neural networks in emerging multi-core and GPU architectures
Department of Electrical and Computer Engineering, Clemson University, Clemson, SC 29634, USA
IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW), 2010
@inproceedings{bhuiyan2010acceleration,
title={Acceleration of spiking neural networks in emerging multi-core and GPU architectures},
author={Bhuiyan, M.A. and Pallipuram, V.K. and Smith, M.C.},
booktitle={Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW), 2010 IEEE International Symposium on},
pages={1–8},
year={2010},
organization={IEEE}
}
Recently, there has been strong interest in large-scale simulations of biological spiking neural networks (SNN) to model the human brain mechanisms and capture its inference capabilities. Among various spiking neuron models, the Hodgkin-Huxley model is the oldest and most compute intensive, whereas the more recent Izhikevich model is very compute efficient. Some of the recent multi-core processors and accelerators including Graphical Processing Units, IBM’s Cell Broadband Engine, AMD Opteron, and Intel Xeon can take advantage of task and thread level parallelism, making them good candidates for large-scale SNN simulations. In this paper we implement and analyze two character recognition networks based on these spiking neuron models. We investigate the performance improvement and optimization techniques for SNNs on these accelerators over an equivalent software implementation on a 2.66 GHz Intel Core 2 Quad. We report significant speedups of the two SNNs on these architectures. It has been observed that given proper application of optimization techniques, the commodity X86 processors are viable options for those applications that require a nominal amount of flops/byte. But for applications with a significant number of flops/byte, specialized architectures such as GPUs and cell processors can provide better performance. Our results show that a proper match of architecture with algorithm complexity provides the best performance.
May 28, 2011 by hgpu