A case for neuromorphic ISAs


Atif Hashmi, Andrew Nere, James Jamal Thomas, Mikko Lipasti

University of Wisconsin-Madison, Madison, WI, USA

Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems, ASPLOS ’11, 2011

DOI:10.1145/1950365.1950385

@article{nere2011case,
  title={A Case for Neuromorphic ISAs},
  author={Nere, A. and Hashmi, A. and Thomas, J.J. and Lipasti, M.},
  year={2011}
}


The desire to create novel computing systems, paired with recent advances in the neuroscientific understanding of the brain, has led researchers to develop neuromorphic architectures that emulate the brain. To date, such models have been developed, trained, and deployed on the same substrate. However, excessive co-dependence between the substrate and the algorithm prevents portability or, at the very least, requires reconstructing and retraining the model whenever the substrate changes. This paper proposes a well-defined abstraction layer, the neuromorphic instruction set architecture (NISA), that separates a neural application’s algorithmic specification from the underlying execution substrate, and describes the Aivo framework, which demonstrates the concrete advantages of such an abstraction layer. Aivo consists of a NISA implementation for a rate-encoded neuromorphic system based on the cortical column abstraction, a state-of-the-art integrated development and runtime environment (IDE), and various profile-based optimization tools. Aivo’s IDE generates code for emulating cortical networks on the host CPU, on multiple GPGPUs, or as Boolean functions. Its runtime system can deploy and adaptively optimize cortical networks in a manner similar to conventional just-in-time compilers in managed runtime systems (e.g., Java, C#). We demonstrate the abilities of the NISA abstraction by constructing a cortical network model of the mammalian visual cortex, deploying it on multiple execution substrates, and applying the various optimization tools we have created. For this hierarchical configuration, Aivo’s profiling-based network optimization tools reduce the memory footprint by 50% and improve the execution time by a factor of 3 on the host CPU. Deploying the same network on a single GPGPU results in a 30x speedup. We further demonstrate that a speedup of 480x can be achieved by deploying a massively scaled cortical network across three GPGPUs. Finally, converting a trained hierarchical network to C/C++ Boolean constructs on the host CPU results in a 44x speedup.
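The separation the abstract describes, one substrate-independent network specification executed by interchangeable backends, can be illustrated with a small Python sketch. This is not Aivo's API or the paper's NISA: the names `Column`, `NetworkSpec`, `run_interpreted`, and `compile_to_boolean` are hypothetical, and the "column" here is reduced to a toy threshold unit over binary inputs (an assumption made only to keep the example short). The sketch shows two ideas from the abstract in miniature: the same specification runs on two different "substrates", and a fixed (already trained) network can be converted into pure Boolean functions.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Column:
    """Hypothetical toy unit: fires when at least `threshold` of its inputs are 1."""
    inputs: Tuple[int, ...]  # indices into the previous layer's outputs
    threshold: int


@dataclass(frozen=True)
class NetworkSpec:
    """Substrate-independent description: layers of columns, nothing else."""
    layers: Tuple[Tuple[Column, ...], ...]


def run_interpreted(spec: NetworkSpec, bits: Sequence[int]) -> List[int]:
    """Backend 1 (assumed): interpret the spec layer by layer with integer sums."""
    activity = list(bits)
    for layer in spec.layers:
        activity = [
            int(sum(activity[i] for i in col.inputs) >= col.threshold)
            for col in layer
        ]
    return activity


def compile_to_boolean(spec: NetworkSpec) -> Callable[[Sequence[int]], List[int]]:
    """Backend 2 (assumed): translate the spec once into Boolean functions.

    "At least k of n inputs are active" is rewritten as an OR over every
    k-element subset of the inputs, each subset being an AND.
    """
    def compile_column(col: Column) -> Callable[[List[bool]], bool]:
        terms = list(combinations(col.inputs, col.threshold))
        return lambda act: any(all(act[i] for i in term) for term in terms)

    compiled = [[compile_column(col) for col in layer] for layer in spec.layers]

    def run(bits: Sequence[int]) -> List[int]:
        act = [bool(b) for b in bits]
        for layer in compiled:
            act = [fn(act) for fn in layer]
        return [int(a) for a in act]

    return run


# A two-layer toy network over four binary inputs.
demo_spec = NetworkSpec(layers=(
    (Column(inputs=(0, 1), threshold=2), Column(inputs=(2, 3), threshold=1)),
    (Column(inputs=(0, 1), threshold=2),),
))

boolean_net = compile_to_boolean(demo_spec)
print(run_interpreted(demo_spec, [1, 1, 0, 1]))  # [1]
print(boolean_net([1, 1, 0, 1]))                 # [1]
```

Because both backends consume the same `demo_spec`, swapping one for the other requires no change to the network description. That is the portability argument in miniature; the reported speedups depend on the real substrates and are not reproduced by this toy.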

Tags: Algorithms, Computer science, CUDA, Java, MPI, nVidia, nVidia GeForce 9800 GX2, nVidia GeForce GTX 280, Optimization, Performance

September 22, 2011 by hgpu
