
Proteus: Exploiting Numerical Precision Variability in Deep Neural Networks

P. Judd, J. Albericio, N. Enright Jerger, A. Moshovos, T. Hetherington, T. Aamodt
Department of Electrical and Computer Engineering, University of Toronto, Toronto, Canada
2nd Workshop On Approximate Computing (WAPCO), 2016

@inproceedings{judd2016proteus,
   title={Proteus: Exploiting Numerical Precision Variability in Deep Neural Networks},
   author={Judd, Patrick and Albericio, Jorge and Enright Jerger, Natalie and Moshovos, Andreas and Hetherington, Tayler and Aamodt, Tor},
   booktitle={2nd Workshop on Approximate Computing (WAPCO)},
   year={2016}
}

This work exploits the tolerance of Deep Neural Networks (DNNs) to reduced-precision numerical representations and, specifically, their ability to use a different representation per layer while maintaining accuracy. This flexibility provides an additional opportunity to improve performance and energy over conventional DNN implementations that use a single, uniform representation for all layers throughout the network. This work exploits this property through PROTEUS, a layered extension over existing DNN implementations that converts between the numerical representation used by the DNN execution engines and a shorter, layer-specific fixed-point representation when reading and writing data values to memory, be it on-chip buffers or off-chip memory. When used with a modified layout of data in memory, PROTEUS can use a simple, low-cost, and low-energy conversion unit. On five popular DNNs, PROTEUS reduces data traffic among layers by 41% on average and by up to 44% compared to a baseline that uses a 16-bit fixed-point representation, while maintaining accuracy within 1% even when compared to a single-precision floating-point implementation. When incorporated into a state-of-the-art accelerator, PROTEUS improves energy by 14% while maintaining the same performance. When incorporated into a graphics processor, PROTEUS improves performance by 1% and energy by 4%, and reduces off-chip DRAM accesses by 46%.
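
In concrete terms, the conversion PROTEUS performs can be pictured as a per-layer fixed-point quantize/dequantize step wrapped around each layer's memory traffic. The sketch below is a minimal illustration, assuming each layer has been profiled for a (sign, integer, fraction) bit budget; the function names and NumPy-based implementation are illustrative assumptions, not the paper's conversion unit.

import numpy as np

def quantize(x, int_bits, frac_bits):
    # Store values in a layer-specific fixed-point format of
    # 1 (sign) + int_bits + frac_bits total bits; out-of-range
    # values saturate. (Illustrative stand-in for the paper's
    # hardware conversion unit.)
    scale = 2.0 ** frac_bits
    lo = -(2 ** (int_bits + frac_bits))
    hi = 2 ** (int_bits + frac_bits) - 1
    return np.clip(np.round(x * scale), lo, hi).astype(np.int32)

def dequantize(q, frac_bits):
    # Expand stored values back to the representation the
    # execution engine computes in (here, float32).
    return q.astype(np.float32) / (2.0 ** frac_bits)

# Example: a layer profiled to need 2 integer and 6 fraction bits,
# i.e. 9 bits per value instead of the 16-bit baseline.
acts = np.array([0.7312, -1.25, 3.9], dtype=np.float32)
stored = quantize(acts, int_bits=2, frac_bits=6)
restored = dequantize(stored, frac_bits=6)
print(stored, restored)

The traffic savings come from packing these shorter values densely in memory rather than one per 16-bit word, which is what the modified data layout mentioned above makes cheap to convert on the fly.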