A Metaprogramming and Autotuning Framework for Deploying Deep Learning Applications

Matthew W. Moskewicz, Ali Jannesari, Kurt Keutzer
University of California, Berkeley
arXiv:1611.06945 [cs.NE], (21 Nov 2016)


   title={A Metaprogramming and Autotuning Framework for Deploying Deep Learning Applications},

   author={Moskewicz, Matthew W. and Jannesari, Ali and Keutzer, Kurt},






In recent years, deep neural networks (DNNs), have yielded strong results on a wide range of applications. Graphics Processing Units (GPUs) have been one key enabling factor leading to the current popularity of DNNs. However, despite increasing hardware flexibility and software programming toolchain maturity, high efficiency GPU programming remains difficult: it suffers from high complexity, low productivity, and low portability. GPU vendors such as NVIDIA have spent enormous effort to write special-purpose DNN libraries. However, on other hardware targets, especially mobile GPUs, such vendor libraries are not generally available. Thus, the development of portable, open, high-performance, energy-efficient GPU code for DNN operations would enable broader deployment of DNN-based algorithms. Toward this end, this work presents a framework to enable productive, high-efficiency GPU programming for DNN computations across hardware platforms and programming models. In particular, the framework provides specific support for metaprogramming, autotuning, and DNN-tailored data types. Using our framework, we explore implementing DNN operations on three different hardware targets: NVIDIA, AMD, and Qualcomm GPUs. On NVIDIA GPUs, we show both portability between OpenCL and CUDA as well competitive performance compared to the vendor library. On Qualcomm GPUs, we show that our framework enables productive development of target-specific optimizations, and achieves reasonable absolute performance. Finally, On AMD GPUs, we show initial results that indicate our framework can yield reasonable performance on a new platform with minimal effort.
No votes yet.
Please wait...

* * *

* * *

Featured events

Hida Takayama, Japan

The Third International Workshop on GPU Computing and AI (GCA), 2018

Nagoya University, Japan

The 5th International Conference on Power and Energy Systems Engineering (CPESE), 2018

MediaCityUK, Salford Quays, Greater Manchester, England

The 10th International Conference on Information Management and Engineering (ICIME), 2018

No. 1037, Luoyu Road, Hongshan District, Wuhan, China

The 4th International Conference on Control Science and Systems Engineering (ICCSSE), 2018

Nanyang Executive Centre in Nanyang Technological University, Singapore

The 2018 International Conference on Cloud Computing and Internet of Things (CCIOT’18), 2018

HGPU group © 2010-2018 hgpu.org

All rights belong to the respective authors

Contact us: