Scaling Deep Learning on Multiple In-Memory Processors

Lifan Xu, Dong Ping Zhang, Nuwan Jayasena
AMD Research, Advanced Micro Devices, Inc.
3rd Workshop on Near-Data Processing, in conjunction with MICRO-48, 2015








Deep learning methods have proven to be state-of-the-art in addressing many challenges in machine learning domains. However, this comes at the cost of high computational requirements and energy consumption. The emergence of Processing In Memory (PIM) with die-stacking technology presents an opportunity to speed up deep learning computation and reduce energy consumption by providing low-cost, high-bandwidth memory accesses. PIM uses 3D die stacking to move computation closer to memory and thereby reduce data movement overheads. In this paper, we study the parallelization of deep learning methods on a system with multiple PIM devices. We select three typical layers from common deep learning models (the convolutional, pooling, and fully connected layers) and parallelize them using different schemes. Preliminary results show that multiple PIM devices can reach competitive or even better performance compared with traditional GPU parallelization.
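
The abstract does not say which parallelization schemes the paper uses. As an illustration only, the sketch below shows two common ways to split a fully connected layer across several devices: data parallelism (each device processes a slice of the batch) and model parallelism (each device holds a slice of the output neurons). All function names are hypothetical, and the "devices" are simulated as list partitions in plain Python rather than real PIM hardware.

```python
# Illustrative sketch only: devices are simulated as list partitions.
# None of these names come from the paper.

def fully_connected(x, w):
    """Compute y = x @ w for a batch x (B x I) and weights w (I x O)."""
    return [[sum(xi * wi for xi, wi in zip(row, col)) for col in zip(*w)]
            for row in x]

def split(items, parts):
    """Split a list into `parts` contiguous chunks of near-equal size."""
    size, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for p in range(parts):
        end = start + size + (1 if p < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks

def data_parallel_fc(x, w, num_devices):
    """Each simulated device holds all weights and a slice of the batch."""
    outputs = [fully_connected(chunk, w)
               for chunk in split(x, num_devices) if chunk]
    # Concatenate the per-device results along the batch dimension.
    return [row for out in outputs for row in out]

def model_parallel_fc(x, w, num_devices):
    """Each simulated device holds a slice of the output neurons."""
    columns = list(zip(*w))  # one tuple of weights per output neuron
    partials = []
    for cols in split(columns, num_devices):
        if not cols:
            continue  # more devices than output neurons
        w_slice = [list(r) for r in zip(*cols)]  # back to I x O_slice
        partials.append(fully_connected(x, w_slice))
    # Concatenate the per-device results along the output dimension.
    return [[v for part in partials for v in part[b]]
            for b in range(len(x))]

if __name__ == "__main__":
    x = [[1, 2], [3, 4], [5, 6]]   # batch of 3 inputs, 2 features each
    w = [[1, 0, 2], [0, 1, 3]]     # 2 inputs -> 3 outputs
    print(data_parallel_fc(x, w, 2))
    print(model_parallel_fc(x, w, 2))
```

Both functions return the same result as the single-device `fully_connected`. The trade-off they illustrate is what gets replicated: data parallelism copies the weights to every device, while model parallelism copies the input batch.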