Persistent RNNs: Stashing Recurrent Weights On-Chip

Gregory Diamos, Shubho Sengupta, Bryan Catanzaro, Mike Chrzanowski, Adam Coates, Erich Elsen, Jesse Engel, Awni Hannun, Sanjeev Satheesh
Baidu Silicon Valley AI Lab, 1195 Bordeaux Drive, Sunnyvale, CA 94089, United States
The 33rd International Conference on Machine Learning, 2016


   title={Persistent RNNs: Stashing Recurrent Weights On-Chip},

   author={Diamos, Greg and Sengupta, Shubho and Catanzaro, Bryan and Chrzanowski, Mike and Coates, Adam and Elsen, Erich and Engel, Jesse and Hannun, Awni and Satheesh, Sanjeev},

   booktitle={Proceedings of The 33rd International Conference on Machine Learning},




This paper introduces a new technique for mapping Deep Recurrent Neural Networks (RNN) efficiently onto GPUs. We show how it is possible to achieve substantially higher computational throughput at low mini-batch sizes than direct implementations of RNNs based on matrix multiplications. The key to our approach is the use of persistent computational kernels that exploit the GPU’s inverted memory hierarchy to reuse network weights over multiple timesteps. Our initial implementation sustains 2.8 TFLOP/s at a minibatch size of 4 on an NVIDIA Titan X GPU. This provides a 16x reduction in activation memory footprint, enables model training with 12x more parameters on the same hardware, allows us to strongly scale RNN training to 128 GPUs, and allows us to efficiently explore end-to-end speech recognition models with over 100 layers.
Rating: 2.3/5. From 13 votes.
Please wait...

* * *

* * *

Featured events

Hida Takayama, Japan

The Third International Workshop on GPU Computing and AI (GCA), 2018

Nagoya University, Japan

The 5th International Conference on Power and Energy Systems Engineering (CPESE), 2018

MediaCityUK, Salford Quays, Greater Manchester, England

The 10th International Conference on Information Management and Engineering (ICIME), 2018

No. 1037, Luoyu Road, Hongshan District, Wuhan, China

The 4th International Conference on Control Science and Systems Engineering (ICCSSE), 2018

Nanyang Executive Centre in Nanyang Technological University, Singapore

The 2018 International Conference on Cloud Computing and Internet of Things (CCIOT’18), 2018

HGPU group © 2010-2018 hgpu.org

All rights belong to the respective authors

Contact us: