Pipelined Training with Stale Weights of Deep Convolutional Neural Networks
The Edward S. Rogers Sr. Department of Electrical and Computer Engineering, University of Toronto
arXiv:1912.12675 [cs.DC], (29 Dec 2019)
@misc{zhang2019pipelined,
  title={Pipelined Training with Stale Weights of Deep Convolutional Neural Networks},
  author={Lifu Zhang and Tarek S. Abdelrahman},
  year={2019},
  eprint={1912.12675},
  archivePrefix={arXiv},
  primaryClass={cs.DC}
}
The growth in the complexity of Convolutional Neural Networks (CNNs) is increasing interest in partitioning a network across multiple accelerators during training and pipelining the backpropagation computations over the accelerators. Existing approaches avoid or limit the use of stale weights through techniques such as micro-batching or weight stashing. These techniques either underutilize accelerators or increase memory footprint. We explore the impact of stale weights on statistical efficiency and performance in a pipelined backpropagation scheme that maximizes accelerator utilization and keeps memory overhead modest. We use 4 CNNs (LeNet-5, AlexNet, VGG and ResNet) and show that when pipelining is limited to early layers in a network, training with stale weights converges and yields models with inference accuracies comparable to those of non-pipelined training on the MNIST and CIFAR-10 datasets; the drop in accuracy is 0.4%, 4%, 0.83% and 1.45% for the 4 networks, respectively. However, when pipelining extends deeper into the network, inference accuracies drop significantly. We propose combining pipelined and non-pipelined training in a hybrid scheme to address this drop. We demonstrate the implementation and performance of our pipelined backpropagation in PyTorch on 2 GPUs using ResNet, achieving speedups of up to 1.8X over a 1-GPU baseline, with a small drop in inference accuracy.
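To illustrate the idea, the sketch below emulates, on a single device, the weight staleness that a pipeline of depth D induces on the early layers: gradients computed for the "pipelined" front stage at step t are only applied D steps later, while the later layers are updated immediately. The toy model, the split point, the delay D, and the random data are illustrative assumptions; the paper's actual scheme pipelines ResNet across 2 GPUs in PyTorch rather than buffering gradients on one device.

```python
# Minimal sketch of stale-weight updates for the early (pipelined) layers.
# Assumptions: a toy two-stage CNN, pipeline depth D = 2, random CIFAR-10-like
# batches. This is not the authors' 2-GPU implementation, only an emulation
# of the delayed gradient application their pipeline induces.
import collections
import torch
import torch.nn as nn

torch.manual_seed(0)

# Early layers (receive stale updates) and the remaining layers.
front = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                      nn.AdaptiveAvgPool2d(4), nn.Flatten())
back = nn.Sequential(nn.Linear(16 * 4 * 4, 10))
opt_front = torch.optim.SGD(front.parameters(), lr=0.01)
opt_back = torch.optim.SGD(back.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

D = 2                          # pipeline depth (staleness in optimizer steps)
pending = collections.deque()  # front-stage gradients waiting to be applied

for step in range(50):
    x = torch.randn(8, 3, 32, 32)           # stand-in mini-batch
    y = torch.randint(0, 10, (8,))

    loss = loss_fn(back(front(x)), y)
    opt_front.zero_grad()
    opt_back.zero_grad()
    loss.backward()

    # Later layers: updated immediately, as in non-pipelined training.
    opt_back.step()

    # Early layers: stash this step's gradients and apply the ones computed
    # D steps ago, i.e. gradients that are now D weight updates stale.
    pending.append([p.grad.detach().clone() for p in front.parameters()])
    if len(pending) > D:
        stale_grads = pending.popleft()
        for p, g in zip(front.parameters(), stale_grads):
            p.grad = g
        opt_front.step()
```

Limiting the staleness to the front stage mirrors the abstract's observation that pipelining only the early layers preserves convergence, whereas applying the same delay deeper in the network degrades accuracy.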
January 5, 2020 by hgpu