Early Results of Deep Learning on the Stampede2 Supercomputer
Texas Advanced Computing Center
Texas Advanced Computing Center, Technical Report, 2017
@techreport{zhang2017early,
  title       = {Early Results of Deep Learning on the Stampede2 Supercomputer},
  author      = {Zhang, Zhao and Xu, Weijia and Gaffney, Niall and Stanzione, Daniel},
  institution = {Texas Advanced Computing Center},
  year        = {2017}
}
We present early results of deep learning work on the Stampede2 supercomputer. Our goal is to enable scalable and efficient deep learning model training and serving to expedite scientific discovery. We build three popular deep learning frameworks: Intel-Caffe, MXNet, and TensorFlow. With the built-in applications of these frameworks (CaffeNet, AlexNet, GoogLeNet, and Cifar10), we measure scalability in both the strong-scaling and weak-scaling regimes. At the time of writing, we are able to build and run Intel-Caffe, MXNet, and TensorFlow on multiple KNL nodes. While MXNet and TensorFlow performance is still being tuned, we are able to scale the aforementioned applications in Intel-Caffe to 512 KNLs with ~80% efficiency relative to single-KNL performance.
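As a point of reference for the scaling claims above, the following sketch shows how strong- and weak-scaling efficiency are conventionally computed from measured run times. The function names and the sample numbers are illustrative assumptions, not values from the report; only the ~80% efficiency at 512 KNLs is stated in the abstract.

```python
def strong_scaling_efficiency(t1, tn, n):
    """Strong scaling: total problem size is fixed, so the ideal
    time on n nodes is t1 / n. Efficiency = t1 / (n * tn)."""
    return t1 / (n * tn)

def weak_scaling_efficiency(t1, tn):
    """Weak scaling: per-node problem size is fixed, so the ideal
    time on n nodes stays t1. Efficiency = t1 / tn."""
    return t1 / tn

# Hypothetical timings chosen so the strong-scaling case matches the
# ~80% figure reported for 512 KNLs: 1000 s on one node, ~2.44 s on 512.
print(strong_scaling_efficiency(1000.0, 2.44140625, 512))  # 0.8
print(weak_scaling_efficiency(10.0, 12.5))                 # 0.8
```

An efficiency of 1.0 would mean perfectly linear scaling; values below 1.0 reflect communication and synchronization overhead that grows with node count.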