CosmoFlow: Using Deep Learning to Learn the Universe at Scale
Lawrence Berkeley National Laboratory, 1 Cyclotron Road, M/S 59R4010A, Berkeley, CA 94720, USA
arXiv:1808.04728 [astro-ph.CO], (14 Aug 2018)
@article{mathuriya2018cosmoflow,
title={CosmoFlow: Using Deep Learning to Learn the Universe at Scale},
author={Mathuriya, Amrita and Bard, Deborah and Mendygral, Peter and Meadows, Lawrence and Arnemann, James and Shao, Lei and He, Siyu and Karna, Tuomas and Moise, Daina and Pennycook, Simon J. and Maschoff, Kristyn and Sewall, Jason and Kumar, Nalini and Ho, Shirley and Ringenburg, Mike and {Prabhat} and Lee, Victor},
year={2018},
month={aug},
archivePrefix={arXiv},
primaryClass={astro-ph.CO}
}
Deep learning is a promising tool to determine the physical model that describes our universe. To handle the considerable computational cost of this problem, we present CosmoFlow: a highly scalable deep learning application built on top of the TensorFlow framework. CosmoFlow uses efficient implementations of 3D convolution and pooling primitives, together with improvements in threading for many element-wise operations, to improve training performance on Intel(C) Xeon Phi(TM) processors. We also utilize the Cray PE Machine Learning Plugin for efficient scaling to multiple nodes. We demonstrate fully synchronous data-parallel training on 8192 nodes of Cori with 77% parallel efficiency, achieving 3.5 Pflop/s sustained performance. To our knowledge, this is the first large-scale science application of the TensorFlow framework at supercomputer scale with fully synchronous training. These enhancements enable us to process large 3D dark matter distributions and predict the cosmological parameters $\Omega_M$, $\sigma_8$ and $n_s$ with unprecedented accuracy.
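To put the scaling figures quoted in the abstract in perspective, the short Python sketch below works through the arithmetic. It assumes that parallel efficiency is defined as achieved speedup divided by node count, relative to an ideal linearly scaled baseline; the abstract does not state which baseline was used, so the implied speedup is an illustration rather than a reported result. The helper names are hypothetical.

```python
# Back-of-the-envelope arithmetic on the figures quoted in the abstract.
# Assumption: parallel efficiency = achieved speedup / number of nodes,
# measured against an ideal, linearly scaled baseline (not stated in the abstract).

def implied_speedup(nodes: int, efficiency: float) -> float:
    """Speedup over the baseline implied by a given parallel efficiency."""
    return nodes * efficiency


def per_node_rate(total_flops: float, nodes: int) -> float:
    """Average sustained floating-point rate per node, in flop/s."""
    return total_flops / nodes


NODES = 8192              # Cori nodes used, from the abstract
EFFICIENCY = 0.77         # parallel efficiency, from the abstract
SUSTAINED_FLOPS = 3.5e15  # 3.5 Pflop/s sustained, from the abstract

speedup = implied_speedup(NODES, EFFICIENCY)
node_rate = per_node_rate(SUSTAINED_FLOPS, NODES)

print(f"implied speedup over baseline: {speedup:.0f}x")
print(f"average sustained rate per node: {node_rate / 1e9:.0f} Gflop/s")
```

Under that assumption, 77% efficiency on 8192 nodes corresponds to roughly a 6300-fold speedup, and 3.5 Pflop/s averages to a little over 400 Gflop/s sustained per node.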
August 19, 2018 by hgpu