Demystifying the MLPerf Benchmark Suite
University of Texas at Austin
arXiv:1908.09207 [cs.LG] (24 Aug 2019)
@article{verma2019demystifying,
title={Demystifying the MLPerf Benchmark Suite},
author={Verma, Snehil and Wu, Qinzhe and Hanindhito, Bagus and Jha, Gunjan and John, Eugene B and Radhakrishnan, Ramesh and John, Lizy K},
journal={arXiv preprint arXiv:1908.09207},
year={2019}
}
MLPerf, an emerging machine learning benchmark suite, strives to cover a broad range of machine learning applications. We present a study of its characteristics and of how the MLPerf benchmarks differ from previous deep learning benchmarks such as DAWNBench and DeepBench. We find that application benchmarks such as MLPerf (although rich in kernels) exhibit different characteristics than kernel benchmarks such as DeepBench. The MLPerf benchmark suite contains a diverse set of models, which allows unveiling various bottlenecks in the system. Based on our findings, a dedicated low-latency interconnect between GPUs in multi-GPU systems is required for optimal distributed deep learning training. We also observe variation in scaling efficiency across the MLPerf models. The variation exhibited by the different models highlights the importance of smart scheduling strategies for multi-GPU training. Another observation is that CPU utilization increases with the number of GPUs used for training. Corroborating prior work, we also observe and quantify the improvements possible through compiler optimizations, mixed-precision training, and the use of Tensor Cores.
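As a rough illustration of the mixed-precision setup the abstract refers to, the sketch below shows how such training is commonly enabled in PyTorch with automatic mixed precision, which is the usual route to exercising Tensor Cores on Volta-class GPUs. The model, data, and hyperparameters are placeholders for illustration only and are not taken from the paper or the MLPerf reference implementations.

```python
# Minimal sketch (assumed, not from the paper): mixed-precision training
# with torch.cuda.amp. FP16 matmuls inside autocast regions can be mapped
# onto Tensor Cores by cuDNN/cuBLAS on supported GPUs.
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nn.Sequential(nn.Linear(1024, 1024), nn.ReLU(), nn.Linear(1024, 10)).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()
scaler = torch.cuda.amp.GradScaler(enabled=(device.type == "cuda"))

for step in range(10):
    # Dummy batch; real MLPerf workloads read from the benchmark datasets.
    inputs = torch.randn(32, 1024, device=device)
    targets = torch.randint(0, 10, (32,), device=device)

    optimizer.zero_grad()
    # Forward pass runs in float16 where numerically safe.
    with torch.cuda.amp.autocast(enabled=(device.type == "cuda")):
        loss = loss_fn(model(inputs), targets)
    # Scale the loss to avoid FP16 gradient underflow, then unscale and step.
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
```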
September 1, 2019 by hgpu