HeteroMap: A Runtime Performance Predictor for Efficient Processing of Graph Analytics on Heterogeneous Multi-Accelerators
University of Connecticut, Storrs, CT, USA
IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), 2019
@inproceedings{ahmad2019heteromap,
title={HeteroMap: A Runtime Performance Predictor for Efficient Processing of Graph Analytics on Heterogeneous Multi-Accelerators},
author={Ahmad, Masab and Dogan, Halit and Michael, Christopher J and Khan, Omer},
booktitle={IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)},
year={2019}
}
With the ever-increasing amount of data and input variations, portable performance is becoming harder to exploit on today’s architectures. Computational setups utilize single-chip processors, such as GPUs or large-scale multicores, for graph analytics. Some algorithm-input combinations perform more efficiently when utilizing a GPU’s higher concurrency and bandwidth, while others perform better with a multicore’s stronger data caching capabilities. Architectural choices also occur within selected accelerators, where variables such as threading and thread placement need to be decided for optimal performance. This paper proposes a performance predictor paradigm for a heterogeneous parallel architecture where multiple disparate accelerators are integrated in an operational high-performance computing setup. The predictor aims to improve graph processing efficiency by exploiting the underlying concurrency variations within and across the heterogeneous integrated accelerators using graph benchmark and input characteristics. The evaluation shows that intelligent and real-time selection of near-optimal concurrency choices provides performance benefits ranging from 5% to 3.8x, and an energy benefit averaging around 2.4x over the traditional single-accelerator setup.
March 31, 2019 by hgpu