Measurement and Analysis of GPU-accelerated Applications with HPCToolkit
Department of Computer Science, Rice University, Houston, TX
arXiv:2109.06931 [cs.DC] (14 Sep 2021)
@article{Zhou_2021,
title={Measurement and analysis of GPU-accelerated applications with HPCToolkit},
volume={108},
ISSN={0167-8191},
url={http://dx.doi.org/10.1016/j.parco.2021.102837},
DOI={10.1016/j.parco.2021.102837},
journal={Parallel Computing},
publisher={Elsevier BV},
author={Zhou, Keren and Adhianto, Laksono and Anderson, Jonathon and Cherian, Aaron and Grubisic, Dejan and Krentel, Mark and Liu, Yumeng and Meng, Xiaozhu and Mellor-Crummey, John},
year={2021},
month={Dec},
pages={102837}
}
To address the challenge of performance analysis on the US DOE’s forthcoming exascale supercomputers, Rice University has been extending its HPCToolkit performance tools to support measurement and analysis of GPU-accelerated applications. To help developers understand the performance of accelerated applications as a whole, HPCToolkit’s measurement and analysis tools attribute metrics to calling contexts that span both CPUs and GPUs. To measure GPU-accelerated applications efficiently, HPCToolkit employs a novel wait-free data structure to coordinate monitoring and attribution of GPU performance. To help developers understand the performance of complex GPU code generated from high-level programming models, HPCToolkit constructs sophisticated approximations of call path profiles for GPU computations. To support fine-grained analysis and tuning, HPCToolkit uses PC sampling and instrumentation to measure and attribute GPU performance metrics to source lines, loops, and inlined code. To supplement fine-grained measurements, HPCToolkit can measure GPU kernel executions using hardware performance counters. To provide a view of how an execution evolves over time, HPCToolkit can collect, analyze, and visualize call path traces within and across nodes. Finally, on NVIDIA GPUs, HPCToolkit can derive and attribute a collection of useful performance metrics based on measurements using GPU PC samples. We illustrate HPCToolkit’s new capabilities for analyzing GPU-accelerated applications with several codes developed as part of the Exascale Computing Project.
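The idea of attributing metrics to calling contexts that span both CPUs and GPUs can be illustrated with a small calling context tree. The sketch below is not HPCToolkit's implementation or API. The class `CCTNode`, the helpers `attribute` and `inclusive`, the metric names, and the frame labels are all hypothetical, chosen only to show how a GPU kernel's cost might be recorded beneath the CPU call path that launched it.

```python
from dataclasses import dataclass, field


@dataclass
class CCTNode:
    """One frame in a calling context tree (hypothetical structure)."""
    frame: str
    metrics: dict = field(default_factory=dict)
    children: dict = field(default_factory=dict)

    def child(self, frame):
        # Reuse the existing child for this frame, or create it on first use.
        if frame not in self.children:
            self.children[frame] = CCTNode(frame)
        return self.children[frame]


def attribute(root, call_path, metric, value):
    """Add `value` to `metric` at the node reached by following `call_path`."""
    node = root
    for frame in call_path:
        node = node.child(frame)
    node.metrics[metric] = node.metrics.get(metric, 0) + value
    return node


def inclusive(node, metric):
    """Sum `metric` over a node and all of its descendants."""
    return node.metrics.get(metric, 0) + sum(
        inclusive(c, metric) for c in node.children.values()
    )


# Illustrative data only: times are made-up microsecond counts.
root = CCTNode("<program root>")
cpu_path = ["main", "solve", "cudaLaunchKernel"]
gpu_path = cpu_path + ["gpu:stencil_kernel"]  # GPU frame hangs under the CPU launch site

attribute(root, cpu_path, "cpu_time_us", 2000)
attribute(root, gpu_path, "gpu_time_us", 40000)
attribute(root, gpu_path, "gpu_time_us", 10000)  # a second launch from the same context
attribute(root, ["main", "io"], "cpu_time_us", 500000)

solve = root.child("main").child("solve")
print(inclusive(solve, "gpu_time_us"))
```

Because the GPU frame is a child of the CPU launch site, an inclusive sum at any CPU frame also covers the GPU work launched beneath it, which is the whole-application view the abstract describes.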
September 19, 2021 by hgpu