Distributed, combined CPU and GPU profiling within HPX using APEX

Patrick Diehl, Gregor Daiss, Kevin Huck, Dominic Marcello, Sagiv Shiber, Hartmut Kaiser, Juhan Frank, Geoffrey C. Clayton, Dirk Pflueger
LSU Center for Computation & Technology, Louisiana State University, Baton Rouge, LA 70803, U.S.A.
arXiv:2210.06437 [cs.DC], (21 Sep 2022)




Keywords: Distributed, Parallel, and Cluster Computing (cs.DC); FOS: Computer and information sciences
License: Creative Commons Attribution Non Commercial Share Alike 4.0 International


Benchmarking and comparing the performance of a scientific simulation across hardware platforms is a complex task. When the simulation in question is built on an asynchronous many-task (AMT) runtime that offloads work to GPUs, the task becomes even more complex. In this paper, we discuss the use of a uniquely suited performance measurement library, APEX, to capture the performance behavior of a simulation built on HPX, a highly scalable, distributed AMT runtime. We examine the performance of the astrophysics simulation carried out by Octo-Tiger on two different supercomputing architectures. We analyze the scaling results and the measurement overheads. In addition, we look in depth at two similarly configured executions on the two systems to study how architectural differences affect performance and to identify opportunities for optimization. As one such opportunity, we optimize the communication for the hydro solver and investigate its performance impact.