In-Situ Techniques on GPU-Accelerated Data-Intensive Applications
Max Planck Computing and Data Facility
arXiv:2407.20731 [cs.PF] (30 Jul 2024)
@inproceedings{Ju_2023,
title={In-Situ Techniques on GPU-Accelerated Data-Intensive Applications},
url={http://dx.doi.org/10.1109/e-science58273.2023.10254865},
DOI={10.1109/e-science58273.2023.10254865},
booktitle={2023 IEEE 19th International Conference on e-Science (e-Science)},
publisher={IEEE},
author={Ju, Yi and Li, Mingshuai and Perez, Adalberto and Bellentani, Laura and Jansson, Niclas and Markidis, Stefano and Schlatter, Philipp and Laure, Erwin},
year={2023},
month={oct}
}
The computational power of High-Performance Computing (HPC) systems is constantly increasing; however, their input/output (I/O) performance grows relatively slowly, and their storage capacity is also limited. This imbalance presents significant challenges for applications such as Molecular Dynamics (MD) and Computational Fluid Dynamics (CFD), which generate massive amounts of data for further visualization or analysis. At the same time, checkpointing is crucial for long runs on HPC clusters, due to limited walltimes and/or failures of system components, and typically requires the storage of large amounts of data. Thus, restricted I/O performance and storage capacity can lead to bottlenecks in the performance of full application workflows (as compared to computational kernels without I/O). In-situ techniques, where data is further processed while still in memory rather than written out over the I/O subsystem, can help to tackle these problems. In contrast to traditional post-processing methods, in-situ techniques can reduce or avoid the need to write or read data via the I/O subsystem. They offer a promising approach for applications aiming to leverage the full power of large-scale HPC systems. In-situ techniques can also be applied to hybrid computational nodes on HPC systems consisting of graphics processing units (GPUs) and central processing units (CPUs). On a single node, the GPUs typically have significant performance advantages over the CPUs. Therefore, current approaches for GPU-accelerated applications often focus on maximizing GPU usage, leaving CPUs underutilized. In-situ tasks that use CPUs to perform data analysis or preprocess data concurrently with the running simulation offer a possibility to reduce this underutilization.
August 14, 2024 by hgpu