Exploring Scalability in C++ Parallel STL Implementations
Faculty of Informatics, TU Wien, Vienna, Austria
Proceedings of the 53rd International Conference on Parallel Processing (ICPP’24), 2024
@inproceedings{laso2024exploring,
title={Exploring Scalability in {C++} Parallel {STL} Implementations},
author={Laso, Ruben and Krupitza, Diego and Hunold, Sascha},
booktitle={Proceedings of the 53rd International Conference on Parallel Processing},
pages={284--293},
year={2024}
}
Since the advent of parallel algorithms in the C++17 Standard Template Library (STL), the STL has become a viable framework for creating performance-portable applications. Given multiple existing implementations of the parallel algorithms, a systematic, quantitative performance comparison is essential for choosing the appropriate implementation for a particular hardware configuration. In this work, we introduce a specialized set of micro-benchmarks to assess the scalability of the parallel algorithms in the STL. By selecting different backends, our micro-benchmarks can be used on multi-core systems and GPUs. Using the suite in a case study on AMD and Intel CPUs and NVIDIA GPUs, we identified substantial performance disparities among different implementations, including GCC+TBB, GCC+HPX, Intel’s compiler with TBB, and NVIDIA’s compiler with OpenMP and CUDA.
September 1, 2024 by hgpu