An Evaluative Comparison of Performance Portability across GPU Programming Models
Department of Computer Science, University of Maryland
arXiv:2402.08950 [cs.DC], (14 Feb 2024)
@misc{davis2024evaluative,
  title={An Evaluative Comparison of Performance Portability across GPU Programming Models},
  author={Joshua H. Davis and Pranav Sivaraman and Isaac Minn and Konstantinos Parasyris and Harshitha Menon and Giorgis Georgakoudis and Abhinav Bhatele},
  year={2024},
  eprint={2402.08950},
  archivePrefix={arXiv},
  primaryClass={cs.DC}
}
Ensuring high productivity in scientific software development necessitates developing and maintaining a single codebase that can run efficiently on a range of accelerator-based supercomputing platforms. While prior work has investigated the performance portability of a few selected proxy applications or programming models, this paper provides a comprehensive study of a range of proxy applications implemented in the major programming models suitable for GPU-based platforms. We present and analyze performance results across NVIDIA and AMD GPU hardware currently deployed in leadership-class computing facilities using a representative range of scientific codes and several programming models: CUDA, HIP, Kokkos, RAJA, OpenMP, OpenACC, and SYCL. Based on the specific characteristics of applications tested, we include recommendations to developers on how to choose the right programming model for their code. We find that Kokkos and RAJA in particular offer the most promise empirically as performance-portable programming models. These results provide a comprehensive evaluation of the extent to which each programming model for heterogeneous systems provides true performance portability in real-world usage.
February 18, 2024 by hgpu