
Performance portability evaluation of blocked stencil computations on GPUs

Oscar Antepara, Hans Johansen, Samuel Williams, Tuowen Zhao, Samantha Hirsch, Priya Goyal, Mary Hall
Lawrence Berkeley National Lab, Berkeley, California, USA
International Workshop on Performance, Portability & Productivity in HPC (P3HPC), 2023

@article{antepara2023performance,
   title={Performance portability evaluation of blocked stencil computations on GPUs},
   author={Antepara, Oscar and Johansen, Hans and Williams, Samuel and Zhao, Tuowen and Hirsch, Samantha and Goyal, Priya and Hall, Mary},
   year={2023}
}

In this new era where multiple GPU vendors are leading the supercomputing landscape, and multiple programming models are available to users, the drive to achieve performance portability across platforms faces new challenges. Consider stencil algorithms, where architecture-specific solutions are required to optimize for the parallelism hierarchy and memory hierarchy of emerging systems. In this work, we analyze performance portability of the BrickLib domain-specific library and vector code generator for stencils. BrickLib employs fine-grain data blocking to reduce the large amount of data movement associated with stencils. We compare different GPUs (NVIDIA, AMD and Intel) and their associated programming models (CUDA, HIP and SYCL). By testing a wide range of stencil configurations, we show that overall, BrickLib achieves good performance independent of machine or programming model. Moreover, we introduce correlation models as a new tool for comparing architectures and programming models from Roofline model data.
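
For readers unfamiliar with the Roofline model mentioned in the abstract, the following is a minimal sketch of the classic Roofline bound applied to a stencil. It is not taken from the paper or from BrickLib: the function names, the 7-point stencil flop and byte counts, and the machine numbers are all hypothetical assumptions chosen only for illustration.

```python
def stencil_intensity(flops_per_point, bytes_per_point):
    """Arithmetic intensity (flops per byte of memory traffic) for one grid point."""
    return flops_per_point / bytes_per_point


def roofline_bound(arithmetic_intensity, peak_gflops, bandwidth_gbs):
    """Attainable GFLOP/s under the classic Roofline model:
    the lesser of peak compute and (intensity x memory bandwidth)."""
    return min(peak_gflops, arithmetic_intensity * bandwidth_gbs)


# Assumption: a 7-point stencil in double precision with ideal cache reuse,
# i.e. one 8-byte read and one 8-byte write per point, and 13 flops
# (7 multiplies + 6 adds). Real kernels may differ.
ai = stencil_intensity(flops_per_point=13, bytes_per_point=16)

# Assumption: made-up machine parameters, not measurements from the paper.
bound = roofline_bound(ai, peak_gflops=10000.0, bandwidth_gbs=1500.0)

print(f"arithmetic intensity: {ai} flop/byte")
print(f"roofline bound: {bound} GFLOP/s")
```

Under these assumed numbers the stencil is bandwidth-bound, which is why reducing data movement (the goal of data blocking) is the relevant lever for such kernels.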
