Reproducible Study and Performance Analysis of GPU Programming Paradigms: OpenACC vs. CUDA in Key Linear Algebra Computations

Ezhilmathi Krishnasamy, Pascal Bouvry
Faculty of Science and Technology FSTM, University of Luxembourg, 2 Av. de l’Universite, Esch-Belval, L-4365, Esch-sur-Alzette, Luxembourg
Research Square, preprint rs.3.rs-5657196/v1, 2024

@article{krishnasamy2024reproducible,
   title={A Reproducible Study and Performance Analysis of GPU Programming Paradigms: OpenACC vs. CUDA in Key Linear Algebra Computations},
   author={Krishnasamy, Ezhilmathi and Bouvry, Pascal},
   year={2024}
}

Scientific and engineering problems are frequently governed by partial differential equations whose analytical solutions are often impractical to obtain, which makes numerical methods necessary. Basic Linear Algebra Subprograms (BLAS) operations are a fundamental component of these numerical methods, covering Level 1 operations (dot products and vector addition), Level 2 operations (matrix-vector multiplication), and Level 3 operations (matrix-matrix multiplication). Graphics Processing Units (GPUs), particularly those produced by NVIDIA, have gained significant computational power and are widely used to solve a variety of numerical problems. Nevertheless, substantial obstacles remain in targeting diverse GPU architectures, particularly with respect to portability, reducing workarounds, and improving performance. This study uses directive-based programming models, such as OpenACC, to exploit GPU capabilities effectively. We present a comprehensive comparison and performance evaluation of the OpenACC programming model against CUDA in executing essential BLAS routines.
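To make the three BLAS levels named in the abstract concrete, here is a minimal sketch of each in C with OpenACC directives. It is an illustration only, not the authors' code. The function names (`vec_add`, `dot`, `mat_vec`, `mat_mul`), the row-major storage, and the choice of data clauses are assumptions made for this example. Built without OpenACC support, the directives are ignored and the loops run serially on the host.

```c
#include <stddef.h>

/* Level 1 (vector addition): z = x + y. Hypothetical helper name. */
void vec_add(size_t n, const double *x, const double *y, double *z)
{
    #pragma acc parallel loop copyin(x[0:n], y[0:n]) copyout(z[0:n])
    for (size_t i = 0; i < n; ++i)
        z[i] = x[i] + y[i];
}

/* Level 1 (dot product): returns the sum over i of x[i] * y[i]. */
double dot(size_t n, const double *x, const double *y)
{
    double sum = 0.0;
    #pragma acc parallel loop reduction(+:sum) copyin(x[0:n], y[0:n])
    for (size_t i = 0; i < n; ++i)
        sum += x[i] * y[i];
    return sum;
}

/* Level 2 (matrix-vector): y = A x, with A stored row-major as m x n
   (storage order is an assumption of this sketch). */
void mat_vec(size_t m, size_t n, const double *A, const double *x, double *y)
{
    #pragma acc parallel loop copyin(A[0:m*n], x[0:n]) copyout(y[0:m])
    for (size_t i = 0; i < m; ++i) {
        double sum = 0.0;
        #pragma acc loop reduction(+:sum)
        for (size_t j = 0; j < n; ++j)
            sum += A[i * n + j] * x[j];
        y[i] = sum;
    }
}

/* Level 3 (matrix-matrix): C = A B, with A (m x k), B (k x n), and
   C (m x n), all row-major. The two outer loops are collapsed into one
   parallel iteration space; the inner product runs sequentially. */
void mat_mul(size_t m, size_t k, size_t n,
             const double *A, const double *B, double *C)
{
    #pragma acc parallel loop collapse(2) \
        copyin(A[0:m*k], B[0:k*n]) copyout(C[0:m*n])
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j) {
            double sum = 0.0;
            #pragma acc loop seq
            for (size_t p = 0; p < k; ++p)
                sum += A[i * k + p] * B[p * n + j];
            C[i * n + j] = sum;
        }
    }
}
```

With an OpenACC-capable compiler (for example `nvc -acc` from the NVIDIA HPC SDK, or GCC with `-fopenacc`), the annotated loops can be offloaded to a GPU. An equivalent CUDA version would instead require explicit kernels, launch configurations, and memory transfers, which is the trade-off the study examines.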