
GPU Parallelization for Unstructured Sparse Matrix Problems with OpenMP 4.5 and OpenACC

Stefan Rosenberger, Gundolf Haase
University of Graz, 8010 Graz, Austria
University of Graz, SFB-Report No. 2017-010, 2017

@article{rosenberger2017gpu,
   title={GPU Parallelization for Unstructured Sparse Matrix Problems with OpenMP 4.5 and OpenACC},
   author={Rosenberger, S and Haase, G},
   year={2017}
}



The effective use of parallel hardware is an important goal of today’s computer development, and Nvidia GPUs play an important role in this context. While algorithms implemented in CUDA focus on detailed, optimized use of GPU components, parallelization with pragma directives makes GPU computation accessible to a broader community. In this paper we focus on implementing OpenACC and OpenMP 4.5 parallelization on Nvidia GPUs for a sparse matrix solver on unstructured discretizations. We show the similarities between the two approaches and their current performance differences. We also examine the options for forcing GPU code parallelized with pragma directives to a specific vectorization. Finally, we demonstrate the performance of these methods in a complex structured C++ implementation of the CG and GMRES methods with an algebraic multigrid preconditioner.
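To illustrate how similar the two directive styles can look, here is a minimal sketch of a sparse matrix-vector product in CSR format, annotated for either OpenACC or OpenMP 4.5 offloading. It is not taken from the paper: the `CsrMatrix` struct, the `spmv` function, and the choice of clauses are assumptions made for illustration only. The inner-loop directives hint at how a specific mapping of row-local work onto vector lanes or threads might be requested. Whether this performs well depends on the compiler and device.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CSR (compressed sparse row) container; not the paper's data structure.
struct CsrMatrix {
    int nrows;
    std::vector<int> row_ptr;    // size nrows + 1, row i spans [row_ptr[i], row_ptr[i+1])
    std::vector<int> col_idx;    // column index of each stored entry
    std::vector<double> values;  // value of each stored entry
};

// Computes y = A * x. Without an offloading compiler the pragmas are ignored
// and the loops simply run sequentially on the host.
std::vector<double> spmv(const CsrMatrix& A, const std::vector<double>& x) {
    assert(A.row_ptr.size() == static_cast<std::size_t>(A.nrows) + 1);
    const int n = A.nrows;
    const int nnz = A.row_ptr[n];
    const int nx = static_cast<int>(x.size());

    // Raw pointers are used so that array sections can be named in data clauses.
    const int* rp = A.row_ptr.data();
    const int* ci = A.col_idx.data();
    const double* va = A.values.data();
    const double* xp = x.data();
    std::vector<double> y(n, 0.0);
    double* yp = y.data();

#if defined(_OPENACC)
    // OpenACC variant: rows are distributed over gangs.
    #pragma acc parallel loop gang copyin(rp[0:n+1], ci[0:nnz], va[0:nnz], xp[0:nx]) copyout(yp[0:n])
#else
    // OpenMP 4.5 variant: rows are distributed over teams.
    #pragma omp target teams distribute map(to: rp[0:n+1], ci[0:nnz], va[0:nnz], xp[0:nx]) map(from: yp[0:n])
#endif
    for (int i = 0; i < n; ++i) {
        double sum = 0.0;
#if defined(_OPENACC)
        // Entries of one row are reduced across the vector lanes of a gang.
        #pragma acc loop vector reduction(+:sum)
#else
        // Entries of one row are reduced across the threads of a team.
        #pragma omp parallel for reduction(+:sum)
#endif
        for (int k = rp[i]; k < rp[i + 1]; ++k) {
            sum += va[k] * xp[ci[k]];
        }
        yp[i] = sum;
    }
    return y;
}
```

The same loop nest serves both programming models; only the directive lines differ, which is the kind of similarity the abstract refers to. Built without OpenACC or OpenMP support, the function still compiles and runs as plain sequential C++.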
