Performance of PETSc GPU Implementation with Sparse Matrix Storage Schemes
The University of Edinburgh
The University of Edinburgh, 2011
@article{kumbhar2011performance,
title={Performance of PETSc GPU Implementation with Sparse Matrix Storage Schemes},
author={Kumbhar, P.},
year={2011}
}
PETSc is a scalable solver library developed at Argonne National Laboratory (ANL). It is widely used for solving systems of equations arising from the discretisation of partial differential equations (PDEs). GPU support has recently been added to PETSc to exploit the performance of GPUs. This support is quite new and is currently available only in the PETSc development release. The goal of this MSc project is to evaluate the performance of the current GPU implementation, especially that of the iterative solvers, on the HECToR GPU cluster. The current implementation adds a new matrix subclass that stores the matrix in Compressed Sparse Row (CSR) format. We have extended the current PETSc GPU implementation to improve performance using different sparse matrix storage schemes such as ELL, Diagonal and Hybrid. For structured matrices, the current GPU implementation shows a 4x speedup compared to an Intel Xeon quad-core CPU. For multi-GPU applications, the speedup starts decreasing due to high communication costs on the HECToR GPU cluster. Our implementation with the new storage schemes shows a 50% performance improvement on sparse matrix-vector operations. For structured matrices, the new implementation shows a 7x speedup and significantly improves the performance of vector operations on the GPU.
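To make the difference between two of the storage schemes named in the abstract concrete, the following is a minimal CPU-side sketch in plain Python of a sparse matrix-vector product in CSR and in ELL format. It is an illustration only, not the thesis code and not the PETSc API: the function names (`csr_spmv`, `csr_to_ell`, `ell_spmv`) and the small tridiagonal test matrix are assumptions introduced here. ELL pads every row to the length of the longest row, which gives a regular layout that tends to suit structured matrices on GPUs, at the cost of wasted storage when row lengths vary widely.

```python
# Illustrative sketch only: hypothetical helper names, not PETSc functions.

def csr_spmv(row_ptr, col_idx, values, x):
    """y = A @ x with A stored in Compressed Sparse Row (CSR) format."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        # Nonzeros of row i occupy positions row_ptr[i] .. row_ptr[i+1]-1.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y


def csr_to_ell(row_ptr, col_idx, values):
    """Convert CSR to ELL: every row is padded to the longest row's length."""
    n_rows = len(row_ptr) - 1
    width = max((row_ptr[i + 1] - row_ptr[i] for i in range(n_rows)), default=0)
    # Padding entries use column 0 and value 0.0, so they add nothing to y.
    ell_cols = [[0] * width for _ in range(n_rows)]
    ell_vals = [[0.0] * width for _ in range(n_rows)]
    for i in range(n_rows):
        for j, k in enumerate(range(row_ptr[i], row_ptr[i + 1])):
            ell_cols[i][j] = col_idx[k]
            ell_vals[i][j] = values[k]
    return ell_cols, ell_vals


def ell_spmv(ell_cols, ell_vals, x):
    """y = A @ x with A stored in ELL format (fixed number of slots per row)."""
    return [
        sum(v * x[c] for c, v in zip(cols, vals))
        for cols, vals in zip(ell_cols, ell_vals)
    ]


# Assumed example: 4x4 tridiagonal matrix with stencil (-1, 2, -1).
row_ptr = [0, 2, 5, 8, 10]
col_idx = [0, 1, 0, 1, 2, 1, 2, 3, 2, 3]
values = [2.0, -1.0, -1.0, 2.0, -1.0, -1.0, 2.0, -1.0, -1.0, 2.0]
x = [1.0, 2.0, 3.0, 4.0]

y_csr = csr_spmv(row_ptr, col_idx, values, x)
ell_cols, ell_vals = csr_to_ell(row_ptr, col_idx, values)
y_ell = ell_spmv(ell_cols, ell_vals, x)
```

Both routines compute the same product; only the memory layout differs. In this example each ELL row has three slots, so the first and last rows carry one padded entry each.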
December 8, 2011 by hgpu