From Sparse Matrix to Optimal GPU CUDA Sparse Matrix Vector Product Implementation
ANU Supercomput. Facility, Australian Nat. Univ., Canberra, ACT, Australia
10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing (CCGrid), 2010
@inproceedings{el2010sparse,
title={From sparse matrix to optimal GPU CUDA sparse matrix vector product implementation},
author={El Zein, A.H. and Rendell, A.P.},
booktitle={2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing},
pages={808--813},
year={2010},
organization={IEEE}
}
The CUDA model for GPUs presents the programmer with a plethora of different programming options. These include different memory types, different memory access methods, and different data types. Identifying which options to use and when is a non-trivial exercise. This paper explores the effect of these different options on the performance of a routine that evaluates sparse matrix-vector products. A process for analysing performance and selecting the subset of implementations that perform best is proposed. The potential for mapping sparse matrix attributes to an optimal CUDA sparse matrix-vector product implementation is discussed.
May 20, 2011 by hgpu