Implementing Sparse Matrix-Vector Multiplication with QCSR on GPU
Department of Computer and Technology, Hangzhou Dianzi University, 310018, Hangzhou, Zhejiang, China
Applied Mathematics & Information Sciences, Volume 7, pp. 473-482, 2013
@article{zhang2013implementing,
title={Implementing Sparse Matrix-Vector Multiplication with QCSR on GPU},
author={Zhang, J. and Liu, E. and Wan, J. and Ren, Y. and Yue, M. and Wang, J.},
journal={Applied Mathematics \& Information Sciences},
volume={7},
number={2},
pages={473--482},
year={2013}
}
Parallel programming is shifting computation from single-core to multicore architectures. Graphics Processing Units (GPUs) have recently emerged as outstanding platforms for data-parallel applications with regular data access patterns. However, it remains challenging to optimize computations with irregular data access patterns, such as sparse matrix-vector multiplication (SPMV). SPMV is one of the most important computational kernels in engineering practice and scientific computation. Various sparse matrix storage formats have been implemented on GPUs to maximize performance. In this paper, we propose and evaluate a new implementation of SPMV on GPU based on the QCSR storage format, which combines the quadtree storage format with the CSR format. We also outline some optimization strategies to improve performance. Compared with previously published implementations, it achieves higher overall performance than the BCSR format. The results show an average speedup of 1.15x over the BCSR format.
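For readers unfamiliar with the CSR format that QCSR builds on, the following is a minimal sequential reference sketch of SPMV over a CSR matrix. It is not the paper's QCSR kernel or its GPU implementation; the function name `csr_spmv` and the array names `row_ptr`, `col_idx`, and `values` are illustrative assumptions following the common CSR convention.

```python
def csr_spmv(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a sparse matrix A stored in CSR form.

    row_ptr[i]..row_ptr[i+1] delimits the nonzeros of row i;
    col_idx[k] and values[k] give the column and value of nonzero k.
    (Names are hypothetical; this is plain CSR, not QCSR.)
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # The indirect read x[col_idx[k]] is the irregular access
        # pattern that makes SPMV hard to optimize on GPUs.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


# Example matrix:
# [[4, 0, 1],
#  [0, 2, 0],
#  [3, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [4.0, 1.0, 2.0, 3.0, 5.0]
x = [1.0, 2.0, 3.0]

y = csr_spmv(row_ptr, col_idx, values, x)
print(y)  # [7.0, 4.0, 18.0]
```

Each row's dot product is independent of the others, which is why GPU implementations typically assign rows (or blocks of rows) to threads or warps.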
January 12, 2013 by hgpu