CSR5: An Efficient Storage Format for Cross-Platform Sparse Matrix-Vector Multiplication
Niels Bohr Institute, University of Copenhagen, Copenhagen, Denmark
arXiv:1503.05032 [cs.MS], (17 Mar 2015)
@article{liu2015efficient,
title={CSR5: An Efficient Storage Format for Cross-Platform Sparse Matrix-Vector Multiplication},
author={Liu, Weifeng and Vinter, Brian},
year={2015},
month={mar},
archivePrefix={arXiv},
eprint={1503.05032},
primaryClass={cs.MS}
}
Sparse matrix-vector multiplication (SpMV) is a fundamental building block for numerous applications. In this paper, we propose CSR5 (Compressed Sparse Row 5), a new storage format that offers high-throughput SpMV on various platforms including CPUs, GPUs and Xeon Phi. First, the CSR5 format is insensitive to the sparsity structure of the input matrix. Thus a single format can support an SpMV algorithm that is efficient for both regular and irregular matrices. Furthermore, we show that the overhead of the format conversion from CSR to CSR5 can be as low as the cost of a few SpMV operations. We compare the CSR5-based SpMV algorithm with 11 state-of-the-art formats/algorithms on four mainstream processors using 14 regular and 10 irregular matrices as a benchmark suite. For the 14 regular matrices in the suite, we achieve comparable or better performance than the previous work. For the 10 irregular matrices, CSR5 obtains an average performance improvement of 17.6%, 28.5%, 173.0% and 293.3% (up to 213.3%, 153.6%, 405.1% and 943.3%) over the best existing work on dual-socket Intel CPUs, an nVidia GPU, an AMD GPU and an Intel Xeon Phi, respectively. For real-world applications with only tens of iterations, the CSR5 format can be more practical because of its low overhead for format conversion. The source code of this work is downloadable at this https URL
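For readers unfamiliar with the baseline, the following is a minimal sketch of SpMV on the plain CSR format that CSR5 is converted from. It is not the CSR5 algorithm and is not taken from the paper's source code. The names csr_spmv, row_lengths, row_ptr, col_idx and values are illustrative assumptions. The row-by-row loop suggests why irregular matrices are hard for CSR: the work per row equals the number of nonzeros in that row, so rows of very different lengths lead to unbalanced work when rows are distributed across threads.

```python
def csr_spmv(row_ptr, col_idx, values, x):
    """Compute y = A * x for a matrix A stored in plain CSR form.

    row_ptr[i]..row_ptr[i + 1] delimits the nonzeros of row i,
    col_idx[k] is the column of the k-th nonzero, values[k] its value.
    (Baseline CSR only; this is NOT the CSR5 scheme from the paper.)
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # Work in this inner loop is proportional to the row's nonzero count.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


def row_lengths(row_ptr):
    """Number of nonzeros per row; a wide spread indicates an irregular matrix."""
    return [row_ptr[i + 1] - row_ptr[i] for i in range(len(row_ptr) - 1)]


# Hypothetical 3x4 example matrix:
#   [[1, 0, 2, 0],
#    [0, 0, 0, 0],
#    [3, 4, 5, 6]]
row_ptr = [0, 2, 2, 6]
col_idx = [0, 2, 0, 1, 2, 3]
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x = [1.0, 1.0, 1.0, 1.0]

y = csr_spmv(row_ptr, col_idx, values, x)
print(y)                     # [3.0, 0.0, 18.0]
print(row_lengths(row_ptr))  # [2, 0, 4]
```

In this toy matrix the three rows hold 2, 0 and 4 nonzeros, so a one-row-per-thread scheme would leave one thread idle while another does twice the work of the first.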
March 20, 2015 by hgpu