clSpMV: A Cross-Platform OpenCL SpMV Framework on GPUs
University of California, Berkeley, EECS Department
International Conference on Supercomputing (ICS 2012), 2012
@inproceedings{su2012clspmv,
title={clSpMV: A Cross-Platform OpenCL SpMV Framework on GPUs},
author={Su, Bor-Yiing and Keutzer, Kurt},
booktitle={Proceedings of the International Conference on Supercomputing},
series={ICS '12},
year={2012}
}
The sparse matrix-vector multiplication (SpMV) kernel is a key computation in linear algebra. Most iterative methods are composed of SpMV operations with BLAS1 updates, so researchers have made extensive efforts to optimize the SpMV kernel in sparse linear algebra. With the appearance of OpenCL, a programming language that standardizes parallel programming across a wide variety of heterogeneous platforms, we are able to optimize the SpMV kernel on many different platforms. In this paper, we propose a new sparse matrix format, the Cocktail Format, to take advantage of the strengths of many different sparse matrix formats. Based on the Cocktail Format, we develop the clSpMV framework, which analyzes sparse matrices at runtime and recommends the best representation of a given sparse matrix on each platform. Although solutions that are portable across diverse platforms generally provide lower performance than solutions specialized to particular platforms, our experimental results show that clSpMV can find the best representations of the input sparse matrices on both Nvidia and AMD platforms. It delivers 83% higher performance than the vendor-optimized CUDA implementation of the hybrid sparse format proposed in [3], and 63.6% higher performance than the CUDA implementations of all sparse formats in [3].
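For readers unfamiliar with the kernel the abstract refers to, the following is a minimal sequential sketch of SpMV (y = Ax) over the widely used compressed sparse row (CSR) layout. It is not the Cocktail Format and not clSpMV code. The function name `spmv_csr`, its parameter names, and the small example matrix are assumptions chosen for illustration only.

```python
def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    row_ptr[i]..row_ptr[i + 1] delimits the nonzeros of row i;
    col_idx and values hold their column indices and values.
    (Illustrative helper, not part of clSpMV.)
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # Accumulate only over the stored nonzeros of row i.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


# Hypothetical 3x3 example:
# A = [[4, 0, 1],
#      [0, 3, 0],
#      [2, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [4.0, 1.0, 3.0, 2.0, 5.0]
x = [1.0, 2.0, 3.0]

y = spmv_csr(row_ptr, col_idx, values, x)
print(y)
```

Each output element depends on only one row of nonzeros, which is why the kernel parallelizes naturally on GPUs. How well a given storage layout performs depends on the sparsity pattern, which is the motivation the abstract gives for combining several formats.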
May 30, 2012 by hgpu