Generating optimal CUDA sparse matrix-vector product implementations for evolving GPU hardware

Ahmed H. El Zein, Alistair P. Rendell
ANU Supercomputing Facility, The Australian National University, Canberra, ACT 0200, Australia
Concurrency and Computation: Practice and Experience, Special Issue: Special Section on Challenges and Solutions in Multicore and Many-Core Computing, Volume 24, Issue 1, pages 3-13, 2012


@article{ElZein2012,
   title={Generating optimal CUDA sparse matrix–vector product implementations for evolving GPU hardware},
   author={El Zein, A.H. and Rendell, A.P.},
   journal={Concurrency and Computation: Practice and Experience},
   volume={24},
   number={1},
   pages={3--13},
   year={2012},
   publisher={Wiley Online Library}
}




The CUDA model for graphics processing units (GPUs) presents the programmer with a plethora of programming options. These include different memory types, different memory access methods and different data types. Identifying which options to use, and when, is a non-trivial exercise. This paper explores the effect of these options on the performance of a routine that evaluates sparse matrix-vector products (SpMV) across three generations of NVIDIA GPU hardware. A process for analysing performance and selecting the subset of implementations that perform best is proposed. The potential for mapping sparse matrix attributes to optimal CUDA SpMV implementations is discussed.
