Accurate CUDA Performance Modeling for Sparse Matrix-Vector Multiplication

Ping Guo, Liqiang Wang
Department of Computer Science, University of Wyoming, USA
2012 International Conference on High Performance Computing & Simulation (HPCS 2012), 2012








This paper presents an integrated analytical and profile-based CUDA performance modeling approach that accurately predicts the kernel execution times of sparse matrix-vector multiplication (SpMV) for the CSR, ELL, COO, and HYB CUDA kernels. In experiments on a collection of 8 widely used test matrices on an NVIDIA Tesla C2050, the execution times predicted by our model closely match the measured execution times of NVIDIA’s SpMV implementations. Specifically, for 29 out of 32 test cases, the performance differences are under or around 7%. For the remaining 3 test cases, the differences are between 8% and 10%. For the CSR, ELL, COO, and HYB SpMV kernels, the average differences are 4.2%, 5.2%, 1.0%, and 5.7%, respectively.
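
For readers unfamiliar with the storage formats named above, the following is a minimal sequential sketch of SpMV over the CSR (compressed sparse row) layout, written in Python for readability. It is an illustration only, not the authors' performance model and not NVIDIA's CUDA kernel; the function name `csr_spmv` and the example matrix are assumptions introduced here.

```python
def csr_spmv(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    row_ptr[i]..row_ptr[i+1] delimits the nonzeros of row i;
    col_idx[k] and values[k] give the column and value of nonzero k.
    (Hypothetical helper for illustration, not code from the paper.)
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        acc = 0.0
        # Walk only the stored nonzeros of this row.
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


# Example 3x3 matrix (an assumption for this sketch):
# [[1, 0, 2],
#  [0, 3, 0],
#  [4, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [1.0, 2.0, 3.0, 4.0, 5.0]
x = [1.0, 1.0, 1.0]
y = csr_spmv(row_ptr, col_idx, values, x)
```

On a GPU, the outer loop over rows is typically what gets distributed across threads, so the number of nonzeros per row is one plausible driver of the kernel execution time that a model of this kind would have to predict.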
