Optimizing Symmetric Dense Matrix-Vector Multiplication on GPUs

Rajib Nath, Stanimire Tomov, Tingxing "Tim" Dong, Jack Dongarra
Computer Science and Engineering, University of California, San Diego
ACM/IEEE Conference on Supercomputing (SC’11), 2011









GPUs are excellent accelerators for data-parallel applications with regular data access patterns. It is challenging, however, to optimize computations with irregular data access patterns on GPUs. One such computation is the symmetric matrix-vector product (SYMV) for dense linear algebra. Optimizing the SYMV kernel is important because it forms the basis of fundamental algorithms such as linear solvers and eigenvalue solvers on symmetric matrices. In this work, we present a new algorithm for optimizing the SYMV kernel on GPUs. In single precision, our optimized SYMV achieves up to a 7x speedup over the (latest) NVIDIA CUBLAS 4.0 library on the GTX 280 GPU. Our SYMV kernel tuned for the Fermi C2050 is 4.5x faster than CUBLAS 4.0 in single precision and 2x faster in double precision. Moreover, the techniques described in the paper are general enough to be of interest for developing high-performance GPU kernels beyond the particular case of SYMV.
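To make the operation concrete, here is a minimal CPU reference sketch of SYMV in C, computing y = alpha*A*x + beta*y while reading only the lower triangle of A. It is not the GPU algorithm from the paper; it only illustrates why the access pattern is irregular: each stored element A(i,j) contributes to two different entries of y. The function name `symv_lower_ref` and the column-major, lower-triangle layout are assumptions chosen for illustration, following the usual BLAS convention.

```c
#include <stddef.h>

/* Hypothetical reference routine (not from the paper):
 * y = alpha * A * x + beta * y for a symmetric n-by-n matrix A.
 * A is column-major with leading dimension lda; only the lower
 * triangle (i >= j) is read, the upper triangle is never touched. */
void symv_lower_ref(size_t n, float alpha, const float *A, size_t lda,
                    const float *x, float beta, float *y)
{
    /* Scale y first; treat beta == 0 as "overwrite", as BLAS does. */
    for (size_t i = 0; i < n; ++i)
        y[i] = (beta == 0.0f) ? 0.0f : beta * y[i];

    for (size_t j = 0; j < n; ++j) {
        const float *col = A + j * lda;   /* column j of A */
        float xj = alpha * x[j];
        float dot = 0.0f;

        y[j] += col[j] * xj;              /* diagonal element A(j,j) */

        for (size_t i = j + 1; i < n; ++i) {
            y[i] += col[i] * xj;          /* A(i,j) used as stored */
            dot  += col[i] * x[i];        /* A(j,i) = A(i,j), mirrored use */
        }
        y[j] += alpha * dot;              /* contribution of row j's upper part */
    }
}
```

Each off-diagonal element is loaded once but updates both y[i] and y[j]. On a CPU this halves memory traffic compared with a general matrix-vector product; on a GPU the two updates hit different parts of y from different threads, which is the kind of irregularity the abstract refers to.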
