https://hgpu.org/?p=6704
Sparse matrix-vector multiplication on GPGPU clusters: A new storage format and a scalable implementation