Scaleable Sparse Matrix-Vector Multiplication with Functional Memory and GPUs

Noboru Tanabe, Yuuka Ogawa, Masami Takata, Kazuki Joe
19th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP), 2011


@inproceedings{tanabe2011scaleable,
   title={Scaleable Sparse Matrix-Vector Multiplication with Functional Memory and GPUs},
   author={Tanabe, N. and Ogawa, Y. and Takata, M. and Joe, K.},
   booktitle={19th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP)},
   year={2011}
}








Sparse matrix-vector multiplication on GPUs faces a serious problem when the vector is too large to fit in the GPU's device memory. To solve this problem, we propose a novel software-hardware hybrid method for a heterogeneous system in which GPUs and functional memory modules are connected by PCI Express. The functional memory provides a huge memory capacity together with scatter/gather operations. We present a preliminary evaluation of the proposed method using a sparse matrix benchmark collection. We observe that the proposed method, which converts indirect references into direct references on a single GPU without exhausting the GPU's cache memory, achieves a 4.1x speedup over conventional methods. The proposed method is intrinsically scalable in the number of GPUs because inter-GPU communication is completely eliminated. We therefore estimate that the performance of the proposed method can be expressed as the single-GPU execution performance, which may be limited by the burst-transfer bandwidth of PCI Express, multiplied by the number of GPUs.
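The indirect-to-direct conversion described in the abstract can be sketched in plain Python. This is an illustrative host-side model, not the authors' implementation: a separate gather pass (standing in for the functional memory's gather operation) resolves every indirect reference x[col[j]] into a contiguous buffer up front, so the multiply phase (the GPU-side work) touches only direct, streaming addresses. All names and the CSR layout here are assumptions for illustration.

```python
def gather(x, col):
    """Gather phase (on functional memory): one indirect read per nonzero,
    performed up front, producing a contiguous operand buffer."""
    return [x[c] for c in col]

def spmv_direct(rowptr, val, xg):
    """Multiply phase (on the GPU): CSR SpMV using only direct, sequential
    references into val and the pre-gathered buffer xg."""
    y = []
    for i in range(len(rowptr) - 1):
        y.append(sum(val[j] * xg[j] for j in range(rowptr[i], rowptr[i + 1])))
    return y

# 2x3 matrix [[1, 0, 2], [0, 3, 0]] in CSR form, multiplied by x = [1, 2, 3]
rowptr = [0, 2, 3]
col    = [0, 2, 1]
val    = [1.0, 2.0, 3.0]
x      = [1.0, 2.0, 3.0]

xg = gather(x, col)                  # -> [1.0, 3.0, 2.0]
print(spmv_direct(rowptr, val, xg))  # -> [7.0, 6.0]
```

Because the gather removes all data-dependent addressing from the multiply loop, each GPU needs only its own gathered buffer, which is consistent with the paper's claim that inter-GPU communication can be eliminated entirely.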
