## Improving Numerical Accuracy for Non-Negative Matrix Multiplication on GPUs using Recursive Algorithms

University of California Irvine, Irvine, CA 92697

International Conference on Supercomputing (ICS), 2013

@article{badin2013improving,
  title={Improving Numerical Accuracy for Non-Negative Matrix Multiplication on GPUs using Recursive Algorithms},
  author={Badin, Matthew and D'Alberto, Paolo and Bic, Lubomir and Dillencourt, Michael and Nicolau, Alexandru},
  year={2013}
}

Scientific computing is bound only by the limits of Moore’s Law and the scalability of high-performance mathematical library implementations. Most mathematical libraries, however, focus only on general inputs, which limits their potential performance and scalability because the implementation is not tailored to specific inputs, such as non-negative inputs. By removing this limitation it is possible to improve the performance and accuracy of a range of problems. In this paper we explore the limitations of hardware in order to improve the accuracy of non-negative matrix multiply, specifically by comparing implementations on the GPU and CPU, and propose algorithmic solutions to improve accuracy. Next, we demonstrate a matrix multiply implementation that takes advantage of asymptotically fast matrix multiply algorithms, which have been shown to scale better than O(N^3) matrix multiply implementations, and that improves accuracy by up to a whole digit while increasing performance by up to 27% for matrices where the input is positive. Finally, we propose extending the BLAS level 3 specification to non-negative matrices, both to allow easy integration of our solution and to allow other library authors to implement their own solutions as part of an existing standard.
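To give a feel for why a recursive formulation can help accuracy when all inputs are non-negative, here is a minimal Python sketch. It is not the paper's implementation and does not use an asymptotically fast algorithm. As an assumption for illustration, it swaps in plain recursive halving of the inner dimension (pairwise accumulation) and compares it with left-to-right accumulation. Single precision is simulated with the standard `struct` module, and the names `f32`, `dot_naive`, `dot_recursive` and `matmul` are hypothetical.

```python
import struct


def f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def dot_naive(a, b):
    """Left-to-right accumulation, rounding to single precision each step."""
    acc = 0.0
    for x, y in zip(a, b):
        acc = f32(acc + f32(x * y))
    return acc


def dot_recursive(a, b, lo=0, hi=None):
    """Recursive halving of the inner dimension (pairwise accumulation).

    Illustrative stand-in only: partial sums of similar magnitude are
    combined, so small non-negative terms are less likely to be absorbed.
    """
    if hi is None:
        hi = len(a)
    if hi == lo:
        return 0.0
    if hi - lo == 1:
        return f32(a[lo] * b[lo])
    mid = (lo + hi) // 2
    return f32(dot_recursive(a, b, lo, mid) + dot_recursive(a, b, mid, hi))


def matmul(A, B, dot):
    """Multiply row-major matrices A (m x k) and B (k x n) with the given dot."""
    cols = list(zip(*B))
    return [[dot(row, col) for col in cols] for row in A]


# Non-negative test case: one entry of 1.0 followed by many tiny entries.
n = 1024
t = 2.0 ** -25                      # below half an ulp of 1.0 in binary32
A = [[1.0] + [t] * (n - 1)]         # 1 x n
B = [[1.0] for _ in range(n)]       # n x 1 (column of ones)

exact = 1.0 + (n - 1) * t           # exactly representable in binary64
naive = matmul(A, B, dot_naive)[0][0]
recursive = matmul(A, B, dot_recursive)[0][0]

err_naive = abs(naive - exact)
err_recursive = abs(recursive - exact)

print("naive error:    ", err_naive)
print("recursive error:", err_recursive)
```

In this constructed case the left-to-right sum absorbs every tiny term into 1.0, while the recursive version first combines the tiny terms with each other and loses only a few of them. Real inputs will show smaller and less clear-cut differences, and the gains reported in the abstract come from the authors' own algorithms and GPU implementation, not from this sketch.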

April 30, 2013 by hgpu