Analysis of A Splitting Approach for the Parallel Solution of Linear Systems on GPU Cards
Electrical and Computer Engineering, University of Wisconsin-Madison, Madison, WI 53706
arXiv:1509.07919 [cs.DC], (25 Sep 2015)
@article{li2015analysis,
title={Analysis of A Splitting Approach for the Parallel Solution of Linear Systems on GPU Cards},
author={Li, Ang and Serban, Radu and Negrut, Dan},
year={2015},
month={sep},
archivePrefix={arXiv},
eprint={1509.07919},
primaryClass={cs.DC}
}
We discuss an approach for solving sparse or dense banded linear systems ${\bf A} {\bf x} = {\bf b}$ on a Graphics Processing Unit (GPU) card. The matrix ${\bf A} \in {\mathbb{R}}^{N \times N}$ is possibly nonsymmetric and moderately large; i.e., $10000 \leq N \leq 500000$. The ${\it split and parallelize}$ (${\tt SaP}$) approach seeks to partition the matrix ${\bf A}$ into diagonal sub-blocks ${\bf A}_i$, $i=1,\ldots,P$, which are independently factored in parallel. The solution may choose to consider or to ignore the matrices that couple the diagonal sub-blocks ${\bf A}_i$. This approach, along with the Krylov subspace-based iterative method that it preconditions, is implemented in a solver called ${\tt SaP::GPU}$, which is compared in terms of efficiency with three commonly used sparse direct solvers: ${\tt PARDISO}$, ${\tt SuperLU}$, and ${\tt MUMPS}$. ${\tt SaP::GPU}$, which runs entirely on the GPU except for several stages involved in preliminary row-column permutations, is robust and compares well in terms of efficiency with the aforementioned direct solvers. In a comparison against Intel’s ${\tt MKL}$, ${\tt SaP::GPU}$ also fares well when used to solve dense banded systems that are close to being diagonally dominant. ${\tt SaP::GPU}$ is publicly available and distributed as open source under a permissive BSD3 license.
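The decoupled variant described in the abstract (factor the diagonal sub-blocks independently, ignore the coupling matrices, and use the result to precondition a Krylov iteration) can be illustrated with a minimal CPU-only sketch. Everything here is an assumption made for illustration and is not the SaP::GPU API: the function names `thomas_solve`, `block_precondition`, and `bicgstab` are hypothetical, the matrix is restricted to a tridiagonal, diagonally dominant band so that each block can be solved without pivoting, BiCGStab stands in for the Krylov method, and the blocks are processed sequentially rather than in parallel on a GPU.

```python
import math

def matvec(lower, diag, upper, x):
    """y = A x for tridiagonal A; lower[i] = A[i][i-1], upper[i] = A[i][i+1]."""
    n = len(diag)
    y = [diag[i] * x[i] for i in range(n)]
    for i in range(1, n):
        y[i] += lower[i] * x[i - 1]
        y[i - 1] += upper[i - 1] * x[i]
    return y

def thomas_solve(lower, diag, upper, rhs):
    """Tridiagonal solve without pivoting (assumes diagonal dominance)."""
    n = len(diag)
    c, d = [0.0] * n, [0.0] * n
    c[0], d[0] = upper[0] / diag[0], rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def block_precondition(lower, diag, upper, r, num_blocks):
    """Apply M^{-1} r, where M keeps only the diagonal sub-blocks A_i.

    Slicing drops the entries that couple neighbouring blocks, so each
    sub-block is solved independently (sequentially here; in parallel
    in a GPU setting).
    """
    n = len(diag)
    z = []
    for k in range(num_blocks):
        s, e = k * n // num_blocks, (k + 1) * n // num_blocks
        z.extend(thomas_solve(lower[s:e], diag[s:e], upper[s:e], r[s:e]))
    return z

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def bicgstab(apply_a, apply_m_inv, b, tol=1e-10, max_iter=100):
    """Preconditioned BiCGStab starting from x = 0; returns (x, iterations)."""
    n = len(b)
    x, r = [0.0] * n, list(b)
    r_hat = list(r)
    rho = alpha = omega = 1.0
    v, p = [0.0] * n, [0.0] * n
    target = tol * math.sqrt(dot(b, b))
    for it in range(1, max_iter + 1):
        rho_new = dot(r_hat, r)
        beta = (rho_new / rho) * (alpha / omega)
        p = [r[i] + beta * (p[i] - omega * v[i]) for i in range(n)]
        y = apply_m_inv(p)
        v = apply_a(y)
        alpha = rho_new / dot(r_hat, v)
        s = [r[i] - alpha * v[i] for i in range(n)]
        if math.sqrt(dot(s, s)) <= target:
            return [x[i] + alpha * y[i] for i in range(n)], it
        z = apply_m_inv(s)
        t = apply_a(z)
        omega = dot(t, s) / dot(t, t)
        x = [x[i] + alpha * y[i] + omega * z[i] for i in range(n)]
        r = [s[i] - omega * t[i] for i in range(n)]
        rho = rho_new
        if math.sqrt(dot(r, r)) <= target:
            return x, it
    raise RuntimeError("BiCGStab did not converge")

if __name__ == "__main__":
    # Hypothetical test problem: nonsymmetric, diagonally dominant band.
    n, P = 64, 4
    lower, diag, upper = [-1.0] * n, [4.0] * n, [-2.0] * n
    b = matvec(lower, diag, upper, [1.0] * n)
    x, iters = bicgstab(lambda u: matvec(lower, diag, upper, u),
                        lambda u: block_precondition(lower, diag, upper, u, P),
                        b)
    print("iterations:", iters)
```

With a single block the preconditioner is an exact solve and the iteration stops immediately; increasing the number of blocks exposes more independent work at the price of a weaker preconditioner, which is the trade-off the decoupled strategy relies on for systems close to diagonally dominant.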
September 30, 2015 by hgpu