high performance computing on graphics processing units: hgpu.org


Analysis of A Splitting Approach for the Parallel Solution of Linear Systems on GPU Cards

Ang Li, Radu Serban, Dan Negrut

Electrical and Computer Engineering, University of Wisconsin-Madison, Madison, WI 53706

arXiv:1509.07919 [cs.DC], (25 Sep 2015)

@article{li2015analysis,
  title={Analysis of A Splitting Approach for the Parallel Solution of Linear Systems on GPU Cards},
  author={Li, Ang and Serban, Radu and Negrut, Dan},
  year={2015},
  month={sep},
  archivePrefix={"arXiv"},
  primaryClass={cs.DC}
}


Package: SaP GPU


We discuss an approach for solving sparse or dense banded linear systems $\mathbf{A}\mathbf{x} = \mathbf{b}$ on a Graphics Processing Unit (GPU) card. The matrix $\mathbf{A} \in \mathbb{R}^{N \times N}$ is possibly nonsymmetric and moderately large; i.e., $10000 \leq N \leq 500000$. The split and parallelize (SaP) approach seeks to partition the matrix $\mathbf{A}$ into diagonal sub-blocks $\mathbf{A}_i$, $i=1,\ldots,P$, which are independently factored in parallel. The solution may either account for or ignore the matrices that couple the diagonal sub-blocks $\mathbf{A}_i$. This approach, along with the Krylov subspace-based iterative method that it preconditions, is implemented in a solver called SaP::GPU, which is compared in terms of efficiency with three commonly used sparse direct solvers: PARDISO, SuperLU, and MUMPS. SaP::GPU, which runs entirely on the GPU except for several stages involved in preliminary row-column permutations, is robust and compares well in terms of efficiency with the aforementioned direct solvers. In a comparison against Intel's MKL, SaP::GPU also fares well when used to solve dense banded systems that are close to being diagonally dominant. SaP::GPU is publicly available and distributed as open source under a permissive BSD3 license.
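The splitting idea in the abstract can be illustrated with a small CPU-only sketch. It is not the SaP::GPU implementation: it covers only the variant that ignores the coupling matrices, and it rests on several assumptions made here for illustration. The Krylov method is taken to be BiCGStab (the abstract says only "Krylov subspace-based"), the diagonal blocks are factored by LU without pivoting, the blocks are processed in a serial loop rather than in parallel, and all function names and the test matrix are hypothetical.

```python
import math


def lu_factor(block):
    """Doolittle LU without pivoting; assumes nonzero pivots (e.g. diagonal dominance)."""
    n = len(block)
    lu = [row[:] for row in block]
    for k in range(n):
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu


def lu_solve(lu, rhs):
    """Forward substitution with unit-lower L, then backward substitution with U."""
    n = len(lu)
    y = rhs[:]
    for i in range(n):
        for j in range(i):
            y[i] -= lu[i][j] * y[j]
    for i in reversed(range(n)):
        for j in range(i + 1, n):
            y[i] -= lu[i][j] * y[j]
        y[i] /= lu[i][i]
    return y


def factor_diagonal_blocks(a, p):
    """Split A into p diagonal sub-blocks A_i and factor each one independently."""
    n = len(a)
    bounds = [n * i // p for i in range(p + 1)]
    return [(lo, hi, lu_factor([row[lo:hi] for row in a[lo:hi]]))
            for lo, hi in zip(bounds, bounds[1:])]


def apply_preconditioner(factors, r):
    """Solve with the block-diagonal part only; the coupling blocks are ignored."""
    z = [0.0] * len(r)
    for lo, hi, lu in factors:  # independent solves, parallel on a real GPU
        z[lo:hi] = lu_solve(lu, r[lo:hi])
    return z


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def matvec(a, x):
    return [dot(row, x) for row in a]


def bicgstab(a, b, factors, tol=1e-10, max_iter=100):
    """Preconditioned BiCGStab; returns (solution, iterations used)."""
    n = len(b)
    x, r = [0.0] * n, b[:]
    limit = tol * math.sqrt(dot(b, b))
    if limit == 0.0:
        return x, 0
    r_hat = r[:]
    rho = alpha = omega = 1.0
    v, p = [0.0] * n, [0.0] * n
    for it in range(1, max_iter + 1):
        rho_new = dot(r_hat, r)
        beta = (rho_new / rho) * (alpha / omega)
        p = [ri + beta * (pi - omega * vi) for ri, pi, vi in zip(r, p, v)]
        y = apply_preconditioner(factors, p)
        v = matvec(a, y)
        alpha = rho_new / dot(r_hat, v)
        s = [ri - alpha * vi for ri, vi in zip(r, v)]
        if math.sqrt(dot(s, s)) <= limit:
            return [xi + alpha * yi for xi, yi in zip(x, y)], it
        z = apply_preconditioner(factors, s)
        t = matvec(a, z)
        omega = dot(t, s) / dot(t, t)
        x = [xi + alpha * yi + omega * zi for xi, yi, zi in zip(x, y, z)]
        r = [si - omega * ti for si, ti in zip(s, t)]
        if math.sqrt(dot(r, r)) <= limit:
            return x, it
        rho = rho_new
    return x, max_iter


def banded_matrix(n, k):
    """Hypothetical test case: nonsymmetric, half-bandwidth k, diagonally dominant."""
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(max(0, i - k), min(n, i + k + 1)):
            a[i][j] = 4.0 * k if i == j else (-1.0 if j > i else 0.5)
    return a


a = banded_matrix(12, 2)
x_true = [float(i + 1) for i in range(12)]
b = matvec(a, x_true)
x, iterations = bicgstab(a, b, factor_diagonal_blocks(a, 3))
print(iterations, max(abs(xi - ti) for xi, ti in zip(x, x_true)))
```

Dropping the coupling blocks makes the preconditioner only an approximation of $\mathbf{A}^{-1}$, so the outer Krylov iteration has to recover the neglected coupling. This is plausible when the matrix is close to diagonally dominant, as in the test matrix above, and less so otherwise.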

Tags: Computer science, CUDA, Linear Algebra, nVidia, Package, Sparse direct solvers, Tesla K20

September 30, 2015 by hgpu
