high performance computing on graphics processing units: hgpu.org

Multifrontal Sparse Matrix Factorization on Graphics Processing Units

Robert F. Lucas, Gene Wagenbreth, John J. Tran, Dan M. Davis

Information Sciences Institute, University of Southern California, Marina del Rey, California

University of Southern California, Technical Report ISI-TR-677, 2012

For many finite element problems represented as sparse matrices, iterative solvers prove unreliable because they can impose computational bottlenecks. Early pioneering work by Duff et al. explored an alternative strategy called multifrontal sparse matrix factorization. By representing the sparse problem as a tree of dense systems, this approach maps well to modern memory hierarchies. Furthermore, it promotes the effective use of highly tuned dense matrix arithmetic kernels, especially on novel and specialized hardware. Graphics processing units (GPUs) are architected differently from general-purpose hosts and offer an order of magnitude more processing power. This paper explores the hypothesis that GPUs can accelerate a multifrontal linear solver, even when processing only a small number of the largest frontal matrices. We show that GPUs can more than double the throughput of the sparse matrix factorization. This, in turn, promises a very cost-effective speedup for many science and engineering problems in disciplines such as Mechanical Computer Aided Engineering (MCAE).
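To make the "tree of dense systems" idea concrete, here is a minimal sketch of the dense step performed on a single frontal matrix: eliminate the first k pivots and form the Schur complement (the update matrix passed to the parent in the tree). It is an illustration only, not the authors' code. It assumes a symmetric positive definite front and uses Cholesky-style elimination in plain Python; the function name `partial_cholesky` and its arguments are hypothetical.

```python
import math

def partial_cholesky(front, k):
    """Eliminate the first k pivots of a symmetric positive definite front.

    Assumption: `front` is a dense n x n list of lists. Returns
    (l_cols, schur), where l_cols holds the n x k factor columns and
    schur is the (n-k) x (n-k) update matrix for the parent front.
    """
    n = len(front)
    a = [row[:] for row in front]  # work on a copy of the front
    for j in range(k):
        pivot = math.sqrt(a[j][j])
        # Scale column j to obtain column j of the factor.
        for i in range(j, n):
            a[i][j] /= pivot
        # Rank-1 update of the trailing submatrix; this dense update is
        # where most of the arithmetic happens for large fronts.
        for c in range(j + 1, n):
            for r in range(c, n):
                a[r][c] -= a[r][j] * a[c][j]
                a[c][r] = a[r][c]  # keep the trailing block symmetric
    l_cols = [[a[i][j] if i >= j else 0.0 for j in range(k)] for i in range(n)]
    schur = [[a[i][j] for j in range(k, n)] for i in range(k, n)]
    return l_cols, schur

front = [[4.0, 2.0, 2.0],
         [2.0, 5.0, 3.0],
         [2.0, 3.0, 6.0]]
l_cols, schur = partial_cholesky(front, 1)
print(l_cols)  # [[2.0], [1.0], [1.0]]
print(schur)   # [[4.0, 2.0], [2.0, 5.0]]
```

In a full multifrontal solver, each child's Schur complement would be assembled into its parent's front before that front is factored in turn. The trailing update loop is the kind of dense operation the abstract suggests handing to tuned kernels on a GPU for the largest fronts.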

Tags: Computer science, CUDA, Factorization, nVidia, nVidia GeForce 8800, Sparse matrix

January 25, 2012 by hgpu
