high performance computing on graphics processing units: hgpu.org


High Performance Matrix Inversion on a Multi-core Platform with Several GPUs

Pablo Ezzatti, Enrique S. Quintana-Orti, Alfredo Remon

Centro de Calculo, Univ. de la Republica, Montevideo, Uruguay

19th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP), 2011

DOI:10.1109/PDP.2011.66

@inproceedings{ezzatti2011high,
  title={High Performance Matrix Inversion on a Multi-core Platform with Several GPUs},
  author={Ezzatti, P. and Quintana-Orti, E. S. and Remon, A.},
  booktitle={Parallel, Distributed and Network-Based Processing (PDP), 2011 19th Euromicro International Conference on},
  pages={87--93},
  organization={IEEE},
  year={2011}
}


Inversion of large-scale matrices appears in a few scientific applications, such as model reduction and optimal control. Matrix inversion demands considerable computational effort and therefore calls for high performance computing techniques and architectures when the matrix dimension is in the order of thousands. Following the recent rise of graphics processors (GPUs), we present and evaluate high performance codes for matrix inversion, based on Gauss-Jordan elimination with partial pivoting, which off-load the main computational kernels to one or more GPUs while performing fine-grain operations on the general-purpose processor. The target architecture consists of a multi-core processor connected to several GPUs. Parallelism is extracted from parallel implementations of BLAS and from the concurrent execution of operations in the available computational units. Numerical experiments on a system with two Intel QuadCore processors and four NVIDIA Tesla C1060 GPUs illustrate the efficiency and the scalability of the different implementations, which deliver over 1.2 × 10^12 floating point operations per second.
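For readers unfamiliar with the method named in the abstract, the following is a minimal sketch of matrix inversion by Gauss-Jordan elimination with partial pivoting. It is an unblocked, sequential, pure-Python illustration and not the authors' code: the paper's implementations are built on BLAS kernels and off-load work to GPUs, none of which is shown here. The function name `invert_gauss_jordan` and the `tol` parameter are hypothetical choices made for this example.

```python
def invert_gauss_jordan(a, tol=1e-12):
    """Invert a square matrix (list of lists) by Gauss-Jordan elimination
    with partial (row) pivoting. Illustrative sketch, not the paper's code."""
    n = len(a)
    if any(len(row) != n for row in a):
        raise ValueError("matrix must be square")

    # Build the augmented matrix [A | I]; the right half becomes A^-1.
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(a)
    ]

    for k in range(n):
        # Partial pivoting: choose the row at or below the diagonal whose
        # entry in column k has the largest magnitude.
        p = max(range(k, n), key=lambda i: abs(aug[i][k]))
        if abs(aug[p][k]) < tol:
            raise ValueError("matrix is singular to working precision")
        aug[k], aug[p] = aug[p], aug[k]

        # Scale the pivot row so the pivot becomes 1.
        pivot = aug[k][k]
        aug[k] = [v / pivot for v in aug[k]]

        # Eliminate column k from every other row (above and below),
        # which is what distinguishes Gauss-Jordan from plain Gaussian
        # elimination.
        for i in range(n):
            if i != k:
                factor = aug[i][k]
                if factor != 0.0:
                    aug[i] = [vi - factor * vk
                              for vi, vk in zip(aug[i], aug[k])]

    # The left half is now the identity; return the right half.
    return [row[n:] for row in aug]
```

As a usage example, `invert_gauss_jordan([[4, 7], [2, 6]])` yields a matrix close to `[[0.6, -0.7], [-0.2, 0.4]]` up to floating-point rounding. In a high performance setting the row updates in the inner loop would presumably be grouped into blocks and expressed as matrix-matrix products, which is the kind of operation BLAS libraries and GPUs handle efficiently.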

Tags: BLAS, Computer science, CUDA, Linear Algebra, Matrix inversion, nVidia, Presentation, Tesla C1060

July 5, 2011 by hgpu

