High Performance Matrix Inversion on a Multi-core Platform with Several GPUs
Centro de Calculo, Univ. de la Republica, Montevideo, Uruguay
19th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP), 2011
@inproceedings{ezzatti2011high,
title={High Performance Matrix Inversion on a Multi-core Platform with Several GPUs},
author={Ezzatti, P. and Quintana-Orti, E. S. and Remon, A.},
booktitle={Parallel, Distributed and Network-Based Processing (PDP), 2011 19th Euromicro International Conference on},
pages={87--93},
organization={IEEE},
year={2011}
}
Inversion of large-scale matrices appears in a few scientific applications, such as model reduction and optimal control. Matrix inversion requires considerable computational effort and, for matrices with dimension in the order of thousands, calls for high performance computing techniques and architectures. Following the recent rise of graphics processors (GPUs), we present and evaluate high performance codes for matrix inversion, based on Gauss-Jordan elimination with partial pivoting, which off-load the main computational kernels to one or more GPUs while performing fine-grain operations on the general-purpose processor. The target architecture consists of a multi-core processor connected to several GPUs. Parallelism is extracted from parallel implementations of BLAS and from the concurrent execution of operations in the available computational units. Numerical experiments on a system with two Intel QuadCore processors and four NVIDIA C1060 GPUs illustrate the efficiency and the scalability of the different implementations, which deliver over 1.2 × 10^12 floating point operations per second.
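As an illustration of the method named in the abstract, the following is a minimal sketch of matrix inversion by Gauss-Jordan elimination with partial pivoting. It is not the authors' code: it is an unblocked, pure-Python version that works on an augmented matrix [A | I], whereas the paper describes codes built on BLAS kernels and off-loaded to GPUs. The function name `invert_gauss_jordan` and the list-of-lists representation are assumptions made for this example only.

```python
def invert_gauss_jordan(a):
    """Invert a square matrix given as a list of lists (illustrative sketch).

    Uses Gauss-Jordan elimination with partial pivoting on the augmented
    matrix [A | I]. This is an unblocked, CPU-only version for exposition.
    """
    n = len(a)
    # Build the augmented matrix [A | I] without modifying the input.
    m = [
        [float(x) for x in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(a)
    ]
    for col in range(n):
        # Partial pivoting: choose the row at or below the diagonal with the
        # largest absolute entry in the current column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        # Scale the pivot row so the diagonal entry becomes 1.
        p = m[col][col]
        m[col] = [x / p for x in m[col]]
        # Eliminate the current column from every other row, above and
        # below the diagonal (this is what distinguishes Gauss-Jordan
        # from plain Gaussian elimination).
        for r in range(n):
            if r != col and m[r][col] != 0.0:
                f = m[r][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    # The right half of the augmented matrix now holds the inverse.
    return [row[n:] for row in m]


if __name__ == "__main__":
    for row in invert_gauss_jordan([[4.0, 7.0], [2.0, 6.0]]):
        print(row)
```

Because every elimination step updates all rows rather than only those below the diagonal, the amount of work per step stays constant throughout the procedure, which is one reason the method is commonly regarded as a good fit for parallel hardware. A production version would operate in place on blocks of columns rather than on an explicit augmented matrix.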
July 5, 2011 by hgpu