Heterogenous Acceleration for Linear Algebra in Multi-Coprocessor Environments
University of Tennessee Knoxville, USA
University of Tennessee Knoxville, Technical report UT-EECS-14-724, 2014
@techreport{haidar2014heterogenous,
title={Heterogenous Acceleration for Linear Algebra in Multi-Coprocessor Environments},
author={Haidar, Azzam and Luszczek, Piotr and Tomov, Stanimire and Dongarra, Jack},
institution={University of Tennessee Knoxville},
number={UT-EECS-14-724},
year={2014}
}
We present an efficient and scalable programming model for the development of linear algebra in heterogeneous multi-coprocessor environments. The model incorporates some of the current best design and implementation practices for the heterogeneous acceleration of dense linear algebra (DLA). Examples are given for the factorizations that underlie linear system solvers: LU, QR, and Cholesky. To generate the extreme level of parallelism needed for the efficient use of coprocessors, the algorithms of interest are redesigned and then split into well-chosen computational tasks. Task execution is scheduled over the computational components of a hybrid system of multi-core CPUs and coprocessors using a lightweight runtime system. The lightweight runtime keeps scheduling overhead low while enabling the expression of parallelism through otherwise sequential code. This simplifies development and allows exploration of the unique strengths of the various hardware components.
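The idea of expressing parallelism through otherwise sequential code can be illustrated with a minimal sketch. It is not the runtime described in the report: `TinyRuntime`, `insert_task`, and the `reads`/`writes` annotations are hypothetical names chosen for illustration, and each "tile" is a single scalar to keep the kernels short (a real DLA code would call a BLAS/LAPACK kernel on an nb-by-nb block). The Cholesky loops are written in ordinary sequential order; the toy runtime infers dependencies from the declared tile accesses and groups independent tasks into waves that could run concurrently.

```python
import math


class TinyRuntime:
    """Hypothetical toy runtime: records tasks in program order and derives
    dependencies from the tiles each task declares it reads and writes."""

    def __init__(self):
        self.tasks = []        # (name, func) in insertion order
        self.deps = []         # deps[i] = indices of tasks that task i must wait for
        self.last_writer = {}  # tile -> index of the last task that wrote it
        self.readers = {}      # tile -> tasks that read it since the last write

    def insert_task(self, name, func, reads=(), writes=()):
        idx = len(self.tasks)
        deps = set()
        for tile in list(reads) + list(writes):
            if tile in self.last_writer:            # read-after-write, write-after-write
                deps.add(self.last_writer[tile])
        for tile in writes:
            deps.update(self.readers.get(tile, ()))  # write-after-read
        for tile in reads:
            self.readers.setdefault(tile, []).append(idx)
        for tile in writes:
            self.last_writer[tile] = idx
            self.readers[tile] = []
        self.tasks.append((name, func))
        self.deps.append(deps)

    def run(self):
        """Execute tasks wave by wave; tasks within a wave are mutually independent,
        so a real runtime could dispatch them to different devices at once."""
        done, waves = set(), []
        while len(done) < len(self.tasks):
            wave = [i for i in range(len(self.tasks))
                    if i not in done and self.deps[i] <= done]
            for i in wave:
                self.tasks[i][1]()
            done.update(wave)
            waves.append([self.tasks[i][0] for i in wave])
        return waves


def cholesky_tasks(A, rt):
    """Right-looking Cholesky written as plain sequential loops that insert tasks.
    The lower triangle of A (a list of lists) is overwritten with L."""
    n = len(A)

    def potrf(k):          # factor the diagonal tile
        A[k][k] = math.sqrt(A[k][k])

    def trsm(i, k):        # solve for the panel tile below the diagonal
        A[i][k] /= A[k][k]

    def update(i, j, k):   # trailing-matrix update (SYRK when i == j, GEMM otherwise)
        A[i][j] -= A[i][k] * A[j][k]

    for k in range(n):
        rt.insert_task(f"POTRF({k})", lambda k=k: potrf(k), writes=[(k, k)])
        for i in range(k + 1, n):
            rt.insert_task(f"TRSM({i},{k})", lambda i=i, k=k: trsm(i, k),
                           reads=[(k, k)], writes=[(i, k)])
        for i in range(k + 1, n):
            for j in range(k + 1, i + 1):
                rt.insert_task(f"UPDATE({i},{j},{k})",
                               lambda i=i, j=j, k=k: update(i, j, k),
                               reads=[(i, k), (j, k)], writes=[(i, j)])


# Small symmetric positive definite example matrix (illustrative data only).
A = [[4.0, 2.0, 2.0],
     [2.0, 5.0, 3.0],
     [2.0, 3.0, 6.0]]
rt = TinyRuntime()
cholesky_tasks(A, rt)   # only records tasks; nothing is computed yet
waves = rt.run()        # dependencies decide what may run together
for step, wave in enumerate(waves):
    print(step, wave)
```

The algorithm code contains no explicit synchronization: the two `TRSM` tasks of the first panel, and the three trailing updates that follow them, land in the same wave purely because their declared tile accesses do not conflict. A production runtime would replace the wave loop with asynchronous dispatch to CPU cores and coprocessors.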
February 28, 2014 by hgpu