high performance computing on graphics processing units: hgpu.org

Sparse direct solvers with accelerators over DAG runtimes

Xavier Lacoste, Pierre Ramet, Mathieu Faverge, Ichitaro Yamazaki, Jack Dongarra

INRIA, University of Bordeaux, Bordeaux, France

hal-00700066, 2012

@techreport{lacoste:hal-00700066,
  hal_id={hal-00700066},
  url={http://hal.inria.fr/hal-00700066},
  title={Sparse direct solvers with accelerators over DAG runtimes},
  author={Lacoste, Xavier and Ramet, Pierre and Faverge, Mathieu and Yamazaki, Ichitaro and Dongarra, Jack},
  affiliation={BACCHUS – INRIA Bordeaux – Sud-Ouest , Laboratoire Bordelais de Recherche en Informatique – LaBRI , Innovative Computing Laboratory – ICL},
  pages={11},
  type={Rapport de recherche},
  institution={INRIA},
  number={RR-7972},
  year={2012},
  pdf={http://hal.inria.fr/hal-00700066/PDF/RR-7972.pdf}
}

The current trend in high performance computing is a dramatic increase in the number of cores on shared-memory compute nodes. Algorithms, especially those related to linear algebra, need to be adapted to these new architectures in order to be efficient. PASTIX is a parallel sparse direct solver that incorporates a dynamic scheduler for strongly hierarchical modern architectures. In this paper, we study the replacement of this internal, highly integrated scheduling strategy with two generic runtime frameworks: DAGUE and STARPU. These runtimes make it possible to execute the factorization task graph on emerging computers equipped with accelerators. As in previous work on dense linear algebra, we present the kernels used for GPU computations, inspired by the MAGMA library, and the DAG algorithm used with the two runtimes. A comparative study of the performance of the supernodal solver with the three different schedulers is performed on manycore architectures, and the improvements obtained with accelerators are presented for the STARPU runtime. These results demonstrate that DAG runtimes provide uniform programming interfaces for obtaining high performance on different architectures for irregular problems such as sparse direct factorizations.
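To make the idea of a factorization task graph concrete, here is a minimal Python sketch. It is not taken from the report and does not use the PASTIX, DAGUE, or STARPU APIs. As a simplifying assumption it models a dense blocked Cholesky factorization rather than the sparse supernodal one studied in the paper, and the names `build_cholesky_dag` and `schedule` are hypothetical. The first function lists each task with its prerequisites; the second plays the role of a very simple runtime that releases a task once all of its prerequisites have finished.

```python
from collections import defaultdict, deque


def build_cholesky_dag(n_blocks):
    """Map each task to the set of tasks it must wait for.

    Assumed model: dense blocked Cholesky on an n_blocks x n_blocks grid.
    Tasks: ("POTRF", k), ("TRSM", i, k), ("UPDATE", i, j, k).
    """
    deps = {}
    for k in range(n_blocks):
        # Factor diagonal block (k, k) after its last update, if any.
        deps[("POTRF", k)] = {("UPDATE", k, k, k - 1)} if k > 0 else set()
        for i in range(k + 1, n_blocks):
            # Solve block (i, k) against the factored diagonal block,
            # after the last update applied to block (i, k).
            deps[("TRSM", i, k)] = {("POTRF", k)}
            if k > 0:
                deps[("TRSM", i, k)].add(("UPDATE", i, k, k - 1))
        for i in range(k + 1, n_blocks):
            for j in range(k + 1, i + 1):
                # Update block (i, j) with the two solved blocks of column k.
                # Updates of the same block are chained to avoid write conflicts.
                prereqs = {("TRSM", i, k), ("TRSM", j, k)}
                if k > 0:
                    prereqs.add(("UPDATE", i, j, k - 1))
                deps[("UPDATE", i, j, k)] = prereqs
    return deps


def schedule(deps):
    """Return one valid execution order using a ready queue (Kahn's algorithm)."""
    remaining = {task: len(prereqs) for task, prereqs in deps.items()}
    successors = defaultdict(list)
    for task, prereqs in deps.items():
        for p in prereqs:
            successors[p].append(task)
    ready = deque(task for task, count in remaining.items() if count == 0)
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)  # a real runtime would run the kernel on a CPU or GPU here
        for s in successors[task]:
            remaining[s] -= 1
            if remaining[s] == 0:
                ready.append(s)
    if len(order) != len(deps):
        raise ValueError("dependency cycle detected")
    return order


if __name__ == "__main__":
    for task in schedule(build_cholesky_dag(3)):
        print(task)
```

In this sketch every task that sits in the ready queue at the same time is independent of the others, which is the parallelism a DAG runtime can spread across cores and accelerators. A real runtime would also decide where each task runs and move data accordingly, which this sketch leaves out.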

Tags: Algorithms, Computer science, CUDA, Factorization, Linear Algebra, nVidia, Sparse direct solvers, Tesla T20

May 24, 2012 by hgpu
