Scheduling a Parallel Sparse Direct Solver to Multiple GPUs
Department of Aerospace Engineering and Engineering Mechanics, The University of Texas at Austin, Austin, TX, USA
The 14th IEEE Workshop on Parallel and Distributed Scientific and Engineering Computing, 2013
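@inproceedings{kim2013scheduling,
title={Scheduling a Parallel Sparse Direct Solver to Multiple GPUs},
author={Kim, Kyungjoo and Eijkhout, Victor},
booktitle={The 14th IEEE Workshop on Parallel and Distributed Scientific and Engineering Computing},
year={2013}
}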
We present a sparse direct solver that uses multilevel task scheduling on a modern heterogeneous compute node consisting of a multi-core host processor and multiple GPU accelerators. Our direct solver is based on the multifrontal method, which exploits dense subproblems (fronts) related to one another through an assembly tree. Critical to the high performance of the solver is dynamic task allocation, which accounts for the asymmetric performance of heterogeneous devices. Device-specific tasks are generated and adapted to different devices in the course of the multifrontal factorization using multilevel matrix partitioning. Large blocks provide coarse-grained tasks for fast devices, and some of these blocks are recursively partitioned to supply fine-grained tasks for the next available (slower) devices. Experimental results are obtained from problems arising from a high-order Finite Element Method.
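The two-level partitioning idea described in the abstract can be illustrated with a small sketch. This is not the authors' scheduler: the names `Block`, `partition`, and `TwoLevelQueue`, the block sizes, and the fast/slow device flag are all hypothetical, and the sketch ignores the data dependencies between tasks that a real multifrontal factorization must respect. It only shows how a dense front might be cut into coarse blocks for fast devices, with a coarse block split on demand when a slower device asks for work.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    row: int   # top-left row offset within the front
    col: int   # top-left column offset within the front
    size: int  # the block covers size x size entries


def partition(block, parts):
    """Split a square block into parts x parts equal square sub-blocks."""
    step = block.size // parts
    return [Block(block.row + i * step, block.col + j * step, step)
            for i in range(parts) for j in range(parts)]


class TwoLevelQueue:
    """Hypothetical task pool: coarse blocks, refined on demand (assumption)."""

    def __init__(self, front_size, coarse, fine):
        # Assumed for simplicity: sizes divide evenly.
        assert front_size % coarse == 0 and coarse % fine == 0
        self.fine = fine
        root = Block(0, 0, front_size)
        self.coarse_tasks = deque(partition(root, front_size // coarse))
        self.fine_tasks = deque()

    def next_task(self, device_is_fast):
        """Return the next block for a device, or None when no work is left."""
        if device_is_fast and self.coarse_tasks:
            # Fast devices (e.g. GPUs) take whole coarse blocks.
            return self.coarse_tasks.popleft()
        if not self.fine_tasks and self.coarse_tasks:
            # A slower device is waiting: refine one coarse block for it.
            big = self.coarse_tasks.pop()
            self.fine_tasks.extend(partition(big, big.size // self.fine))
        return self.fine_tasks.popleft() if self.fine_tasks else None


if __name__ == "__main__":
    queue = TwoLevelQueue(front_size=8, coarse=4, fine=2)
    print(queue.next_task(device_is_fast=True))   # a 4 x 4 coarse block
    print(queue.next_task(device_is_fast=False))  # a 2 x 2 fine block
```

Refining blocks lazily, only when a slower device requests work, keeps as many large blocks as possible available for the fast devices. A production scheduler would additionally track task dependencies and device memory transfers.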
February 21, 2013 by hgpu