high performance computing on graphics processing units: hgpu.org

Scheduling a Parallel Sparse Direct Solver to Multiple GPUs

Kyungjoo Kim, Victor Eijkhout

Department of Aerospace Engineering and Engineering Mechanics, The University of Texas at Austin, Austin, TX, USA

The 14th IEEE Workshop on Parallel and Distributed Scientific and Engineering Computing, 2013

@article{kim2013scheduling,
  title={Scheduling a Parallel Sparse Direct Solver to Multiple GPUs},
  author={Kim, Kyungjoo and Eijkhout, Victor},
  year={2013}
}

Package:

UHM: Parallel Multithreaded Un-assembled Hyper Matrix Sparse Direct Solver

We present a sparse direct solver that uses multilevel task scheduling on a modern heterogeneous compute node consisting of a multi-core host processor and multiple GPU accelerators. Our direct solver is based on the multifrontal method, which exploits dense subproblems (fronts) related through an assembly tree. Critical to the solver's performance is dynamic task allocation, which accounts for the asymmetric performance of heterogeneous devices. Device-specific tasks are generated and adapted to different devices in the course of multifrontal factorization using multi-level matrix partitioning. Large blocks provide coarse-grained tasks for fast devices, and some of the blocks are recursively partitioned to supply fine-grained tasks for the next available (slower) devices. Experimental results are obtained from problems arising from a high-order Finite Element Method.
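The coarse/fine partitioning idea in the abstract can be illustrated with a small toy scheduler. This is only a sketch under stated assumptions, not the UHM implementation: the names `Block`, `partition`, `next_fine`, and `schedule`, the `"gpu"`/`"cpu"` device labels, and the 2x2 splitting rule are all hypothetical, and the front size is assumed to be a multiple of the coarse block size, with the coarse size a power-of-two multiple of the fine size. No numerical factorization or task dependencies are modeled; the code only shows how large blocks might go to fast devices while slower devices receive recursively split pieces.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """A square sub-block of a dense front: top-left offset and edge length."""
    row: int
    col: int
    size: int

    def split(self):
        """Partition into four half-size blocks (assumes an even size)."""
        h = self.size // 2
        return [Block(self.row + i * h, self.col + j * h, h)
                for i in (0, 1) for j in (0, 1)]


def partition(front_size, block_size):
    """Tile a front_size x front_size front into coarse blocks."""
    return [Block(r, c, block_size)
            for r in range(0, front_size, block_size)
            for c in range(0, front_size, block_size)]


def next_fine(coarse_q, fine_q, fine):
    """Return one block of edge length <= fine, splitting recursively."""
    block = fine_q.pop() if fine_q else coarse_q.pop()
    while block.size > fine:
        block, *rest = block.split()
        fine_q.extend(rest)  # leftover pieces wait for later requests
    return block


def schedule(front_size, coarse, fine, device_requests):
    """Assign blocks in the order devices become available.

    device_requests is a sequence of "gpu" or "cpu" labels (hypothetical).
    A GPU takes a whole coarse block when one is left; a CPU core always
    receives a fine block, obtained by splitting if necessary.
    """
    coarse_q = deque(partition(front_size, coarse))
    fine_q = deque()
    assignment = []
    for dev in device_requests:
        if not coarse_q and not fine_q:
            break  # all work has been handed out
        if dev == "gpu" and coarse_q:
            block = coarse_q.popleft()
        elif dev == "gpu":
            block = fine_q.pop()  # only split leftovers remain
        else:
            block = next_fine(coarse_q, fine_q, fine)
        assignment.append((dev, block))
    return assignment


if __name__ == "__main__":
    for dev, block in schedule(8, 4, 2, ["gpu", "cpu", "cpu", "gpu"]):
        print(dev, block)
```

In this sketch the splitting happens lazily, only when a slower device asks for work, so coarse blocks stay intact for as long as a fast device might still claim them.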

Tags: Computer science, CUDA, Factorization, FEM, Finite element method, Heterogeneous systems, nVidia, Package, Sparse direct solvers, Task scheduling, Tesla M2070

February 21, 2013 by hgpu
