high performance computing on graphics processing units: hgpu.org


Scheduling a Parallel Sparse Direct Solver to Multiple GPUs

Kyungjoo Kim, Victor Eijkhout

Department of Aerospace Engineering and Engineering Mechanics, The University of Texas at Austin, Austin, TX, USA

The 14th IEEE Workshop on Parallel and Distributed Scientific and Engineering Computing, 2013

@article{kim2013scheduling,
  title={Scheduling a Parallel Sparse Direct Solver to Multiple GPUs},
  author={Kim, Kyungjoo and Eijkhout, Victor},
  year={2013}
}


Package: UHM: Parallel Multithreaded Un-assembled Hyper Matrix Sparse Direct Solver


We present a sparse direct solver that uses multilevel task scheduling on a modern heterogeneous compute node consisting of a multi-core host processor and multiple GPU accelerators. Our direct solver is based on the multifrontal method, which exploits dense subproblems (fronts) related through an assembly tree. Critical to the solver's high performance is dynamic task allocation, which accounts for the asymmetric performance of heterogeneous devices. Device-specific tasks are generated and adapted to different devices in the course of multifrontal factorization using multilevel matrix partitioning. Large blocks provide coarse-grained tasks for fast devices, and some of the blocks are recursively partitioned to supply fine-grained tasks for the next available (slower) devices. Experimental results are obtained for problems arising from a high-order Finite Element Method.
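The idea of handing coarse blocks to fast devices and recursively partitioned blocks to slower ones can be illustrated with a small simulation. The sketch below is not the UHM solver's scheduler: the names `Device`, `speed`, `block_size`, `take_block`, `split`, and `schedule` are hypothetical, the cost model (work proportional to block area) is a toy assumption, and the data dependencies between blocks that a real factorization must respect are ignored.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    name: str
    speed: float     # hypothetical relative throughput (work units per time unit)
    block_size: int  # hypothetical largest block edge this device should receive


def split(block):
    """Partition a square block (row, col, size) into four half-size sub-blocks."""
    r, c, n = block
    h = n // 2
    return [(r, c, h), (r, c + h, h), (r + h, c, h), (r + h, c + h, h)]


def take_block(pending, grain):
    """Pick the largest pending block that fits the grain, else the smallest one."""
    fitting = [b for b in pending if b[2] <= grain]
    if fitting:
        block = max(fitting, key=lambda b: b[2])
    else:
        block = min(pending, key=lambda b: b[2])
    pending.remove(block)
    return block


def schedule(front_size, coarse, devices):
    """Greedily assign blocks of one dense front to the next available device.

    Returns a list of (device_name, block, start_time, finish_time).
    Assumes front_size is a multiple of coarse, and that coarse and every
    block_size are powers of two with block_size >= 1.
    """
    # Level 1: partition the front into coarse blocks only.
    pending = [(r, c, coarse)
               for r in range(0, front_size, coarse)
               for c in range(0, front_size, coarse)]
    # Min-heap of (time the device becomes free, tie-breaker, device).
    free_at = [(0.0, i, d) for i, d in enumerate(devices)]
    heapq.heapify(free_at)
    log = []
    while pending:
        now, i, dev = heapq.heappop(free_at)
        block = take_block(pending, dev.block_size)
        # Level 2: recursively partition until the block fits this device.
        while block[2] > dev.block_size:
            *rest, block = split(block)
            pending.extend(rest)
        cost = block[2] ** 2 / dev.speed  # toy cost model: work ~ block area
        log.append((dev.name, block, now, now + cost))
        heapq.heappush(free_at, (now + cost, i, dev))
    return log


if __name__ == "__main__":
    devices = [Device("gpu0", 8.0, 4), Device("gpu1", 8.0, 4), Device("cpu", 1.0, 2)]
    for name, block, start, finish in schedule(8, 4, devices):
        print(f"{name}: block {block} from t={start} to t={finish}")
```

In this sketch a fine-grained block is created only when a slower device becomes available and no pending block fits it, so the fast devices keep receiving coarse blocks for as long as any remain.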

Tags: Computer science, CUDA, Factorization, FEM, Finite element method, Heterogeneous systems, nVidia, Package, Sparse direct solvers, Task scheduling, Tesla M2070

February 21, 2013 by hgpu
