high performance computing on graphics processing units: hgpu.org


A Data-oriented Method for Scheduling Dependent Tasks on High-density Multi-GPU Systems

Peng Zhang, Yuxiang Gao, Meikang Qiu

Biomedical Engineering Department, Stony Brook University, Stony Brook, NY, United States

IEEE 17th International Conference on High Performance Computing and Communications (HPCC-ICESS-CSS 2015), 2015


Rapidly changing computer architectures improve performance but challenge programming environments to harness their potential efficiently. In particular, the high-density multi-GPU architecture packs many GPUs into a single server for an unparalleled performance advantage, yet it makes scheduling diversified, dependent tasks more difficult. We therefore propose a data-oriented method for scheduling dependent tasks on this architecture and provide an implementation. In our method, we model a parallel program as a collection of data-dependent tasks whose data dependencies are managed by an expressive matrix. Accordingly, we develop a hierarchical scheduler infrastructure for this model: a top scheduler that queries the data-dependency matrix; three downstream schedulers that queue computation tasks assigned exclusively to the processor, exclusively to the accelerator, or to either; and a multitude of bottom schedulers, each supplying one processing element with its assigned tasks. We evaluate our scheduler on Strassen matrix multiplication and Cholesky matrix inversion on a computer with 8 Tesla K40 GPUs. The results show that our method offers efficient task parallelism while satisfying complex task dependencies. While advanced task-oriented schedulers have been widely designed for distributed systems, a lightweight data-driven scheduler could be a handy alternative for handling the dependent yet diversified tasks of data-intensive applications on novel high-density multi-accelerator systems.
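The abstract does not show the paper's implementation, so the following is only a minimal sketch of how a matrix-driven, three-queue scheduler of this general shape might look. Everything here is an assumption made for illustration: the encoding `deps[i][j] == 1` (task `i` needs data produced by task `j`), the class and function names (`MatrixScheduler`, `query_ready`, `next_task`, `complete`, `run`), the rule that a finished task clears its column, and the round-based simulation that stands in for real concurrent processing elements.

```python
from collections import deque

# Hypothetical affinity labels for the three downstream queues.
CPU, GPU, EITHER = "cpu", "gpu", "either"


class MatrixScheduler:
    def __init__(self, affinities, deps):
        # Assumed encoding: deps[i][j] == 1 means task i needs data from task j.
        self.affinities = affinities
        self.deps = [row[:] for row in deps]
        self.dispatched = [False] * len(affinities)
        self.queues = {CPU: deque(), GPU: deque(), EITHER: deque()}

    def query_ready(self):
        """Top level: enqueue every undispatched task whose row is all zeros."""
        for task, row in enumerate(self.deps):
            if not self.dispatched[task] and not any(row):
                self.dispatched[task] = True
                self.queues[self.affinities[task]].append(task)

    def next_task(self, kind):
        """Bottom level: an element of `kind` tries its own queue, then 'either'."""
        for queue in (self.queues[kind], self.queues[EITHER]):
            if queue:
                return queue.popleft()
        return None

    def complete(self, task):
        """Publish the task's output by clearing its column in the matrix."""
        for row in self.deps:
            row[task] = 0


def run(affinities, deps, elements):
    """Simulate in rounds; returns one list of (element, task) pairs per round."""
    sched = MatrixScheduler(affinities, deps)
    done, schedule = 0, []
    while done < len(affinities):
        sched.query_ready()
        batch = []
        for name, kind in elements:
            task = sched.next_task(kind)
            if task is not None:
                batch.append((name, task))
        if not batch:
            raise ValueError("cyclic dependency or no element can run the remaining tasks")
        for _, task in batch:
            sched.complete(task)
        done += len(batch)
        schedule.append(batch)
    return schedule


# Diamond-shaped example: task 0 feeds tasks 1 and 2, which both feed task 3.
example_deps = [
    [0, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [0, 1, 1, 0],
]
example_elements = [("cpu0", CPU), ("gpu0", GPU), ("gpu1", GPU)]
example_schedule = run([EITHER, GPU, GPU, CPU], example_deps, example_elements)
```

In this toy run, the two GPU-only tasks become ready together once task 0 finishes and are handed to different GPUs in the same round, which is the kind of task parallelism under dependencies that the abstract describes. A real scheduler would run the processing elements concurrently rather than in lockstep rounds.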

Tags: Algorithms, Computer science, CUDA, Matrix inversion, Matrix multiplication, nVidia, Task scheduling, Tesla K40

August 3, 2015 by hgpu

