high performance computing on graphics processing units: hgpu.org

Dynamic Scheduling for Work Agglomeration on Heterogeneous Clusters

Jonathan Lifflander, G. Carl Evans, Anshu Arya, Laxmikant V. Kale

Dept. of Computer Science, University of Illinois, Urbana-Champaign, United States

Proceedings of the Multicore and GPU Programming Models, Languages and Compilers Workshop (PLC’12) at IPDPS 2012

Dynamic scheduling and varying decomposition granularity are well-known techniques for achieving high performance in parallel computing. Heterogeneous clusters with highly data-parallel processors, such as GPUs, present unique problems for the application of these techniques. These systems reveal a dichotomy between grain sizes: decompositions ideal for the CPUs may yield insufficient data-parallelism for accelerators, and decompositions targeted at the GPU may decrease performance on the CPU. This problem is typically ameliorated by statically scheduling a fixed amount of work for agglomeration. However, determining the ideal amount of work to compose requires experimentation because it varies between architectures and problem configurations. This paper describes a novel methodology for dynamically agglomerating work units at runtime and scheduling them on accelerators. The approach is demonstrated in the context of two applications: an n-body particle simulation, which offloads particle interaction work, and a parallel dense LU solver, which relocates DGEMM kernels to the GPU. In both cases, dynamic agglomeration yields results comparable to or better than static scheduling of the work across a variety of system configurations.
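The difference between static and dynamic agglomeration can be illustrated with a small simulation. The sketch below is not the authors' method, which the abstract does not detail; it only assumes one plausible policy, in which fine-grained work units accumulate in a queue while the accelerator is busy and whatever has accumulated is launched as a single batch once it becomes free. The function `run` and the parameters `gpu_busy_steps` and `max_batch` are hypothetical names introduced for illustration.

```python
from collections import deque


def run(arrivals, gpu_busy_steps=2, max_batch=8):
    """Simulate dynamic agglomeration over discrete time steps.

    arrivals[t] is the number of fine-grained work units created at step t.
    Assumed policy (illustrative only): whenever the accelerator is idle,
    combine all pending units, up to max_batch, into one launch.
    Returns (list of launched batch sizes, number of units left pending).
    """
    pending = deque()   # work units waiting to be agglomerated
    busy_until = 0      # first step at which the accelerator is free again
    batches = []
    next_id = 0
    for t, n in enumerate(arrivals):
        # New CPU-sized work units arrive at this step.
        for _ in range(n):
            pending.append(next_id)
            next_id += 1
        # Launch only when the accelerator is idle and there is work.
        if t >= busy_until and pending:
            size = min(len(pending), max_batch)
            batch = [pending.popleft() for _ in range(size)]
            batches.append(len(batch))
            busy_until = t + gpu_busy_steps
    return batches, len(pending)


result = run([3, 1, 4, 0, 10, 0, 0])
print(result)
```

In this sketch the batch size is not fixed in advance: it follows from how much work arrived while the accelerator was occupied, which is the property that a statically chosen agglomeration size lacks.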

Tags: Computer science, CUDA, Heterogeneous systems, N-body simulation, nVidia, Particle simulation, Tesla S1070

March 9, 2012 by hgpu
