Scheduling Tasks over Multicore machines enhanced with Accelerators: a Runtime System’s Perspective

Cedric Augonnet

Universite de Bordeaux 1

Universite de Bordeaux 1, 2012

@article{augonnet2011scheduling,
  title={Scheduling Tasks over Multicore machines enhanced with Accelerators: a Runtime System’s Perspective},
  author={AUGONNET, C.},
  year={2011}
}

Multicore machines equipped with accelerators are becoming increasingly popular in the High Performance Computing ecosystem. Hybrid architectures provide significantly improved energy efficiency, and are therefore likely to become widespread in the manycore era. However, the complexity these architectures introduce has a direct impact on programmability, so it is crucial to provide portable abstractions in order to fully tap into the potential of these machines. Pure offloading approaches, which consist of running an application on regular processors while offloading predetermined parts of the code to accelerators, are not sufficient. The real challenge is to build systems in which the application is spread across the entire machine, that is, in which computation is dynamically scheduled over the full set of available processing units. In this thesis, we therefore propose a new task-based runtime system model specifically designed to address the numerous challenges introduced by hybrid architectures, especially in terms of task scheduling and data management. To demonstrate the relevance of this model, we designed the StarPU platform. It provides an expressive interface along with flexible task scheduling capabilities tightly coupled to efficient data management. Using these facilities, together with a database of auto-tuned per-task performance models, it becomes straightforward, for instance, to develop efficient scheduling policies that take into account both computation and communication costs. We show that our task-based model is powerful enough not only to provide support for clusters, but also to scale on hybrid manycore architectures. We analyze the performance of our approach on both synthetic and real-life workloads, and show that we obtain significant speedups and very high efficiency on various types of multicore platforms enhanced with accelerators.
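
The idea of a scheduling policy that weighs both computation and communication costs can be illustrated with a minimal sketch. This is not StarPU's API or the thesis's actual algorithm: the `Worker` and `Task` classes, the `perf_model` dictionary, the single `bandwidth` figure, and the greedy "earliest estimated finish" rule are all assumptions chosen for illustration. Each task goes to the worker expected to finish it first, where the estimate adds a data-transfer time to a per-task, per-architecture execution time.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    name: str
    arch: str                  # e.g. "cpu" or "gpu" (hypothetical labels)
    memory_node: str           # memory this worker computes from
    available_at: float = 0.0  # time at which the worker becomes idle


@dataclass
class Task:
    name: str
    kind: str           # key into the performance model
    data_bytes: float   # size of the input data
    data_location: str  # memory node currently holding the input


def estimated_finish(task, worker, perf_model, bandwidth):
    """Estimate when `worker` would finish `task`, or None if it cannot run it."""
    compute = perf_model.get((task.kind, worker.arch))
    if compute is None:
        return None  # no implementation of this task kind for this architecture
    # Data already resident on the worker's memory node costs nothing to move.
    if task.data_location == worker.memory_node:
        transfer = 0.0
    else:
        transfer = task.data_bytes / bandwidth
    return worker.available_at + transfer + compute


def schedule(tasks, workers, perf_model, bandwidth):
    """Greedily place each task on the worker with the earliest estimated finish."""
    placement = {}
    for task in tasks:
        best_worker, best_finish = None, None
        for worker in workers:
            finish = estimated_finish(task, worker, perf_model, bandwidth)
            if finish is None:
                continue
            if best_finish is None or finish < best_finish:
                best_worker, best_finish = worker, finish
        if best_worker is None:
            raise ValueError(f"no worker can run task {task.name!r}")
        best_worker.available_at = best_finish  # the worker is busy until then
        placement[task.name] = (best_worker.name, best_finish)
    return placement


if __name__ == "__main__":
    # Illustrative numbers only: the GPU is 10x faster but its data must be copied.
    workers = [Worker("cpu0", "cpu", "ram"), Worker("gpu0", "gpu", "gpu0_mem")]
    perf_model = {("gemm", "cpu"): 10.0, ("gemm", "gpu"): 1.0}
    tasks = [Task("A", "gemm", 2.0, "ram"), Task("B", "gemm", 20.0, "ram")]
    print(schedule(tasks, workers, perf_model, bandwidth=1.0))
```

In this toy setup the small task is worth shipping to the faster accelerator, while the large one stays on the CPU because the transfer would outweigh the faster kernel. A policy that looked at execution time alone would send both to the GPU.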

Tags: Cell processor, Computer science, CUDA, Databases, nVidia, nVidia Quadro FX 4600, nVidia Quadro FX 5800, Task scheduling, Tesla C2050, Tesla M2070, Tesla S1070, Thesis

March 29, 2012 by hgpu
