Smart Multi-Task Scheduling for OpenCL Programs on CPU/GPU Heterogeneous Platforms

Yuan Wen, Zheng Wang, Michael F.P. O’Boyle
School of Informatics, The University of Edinburgh
The 21st annual IEEE International Conference on High Performance Computing (HiPC 2014), 2014

@article{wen2014smart,
   title={Smart Multi-Task Scheduling for OpenCL Programs on CPU/GPU Heterogeneous Platforms},
   author={Wen, Yuan and Wang, Zheng and O’Boyle, Michael F.P.},
   year={2014}
}

Heterogeneous systems consisting of multiple CPUs and GPUs are increasingly attractive as platforms for high-performance computing. Such platforms are usually programmed using OpenCL, which provides program portability by allowing the same program to execute on different types of device. As such systems become more mainstream, they will move from application-dedicated devices to platforms that need to support multiple concurrent user applications. Here there is a need to determine when and where to map different applications so as to best utilize the available heterogeneous hardware resources. In this paper, we present an efficient OpenCL task scheduling scheme that schedules multiple kernels from multiple programs on CPU/GPU heterogeneous platforms. It does this by determining at runtime which kernels are likely to best utilize a device. We show that speedup is a good scheduling priority function and develop a novel model that predicts a kernel’s speedup based on its static code structure. Our scheduler uses this prediction and runtime input data size to prioritize and schedule tasks. This technique is applied to a large set of concurrent OpenCL kernels. We evaluated our approach for system throughput and average turnaround time against competitive techniques on two different platforms: a Core i7/NVIDIA GTX590 platform and a Core i7/AMD Tahiti 7970 platform. For system throughput, we achieve, on average, a 1.21x and 1.25x improvement over the best competitors on the NVIDIA and AMD platforms respectively. Our approach reduces the turnaround time, on average, by at least 1.5x and 1.2x on the NVIDIA and AMD platforms respectively, when compared to alternative approaches.
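
To make the idea of using speedup as a scheduling priority concrete, here is a minimal sketch in Python. It is not the paper's scheduler or prediction model: the `Task` class, the `schedule` function, the two-device setup, and the greedy rule (an idle GPU takes the pending task with the highest predicted speedup, an idle CPU takes the one with the lowest) are all assumptions made for illustration. The predicted speedup is supplied directly as an input rather than derived from static code features and input data size.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical kernel description; field names are illustrative only."""
    name: str
    cpu_time: float           # assumed run time on the CPU, in seconds
    predicted_speedup: float  # assumed model output: cpu_time / gpu_time

    @property
    def gpu_time(self) -> float:
        # Run time on the GPU implied by the predicted speedup.
        return self.cpu_time / self.predicted_speedup


def schedule(tasks):
    """Greedy two-device schedule.

    Returns a dict mapping task name to (device, start, finish).
    """
    # Ascending by predicted speedup: the CPU pops from the front,
    # the GPU pops from the back.
    pending = sorted(tasks, key=lambda t: t.predicted_speedup)
    free_at = {"cpu": 0.0, "gpu": 0.0}
    plan = {}
    while pending:
        # The device that becomes idle first picks next (ties go to the GPU).
        device = min(("gpu", "cpu"), key=lambda d: free_at[d])
        task = pending.pop() if device == "gpu" else pending.pop(0)
        run_time = task.gpu_time if device == "gpu" else task.cpu_time
        start = free_at[device]
        free_at[device] = start + run_time
        plan[task.name] = (device, start, free_at[device])
    return plan
```

Calling `schedule` on a list of `Task` objects returns the device and time window chosen for each task, from which throughput and average turnaround time can be computed for comparison against other priority rules.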