Auto-tuning SkePU: a multi-backend skeleton programming framework for multi-GPU systems
PELAB, Department of Computer and Information Science, Linköping University
Fourth Workshop on Programmability Issues for Multi-Core Computers (MULTIPROG-2011)
@inproceedings{dastgeer2011auto,
title={Auto-tuning SkePU: a multi-backend skeleton programming framework for multi-GPU systems},
author={Dastgeer, U. and Enmyren, J. and Kessler, C.},
booktitle={Fourth Workshop on Programmability Issues for Multi-Core Computers (MULTIPROG-2011)},
pages={132},
year={2011}
}
SkePU is a C++ template library that provides a simple and unified interface for specifying data-parallel computations with the help of skeletons on GPUs using CUDA and OpenCL. The interface is also general enough to support other architectures, and SkePU implements both a sequential CPU and a parallel OpenMP backend. It also supports multi-GPU systems. Currently available skeletons in SkePU include map, reduce, mapreduce, map-with-overlap, maparray, and scan. The performance of SkePU-generated code is comparable to that of hand-written code, even for more complex applications such as ODE solving. In this paper, we discuss initial results from auto-tuning SkePU using an off-line, machine learning approach where we adapt skeletons to a given platform using training data. The prediction mechanism at execution time uses off-line pre-calculated estimates to construct an execution plan for any desired configuration with minimal overhead. The prediction mechanism accurately predicts execution time for repetitive executions and includes a mechanism to predict execution time for user functions of different complexity. The tuning framework covers selection between different backends as well as choosing optimal parameter values for the selected backend. We discuss our approach and initial results obtained for different skeletons (map, mapreduce, reduce).
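The abstract's idea of building an execution plan from off-line estimates can be illustrated with a minimal C++ sketch. It is not SkePU's API or the authors' implementation: the names `Backend`, `Estimate`, `ExecutionPlan`, `build_plan`, and `select_backend` are hypothetical, and the sketch assumes the estimates are simply predicted times per backend at a few sampled input sizes. The plan maps input-size ranges to the backend with the lowest predicted time, so the run-time choice reduces to one ordered-map lookup.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <limits>
#include <map>
#include <vector>

// Hypothetical backend labels for illustration; not SkePU identifiers.
enum class Backend { CPU, OpenMP, CUDA, OpenCL };

// One off-line estimate: predicted time of a backend at a sampled input size.
struct Estimate {
    std::size_t size;
    Backend backend;
    double seconds;
};

// Maps the inclusive upper bound of an input-size range to the chosen backend.
using ExecutionPlan = std::map<std::size_t, Backend>;

// Build the plan off-line: keep the fastest backend per sampled size,
// then merge neighbouring sizes that picked the same backend.
ExecutionPlan build_plan(const std::vector<Estimate>& estimates) {
    std::map<std::size_t, Estimate> best;
    for (const Estimate& e : estimates) {
        auto it = best.find(e.size);
        if (it == best.end()) {
            best.emplace(e.size, e);
        } else if (e.seconds < it->second.seconds) {
            it->second = e;
        }
    }

    ExecutionPlan plan;
    for (auto it = best.begin(); it != best.end(); ++it) {
        auto next = std::next(it);
        if (next == best.end()) {
            // The last range is open-ended: it covers every larger input.
            plan[std::numeric_limits<std::size_t>::max()] = it->second.backend;
        } else if (next->second.backend != it->second.backend) {
            // The backend changes after this size, so close the range here.
            plan[it->first] = it->second.backend;
        }
    }
    return plan;
}

// Run-time lookup: the first range whose upper bound is >= n.
Backend select_backend(const ExecutionPlan& plan, std::size_t n) {
    assert(!plan.empty());  // a plan needs at least one estimate
    return plan.lower_bound(n)->second;
}
```

Under these assumptions, a call such as `select_backend(plan, n)` costs a single logarithmic-time lookup, which is one way to keep the run-time overhead small. A fuller version would also store tuned parameter values per range, which this sketch leaves out.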
August 19, 2011 by hgpu