SkePU: a multi-backend skeleton programming library for multi-GPU systems
PELAB, Department of Computer and Information Science, Linkoping University, S-58183 Linkoping, Sweden
HLPP ’10 Proceedings of the fourth international workshop on High-level parallel programming and applications, 2010
@inproceedings{enmyren2010skepu,
title={SkePU: a multi-backend skeleton programming library for multi-GPU systems},
author={Enmyren, J. and Kessler, C.W.},
booktitle={Proceedings of the fourth international workshop on High-level parallel programming and applications},
pages={5--14},
year={2010},
organization={ACM}
}
We present SkePU, a C++ template library which provides a simple and unified interface for specifying data-parallel computations with the help of skeletons on GPUs using CUDA and OpenCL. The interface is also general enough to support other architectures, and SkePU implements both a sequential CPU and a parallel OpenMP backend. It also supports multi-GPU systems. Copying data between the host and the GPU device memory can be a performance bottleneck. A key technique in SkePU is the implementation of lazy memory copying in the container type used to represent skeleton operands, which avoids unnecessary memory transfers. We evaluate SkePU with small benchmarks and a larger application, a Runge-Kutta ODE solver. The results show that a skeleton approach to GPU programming is viable, especially when the computational burden is large compared to memory I/O (the lazy memory copying can help to achieve this). They also show that utilizing several GPUs has potential for performance gains. We see that SkePU offers good performance on a more complex and realistic task such as ODE solving, with run times up to 10 times faster when using SkePU with a GPU backend compared to a sequential solver running on a fast CPU.
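The lazy memory copying mentioned in the abstract can be illustrated with a minimal C++ sketch. This is not SkePU's container or API: `LazyVector`, `device_scale`, and the transfer counters are hypothetical names invented for illustration, and "device" memory is simulated with a second host buffer so the sketch runs without a GPU. The assumed idea is that the container tracks which copy of the data is current and transfers only when the other side actually needs it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration only: LazyVector is NOT SkePU's container type.
// "Device" memory is simulated by a second host buffer, and the counters
// stand in for real host<->device transfers.
class LazyVector {
public:
    explicit LazyVector(std::size_t n, float value = 0.0f)
        : host_(n, value), device_(n), host_valid_(true), device_valid_(false) {}

    std::size_t size() const { return host_.size(); }

    // Host-side read: download only if the newest data lives on the device.
    float get(std::size_t i) {
        assert(i < host_.size());
        sync_to_host();
        return host_[i];
    }

    // Host-side write: afterwards the device copy is stale.
    void set(std::size_t i, float v) {
        assert(i < host_.size());
        sync_to_host();
        host_[i] = v;
        device_valid_ = false;
    }

    // Used by a device-side operation that reads this operand.
    const std::vector<float>& device_read() {
        sync_to_device();
        return device_;
    }

    // Used by a device-side operation that overwrites the whole operand:
    // no upload is needed, and the host copy becomes stale.
    std::vector<float>& device_overwrite() {
        device_valid_ = true;
        host_valid_ = false;
        return device_;
    }

    int uploads() const { return uploads_; }
    int downloads() const { return downloads_; }

private:
    void sync_to_device() {
        if (!device_valid_) { device_ = host_; device_valid_ = true; ++uploads_; }
    }
    void sync_to_host() {
        if (!host_valid_) { host_ = device_; host_valid_ = true; ++downloads_; }
    }

    std::vector<float> host_, device_;
    bool host_valid_, device_valid_;
    int uploads_ = 0, downloads_ = 0;
};

// Stand-in for a map-style operation on the simulated device:
// out[i] = in[i] * factor.
void device_scale(LazyVector& in, LazyVector& out, float factor) {
    assert(in.size() == out.size());
    const std::vector<float>& src = in.device_read();
    std::vector<float>& dst = out.device_overwrite();
    for (std::size_t i = 0; i < src.size(); ++i) dst[i] = src[i] * factor;
}
```

In this sketch, chaining `device_scale(a, b, 2.0f)` and `device_scale(b, c, 3.0f)` uploads only `a`. The intermediate `b` never leaves the simulated device, and `c` is downloaded once, on the first host-side `get`.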
August 18, 2011 by hgpu