Hybrid strategy for stencil computations on the APU
UPMC Univ Paris 06 and CNRS UMR 7606, LIP6, 4 place Jussieu, F-75252, Paris cedex 05, France
1st International Workshop on High-Performance Stencil Computations (HiStencils ’14), 2014
@article{eberhart2014hybrid,
title={Hybrid strategy for stencil computations on the APU},
author={Eberhart, Pac{\^o}me and Said, Issam and Fortin, Pierre and Calandra, Henri},
year={2014}
}
Stencil computations are very regular and well suited to GPU execution. However, the PCI-E bus that connects a discrete GPU to system memory has a relatively low bandwidth compared to the GPU's compute power. The AMD APU architecture integrates a CPU and a GPU on the same chip with memory shared between them, which makes it possible to bypass the PCI-E bus. In this paper, we devise a strategy for hybrid deployments on the CPU and the integrated GPU of the APU. For the task-parallel deployment, we rely on the CPU to process the diverging parts of the application. For the data-parallel deployment, we balance the workloads of the CPU and the GPU to achieve the best performance. Our strategy is tested on different stencil computations, and we achieve a 20 to 30% gain in performance in the best cases.
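The data-parallel idea of balancing one stencil sweep between two devices can be illustrated with a minimal sketch. This is not the authors' implementation: the 3-point averaging stencil, the throughput-proportional split, and every name below (`stencil_1d`, `balanced_fraction`, `split_point`, `hybrid_step`) are assumptions made for illustration, and plain Python functions stand in for the CPU and the integrated GPU.

```python
def stencil_1d(u, lo, hi):
    """Apply a 3-point averaging stencil to points lo..hi-1 of u."""
    return [(u[i - 1] + u[i] + u[i + 1]) / 3.0 for i in range(lo, hi)]


def balanced_fraction(gpu_rate, cpu_rate):
    """Share of the work given to the GPU so both devices finish together.

    Assumption: each rate is a measured throughput (points per second).
    """
    return gpu_rate / (gpu_rate + cpu_rate)


def split_point(n, gpu_fraction):
    """Index where the interior [1, n-1) is cut; the first part goes to the GPU."""
    interior = n - 2
    return 1 + round(interior * gpu_fraction)


def hybrid_step(u, gpu_fraction):
    """One stencil sweep with the interior split into two partitions.

    Both partitions read the same array, so the neighbours across the cut
    are visible to each side without a copy, which mimics shared memory.
    """
    n = len(u)
    cut = split_point(n, gpu_fraction)
    gpu_part = stencil_1d(u, 1, cut)      # stand-in for the integrated GPU
    cpu_part = stencil_1d(u, cut, n - 1)  # stand-in for the CPU cores
    return [u[0]] + gpu_part + cpu_part + [u[-1]]  # boundary points kept fixed
```

In this sketch the split only changes which stand-in device computes each point, so the result is the same for any value of `gpu_fraction`. On real hardware the fraction would be tuned from measured throughputs.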
January 26, 2014 by hgpu