Optimising Purely Functional GPU Programs
University of New South Wales, Australia
University of New South Wales, 2013
@article{mcdonell2013optimising,
title={Optimising Purely Functional GPU Programs},
author={McDonell, Trevor L. and Chakravarty, Manuel M. T. and Keller, Gabriele and Lippmeier, Ben},
year={2013}
}
Purely functional, embedded array programs are a good match for SIMD hardware, such as GPUs. However, the naive compilation of such programs quickly leads to both code explosion and an excessive use of intermediate data structures. The resulting slowdown is not acceptable on target hardware that is usually chosen to achieve high performance. In this paper, we present two optimisation techniques, sharing recovery and array fusion, that tackle code explosion and eliminate superfluous intermediate structures. Both techniques are well known from other contexts, but they present unique challenges for an embedded language compiled for execution on a GPU. We present novel methods for implementing sharing recovery and array fusion, and demonstrate their effectiveness on a set of benchmarks.
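As a rough illustration of the two problems named in the abstract, the following Python sketch may help. It is not the paper's implementation and does not target a GPU. All names (`Delayed`, `amap`, `materialise`, `Node`, `tree_size`, `dag_size`) are hypothetical and chosen only for this example. The first half shows fusion by representing an array as a size plus an index function, so chained maps allocate no intermediate array. The second half shows how an embedded expression with a reused subterm explodes when traversed as a tree, and how tracking node identity recovers the sharing.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Delayed:
    """A delayed array: a length plus a function from index to element."""
    size: int
    index: Callable[[int], float]


def from_list(xs: List[float]) -> Delayed:
    return Delayed(len(xs), lambda i: xs[i])


def amap(f: Callable[[float], float], arr: Delayed) -> Delayed:
    # Compose f with the index function; no intermediate array is allocated.
    return Delayed(arr.size, lambda i: f(arr.index(i)))


def materialise(arr: Delayed) -> List[float]:
    # The only place where elements are actually computed and stored.
    return [arr.index(i) for i in range(arr.size)]


@dataclass(frozen=True, eq=False)
class Node:
    """A node of an embedded expression; eq=False keeps identity semantics."""
    op: str
    args: tuple = ()


def tree_size(n: Node) -> int:
    # Naive traversal: a shared subterm is counted once per use.
    return 1 + sum(tree_size(a) for a in n.args)


def dag_size(n: Node, seen=None) -> int:
    # Identity-based traversal: each distinct node object is counted once.
    seen = set() if seen is None else seen
    if id(n) in seen:
        return 0
    seen.add(id(n))
    return 1 + sum(dag_size(a, seen) for a in n.args)


def doubling_chain(depth: int) -> Node:
    # Build x + x repeatedly, reusing the same node object for both arguments.
    x = Node("input")
    for _ in range(depth):
        x = Node("add", (x, x))
    return x
```

In this sketch, `materialise(amap(g, amap(f, arr)))` runs a single loop over the elements, and `tree_size(doubling_chain(10))` grows exponentially with the depth while `dag_size` grows linearly. That is the kind of gap sharing recovery is meant to close.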
April 4, 2013 by hgpu