Automatic Performance Tuning of Stencil Computations on Graphics Processing Units
University of Toronto
University of Toronto, 2015
@mastersthesis{garvey2015automatic,
title={Automatic Performance Tuning of Stencil Computations on Graphics Processing Units},
author={Garvey, Joseph D.},
school={University of Toronto},
year={2015}
}
This work focuses on the automatic performance tuning of stencil computations on Graphics Processing Units (GPUs). It presents a strategy that first uses machine learning to determine the best way to use the GPU memory, then applies a heuristic that divides the remaining optimizations into groups and exhaustively explores one group at a time. The strategy is evaluated using 104 synthetically generated OpenCL stencil kernels on an Nvidia GTX Titan GPU, and is assessed in terms of both the number of configurations explored during auto-tuning and the quality of the best configuration obtained. Two alternative heuristics that use different groupings of the optimizations are also explored. Relative to a random sampling of the space and to an expert search, the strategy reduces the number of configurations explored by up to 71% and 84%, respectively, while finding configurations that perform 12% and 6% better, respectively.
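The group-at-a-time exploration described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the parameter names (`block_x`, `block_y`, `unroll`, `coarsen`), the grouping, and the `toy_cost` function are all hypothetical stand-ins, and the machine-learning step that chooses the memory configuration is omitted. In a real tuner, the cost would be a measured kernel run time rather than a formula.

```python
from itertools import product


def grouped_search(groups, defaults, cost):
    """Exhaustively tune one group of parameters at a time.

    Parameters outside the current group keep their best values so far
    (initially the defaults). Returns the final configuration and the
    number of configurations evaluated.
    """
    best = dict(defaults)
    evaluated = 0
    for group in groups:
        names = list(group)
        best_cost = None
        best_candidate = None
        # Try every combination of values within this group only.
        for values in product(*(group[name] for name in names)):
            candidate = {**best, **dict(zip(names, values))}
            c = cost(candidate)
            evaluated += 1
            if best_cost is None or c < best_cost:
                best_cost, best_candidate = c, candidate
        best = best_candidate
    return best, evaluated


def toy_cost(cfg):
    """Synthetic cost (lower is better); a stand-in for a timed kernel run."""
    return (abs(cfg["block_x"] * cfg["block_y"] - 64)
            + abs(cfg["unroll"] - 2)
            + abs(cfg["coarsen"] - 4))


# Hypothetical optimization parameters, split into two groups.
groups = [
    {"block_x": [8, 16, 32], "block_y": [1, 2, 4, 8]},
    {"unroll": [1, 2, 4], "coarsen": [1, 2, 4]},
]
defaults = {"block_x": 8, "block_y": 1, "unroll": 1, "coarsen": 1}

best, evaluated = grouped_search(groups, defaults, toy_cost)
full_space = 1
for group in groups:
    for values in group.values():
        full_space *= len(values)
print(best, evaluated, full_space)
```

In this toy setup the grouped search evaluates the sum of the group sizes (12 + 9 configurations) rather than their product (108). The saving comes at a price: interactions between parameters in different groups are not explored, which is presumably why the choice of grouping matters and why the abstract compares alternative groupings.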