Hierarchical overlapped tiling
University of Illinois at Urbana-Champaign
Proceedings of the Tenth International Symposium on Code Generation and Optimization (CGO ’12), 2012
@inproceedings{zhou2012hierarchical,
title={Hierarchical overlapped tiling},
author={Zhou, X. and Giacalone, J.P. and Garzar{\'a}n, M.J. and Kuhn, R.H. and Ni, Y. and Padua, D.},
booktitle={Proceedings of the Tenth International Symposium on Code Generation and Optimization},
pages={207--218},
year={2012},
organization={Citeseer}
}
This paper introduces hierarchical overlapped tiling, a transformation that applies loop tiling and fusion to conventional loops. Overlapped tiling is a useful transformation to reduce communication overhead, but it may also generate a significant amount of redundant computation. Hierarchical overlapped tiling performs overlapped tiling hierarchically to balance communication overhead and redundant computation, and thus has the potential to provide better performance. In this paper, we describe the hierarchical overlapped tiling optimization and its implementation in an OpenCL compiler. We also evaluate the effectiveness of this optimization using 8 programs that implement different forms of stencil computation. Our results show that hierarchical overlapped tiling achieves an average 37% speedup over traditional tiling on a 32-core workstation.
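The trade-off the abstract describes can be illustrated with a minimal single-level sketch. It is not the authors' OpenCL implementation and it does not include the hierarchical scheme. The 1D three-point averaging stencil, the fixed end cells, and the names `stencil_step`, `run_untiled`, and `run_overlapped` are all assumptions made for illustration. Each tile loads a halo as wide as the number of fused time steps, advances all of those steps locally with no communication, and writes back only its core, so the halo cells are recomputed by neighbouring tiles.

```python
def stencil_step(a):
    """One 3-point averaging step; the two end cells are held fixed."""
    b = list(a)
    for i in range(1, len(a) - 1):
        b[i] = (a[i - 1] + a[i] + a[i + 1]) / 3.0
    return b


def run_untiled(a, steps):
    """Reference: apply the stencil to the whole array, one step at a time."""
    for _ in range(steps):
        a = stencil_step(a)
    return a


def run_overlapped(a, steps, tile):
    """Overlapped tiling sketch (hypothetical helper, single level only).

    Each tile reads `steps` extra cells on each side, runs all time steps
    locally, and stores only its core. Returns the result and the number
    of cell updates performed, which includes the redundant halo work.
    """
    n = len(a)
    out = list(a)
    updates = 0
    for start in range(0, n, tile):
        end = min(start + tile, n)
        lo = max(0, start - steps)   # left halo, clipped at the array edge
        hi = min(n, end + steps)     # right halo, clipped at the array edge
        local = a[lo:hi]
        for _ in range(steps):
            local = stencil_step(local)
            updates += max(0, len(local) - 2)
        # Cells within `steps` of an interior tile edge are stale; the core is not.
        out[start:end] = local[start - lo:end - lo]
    return out, updates


if __name__ == "__main__":
    data = [float((i * 7) % 11) for i in range(40)]
    reference = run_untiled(data, 3)
    tiled, work = run_overlapped(data, 3, 8)
    print(tiled == reference, work, 3 * (len(data) - 2))
```

In this sketch the tiled result matches the untiled one while performing more cell updates, and the extra work grows with the halo width. That growth is the redundant computation that, according to the abstract, the hierarchical variant balances against communication overhead.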
June 23, 2012 by hgpu