Loop Transformation Recipes for Code Generation and Auto-Tuning
School of Computing, University of Utah; Salt Lake City, UT
In Languages and Compilers for Parallel Computing, Vol. 5898 (2010), pp. 50-64
@article{hall2010loop,
title={Loop transformation recipes for code generation and auto-tuning},
author={Hall, M. and Chame, J. and Chen, C. and Shin, J. and Rudy, G. and Khan, M.},
journal={Languages and Compilers for Parallel Computing},
volume={5898},
pages={50--64},
year={2010},
publisher={Springer}
}
In this paper, we describe transformation recipes, which provide a high-level interface to the code transformation and code generation capability of a compiler. These recipes can be generated by compiler decision algorithms or savvy software developers. This interface is part of an auto-tuning framework that explores a set of different implementations of the same computation and automatically selects the best-performing implementation. Along with the original computation, a transformation recipe specifies a range of implementations of the computation resulting from composing a set of high-level code transformations. In our system, an underlying polyhedral framework coupled with transformation algorithms takes this set of transformations, composes them and automatically generates correct code. We first describe an abstract interface for transformation recipes, which we propose to facilitate interoperability with other transformation frameworks. We then focus on the specific transformation recipe interface used in our compiler and present performance results on its application to kernel and library tuning and tuning of key computations in high-end applications. We also show how this framework can be used to generate and auto-tune parallel OpenMP or CUDA code from a high-level specification.
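The auto-tuning idea in the abstract (generate several implementations of one computation, then keep the fastest) can be illustrated with a minimal Python sketch. It is not the recipe interface from the paper: the names `matmul_tiled` and `autotune`, the choice of loop tiling as the only transformation, and the candidate tile sizes are all assumptions made for illustration, and the variants are written by hand rather than generated by a polyhedral framework.

```python
import time


def matmul_tiled(a, b, n, tile):
    """Multiply two n x n matrices (lists of lists) using square tiles of size `tile`.

    Hypothetical stand-in for one generated variant; `tile` is the tunable parameter.
    """
    c = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, tile):
        for kk in range(0, n, tile):
            for jj in range(0, n, tile):
                # Work on one tile; min() handles sizes that do not divide n.
                for i in range(ii, min(ii + tile, n)):
                    row_c = c[i]
                    row_a = a[i]
                    for k in range(kk, min(kk + tile, n)):
                        aik = row_a[k]
                        row_b = b[k]
                        for j in range(jj, min(jj + tile, n)):
                            row_c[j] += aik * row_b[j]
    return c


def autotune(a, b, n, tile_sizes):
    """Time each tile size once and return (fastest_tile, timings_by_tile)."""
    timings = {}
    for tile in tile_sizes:
        start = time.perf_counter()
        matmul_tiled(a, b, n, tile)
        timings[tile] = time.perf_counter() - start
    best = min(timings, key=timings.get)
    return best, timings


n = 32
a = [[float(i + j) for j in range(n)] for i in range(n)]
b = [[float(i - j) for j in range(n)] for i in range(n)]
best_tile, timings = autotune(a, b, n, tile_sizes=[4, 8, 16, 32])
print("best tile size:", best_tile)
```

Which tile size wins depends on the machine and the run, so no expected output is given. A real tuner would typically repeat each measurement and search a much larger space of composed transformations.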
February 12, 2011 by hgpu