High Performance Stencil Code Generation with Lift
University of Munster, Munster, Germany
International Symposium on Code Generation and Optimization (CGO), 2018
@inproceedings{hagedorn2018high,
title={High Performance Stencil Code Generation with LIFT},
author={Hagedorn, Bastian and Stoltzfus, Larisa and Steuwer, Michel and Gorlatch, Sergei and Dubach, Christophe},
booktitle={International Symposium on Code Generation and Optimization},
year={2018}
}
Stencil computations are widely used in domains ranging from physical simulations to machine learning. They are embarrassingly parallel and are a good fit for modern hardware such as Graphics Processing Units. Although stencil computations have been extensively studied, optimizing them for increasingly diverse hardware remains challenging. Domain Specific Languages (DSLs) have raised the programming abstraction and offer good performance. However, this places the burden on DSL implementers, who have to write almost full-fledged parallelizing compilers and optimizers. Lift has recently emerged as a promising approach to achieve performance portability and is based on a small set of reusable parallel primitives that DSL or library writers can build upon. Lift’s key novelty is in its encoding of optimizations as a system of extensible rewrite rules which are used to explore the optimization space. However, Lift has mostly focused on linear algebra operations, and it remains to be seen whether this approach is applicable to other domains. This paper demonstrates how complex multidimensional stencil code and optimizations such as tiling are expressible using compositions of simple 1D Lift primitives. By leveraging existing Lift primitives and optimizations, we only require the addition of two primitives and one rewrite rule to do so. Our results show that this approach outperforms existing compiler approaches and hand-tuned codes.
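To make the idea of building multidimensional stencils from 1D building blocks more concrete, here is a minimal sketch in plain Python. It is not Lift code and does not use Lift's API. The abstract does not name the two added primitives, so the helpers `pad` and `slide`, the clamping boundary handling, and the functions `pad_2d`, `slide_2d`, and `stencil_2d` are all assumptions made for illustration. The sketch shows only the composition idea; it does not cover the rewrite rule or tiling.

```python
# Illustrative sketch only: hypothetical helpers, not Lift's actual primitives.

def pad(left, right, xs):
    """Extend a 1D sequence at both ends by repeating the edge element
    (clamping boundary, chosen here as an assumption)."""
    return [xs[0]] * left + list(xs) + [xs[-1]] * right

def slide(size, step, xs):
    """Return all windows of length `size`, moving `step` elements each time."""
    return [xs[i:i + size] for i in range(0, len(xs) - size + 1, step)]

def stencil_1d(f, size, step, left, right, xs):
    """1D stencil: pad the input, form neighbourhoods, map `f` over them."""
    return [f(nbh) for nbh in slide(size, step, pad(left, right, xs))]

def pad_2d(left, right, grid):
    """2D padding built from the 1D `pad`: pad every row, then pad the list of rows."""
    return pad(left, right, [pad(left, right, row) for row in grid])

def slide_2d(size, step, grid):
    """2D neighbourhoods built from the 1D `slide` plus a transpose."""
    row_windows = [slide(size, step, row) for row in grid]  # windows within each row
    row_groups = slide(size, step, row_windows)             # groups of adjacent rows
    # zip(*group) pairs up the windows at the same column position,
    # giving one size-by-size neighbourhood per output element.
    return [[list(nbh) for nbh in zip(*group)] for group in row_groups]

def stencil_2d(f, size, step, left, right, grid):
    """2D stencil expressed purely through the 1D helpers above."""
    nbhs = slide_2d(size, step, pad_2d(left, right, grid))
    return [[f(nbh) for nbh in row] for row in nbhs]

def sum_2d(nbh):
    """Sum all elements of a 2D neighbourhood."""
    return sum(sum(row) for row in nbh)

if __name__ == "__main__":
    print(stencil_1d(sum, 3, 1, 1, 1, [1, 2, 3, 4]))
    print(stencil_2d(sum_2d, 3, 1, 1, 1, [[1, 2], [3, 4]]))
```

The 2D versions contain no new windowing or boundary logic: they only nest and reorder calls to the 1D helpers, which is the property the abstract highlights.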
January 13, 2018 by hgpu