Split tiling for GPUs: automatic parallelization using trapezoidal tiles

Tobias Grosser, Albert Cohen, Paul H J Kelly, J. Ramanujam, P. Sadayappan, Sven Verdoolaege
Ecole Normale Superieure
6th Workshop on General Purpose Processor Using Graphics Processing Units (GPGPU-6), 2013

@inproceedings{grosser2013split,
   title={Split tiling for GPUs: automatic parallelization using trapezoidal tiles},
   author={Grosser, Tobias and Cohen, Albert and Kelly, Paul HJ and Ramanujam, J and Sadayappan, P and Verdoolaege, Sven},
   booktitle={Proceedings of the 6th Workshop on General Purpose Processor Using Graphics Processing Units},
   pages={24--31},
   year={2013},
   organization={ACM}
}


Tiling is a key technique to enhance data reuse. For computations structured as one sequential outer "time" loop enclosing a set of parallel inner loops, tiling only the parallel inner loops may not enable enough data reuse in the cache. Tiling the inner loops along with the outer time loop enhances data locality but may require other transformations like loop skewing that inhibit inter-tile parallelism. One approach to tiling that enhances data locality without inhibiting inter-tile parallelism is split tiling, where tiles are subdivided into a sequence of trapezoidal computation steps. In this paper, we develop an approach to generate split tiled code for GPUs in the PPCG polyhedral code generator. We propose a generic algorithm to calculate index-set splitting that enables us to perform tiling for locality and synchronization avoidance, while simultaneously maintaining parallelism, without the need for skewing or redundant computations. Our algorithm performs split tiling for an arbitrary number of dimensions and without the need to construct any large integer linear program. The method and its implementation are evaluated on standard stencil kernels and compared with a state-of-the-art polyhedral compiler and with a domain-specific stencil compiler, both targeting CUDA GPUs.
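To make the idea of trapezoidal computation steps concrete, here is a minimal sketch of split tiling for a 1D three-point Jacobi stencil. It is not PPCG output and does not reproduce the paper's index-set splitting algorithm; the function names, the time-expanded storage, and the restrictions (tile width at least twice the tile height, interior size divisible by the tile width) are all assumptions made for illustration. Each time band runs in two phases: upright trapezoids that shrink by one cell per side per step, then inverted trapezoids centred on the tile edges that fill the gaps. The tiles within a phase do not depend on each other, so on a GPU each could in principle map to its own thread block.

```python
def jacobi_naive(a0, steps):
    """Reference: 3-point Jacobi sweep with fixed boundary cells."""
    prev = list(a0)
    for _ in range(steps):
        cur = list(prev)
        for i in range(1, len(prev) - 1):
            cur[i] = (prev[i - 1] + prev[i] + prev[i + 1]) / 3.0
        prev = cur
    return prev


def jacobi_split_tiled(a0, steps, width, height):
    """Same computation, ordered as upright then inverted trapezoids.

    width  -- spatial tile width (hypothetical parameter)
    height -- number of time steps per band (hypothetical parameter)
    """
    n = len(a0)
    # Simplifying assumptions of this sketch, not requirements of the paper.
    assert width >= 2 * height, "tiles must be wide enough to shrink"
    assert (n - 2) % width == 0, "interior must split into whole tiles"

    # Time-expanded storage: interior cells start as None so that reading
    # a value before it is computed raises an error instead of passing.
    grid = [list(a0)]
    for _ in range(steps):
        grid.append([a0[0]] + [None] * (n - 2) + [a0[-1]])

    def update(t, lo, hi):
        # Compute cells [lo, hi) at time t, clipped to the interior.
        for i in range(max(lo, 1), min(hi, n - 1)):
            grid[t][i] = (grid[t - 1][i - 1] + grid[t - 1][i]
                          + grid[t - 1][i + 1]) / 3.0

    edges = list(range(1, n, width))  # tile edges: 1, 1+width, ..., n-1

    for t0 in range(0, steps, height):
        h = min(height, steps - t0)  # last band may be shorter
        # Phase 1: upright trapezoids, one per tile, mutually independent.
        for lo in edges[:-1]:
            for s in range(1, h + 1):
                update(t0 + s, lo + (s - 1), lo + width - (s - 1))
        # Phase 2: inverted trapezoids around each tile edge,
        # mutually independent, reading only phase 1 results.
        for b in edges:
            for s in range(2, h + 1):
                update(t0 + s, b - (s - 1), b + (s - 1))
    return grid[steps]
```

Because every cell is computed exactly once and with the same expression as in the reference, no skewing and no redundant computation is involved; only the order of evaluation changes. Calling `jacobi_split_tiled(a0, 7, 8, 3)` on a 34-element list, for example, should match `jacobi_naive(a0, 7)`.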
