Program optimization space pruning for a multithreaded GPU
University of Illinois at Urbana-Champaign, Urbana, IL, USA
In CGO ’08: Proceedings of the 6th annual IEEE/ACM international symposium on Code generation and optimization (2008), pp. 195-204.
@conference{ryoo2008program,
title={Program optimization space pruning for a multithreaded {GPU}},
author={Ryoo, S. and Rodrigues, C.I. and Stone, S.S. and Baghsorkhi, S.S. and Ueng, S.Z. and Stratton, J.A. and Hwu, W.W.},
booktitle={Proceedings of the sixth annual IEEE/ACM international symposium on Code generation and optimization},
pages={195--204},
year={2008},
organization={ACM}
}
Program optimization for highly-parallel systems has historically been considered an art, with experts doing much of the performance tuning by hand. With the introduction of inexpensive, single-chip, massively parallel platforms, more developers who lack the substantial experience and knowledge needed to maximize performance will be creating highly-parallel applications for these platforms. This creates a need for more structured optimization methods with means to estimate their performance effects. Furthermore, these methods need to be understandable by most programmers. This paper shows the complexity involved in optimizing applications for one such system and presents one relatively simple methodology for reducing the workload involved in the optimization process.
November 1, 2010 by hgpu