Block-Size Independence for GPU Programs
University of Pennsylvania
25th Static Analysis Symposium (SAS), 2018
@article{alur2018block,
title={Block-Size Independence for GPU Programs},
author={Alur, Rajeev and Devietti, Joseph and Singhania, Nimit},
year={2018}
}
Optimizing GPU programs by tuning execution parameters is essential to realizing the full performance potential of GPU hardware. However, many of these optimizations do not ensure correctness, and subtle errors can enter while optimizing a GPU program. Further, the lack of formal models and the presence of non-trivial transformations prevent verification of optimizations. In this work, we verify the transformations involved in tuning the block-size execution parameter. First, we present a formal programming and execution model for GPUs, and then formalize block-size independence of GPU programs, which ensures that tuning the block size preserves program semantics. Next, we present an inter-procedural analysis to verify block-size independence for synchronization-free GPU programs. Finally, we evaluate the analysis on the Nvidia CUDA SDK samples, where 35 global kernels are verified to be block-size independent.
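To make the notion of block-size independence concrete, the following is a minimal Python sketch that simulates a one-dimensional, synchronization-free kernel launch under several block sizes and compares the outputs. It is an illustration only, not the paper's method: the paper verifies the property statically, whereas this sketch merely tests it dynamically on one input. All names here (`launch`, `scale_by_global_id`, `scale_by_local_id`, `is_block_size_independent`) are hypothetical, and the sketch assumes that the total number of threads stays fixed while the block size varies, which is one informal reading of the property.

```python
def launch(kernel, n, block_size):
    """Simulate a 1-D launch of n threads split into blocks of block_size.

    Threads run sequentially here; this is only a fair stand-in for kernels
    that are synchronization-free and race-free (an assumption of this sketch).
    """
    assert n % block_size == 0, "sketch assumes block_size divides n"
    out = [0] * n
    grid_size = n // block_size  # number of blocks; total threads stays n
    for block_idx in range(grid_size):
        for thread_idx in range(block_size):
            kernel(out, block_idx, block_size, thread_idx)
    return out


def scale_by_global_id(out, block_idx, block_dim, thread_idx):
    # Depends only on the global thread id, so regrouping threads into
    # different block sizes does not change the result.
    gid = block_idx * block_dim + thread_idx
    out[gid] = 2 * gid


def scale_by_local_id(out, block_idx, block_dim, thread_idx):
    # Writes a value derived from the thread's position inside its block,
    # so the result changes when the block size changes.
    gid = block_idx * block_dim + thread_idx
    out[gid] = 2 * thread_idx


def is_block_size_independent(kernel, n, block_sizes):
    """Dynamic check: do all tried block sizes produce the same output?"""
    results = [launch(kernel, n, b) for b in block_sizes]
    return all(r == results[0] for r in results)


if __name__ == "__main__":
    sizes = [2, 4, 8]
    print(is_block_size_independent(scale_by_global_id, 16, sizes))
    print(is_block_size_independent(scale_by_local_id, 16, sizes))
```

In this sketch the first kernel passes the check and the second fails it, because the second kernel's output depends on how threads are grouped into blocks. A passing dynamic check on a few block sizes does not prove independence in general, which is the gap a static analysis such as the one described in the abstract is meant to close.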
July 28, 2018 by hgpu