On Static Timing Analysis of GPU Kernels
Aalto University, Espoo, Finland
14th International Workshop on Worst-Case Execution Time Analysis (WCET), 2014
@InProceedings{hirvisalo:OASIcs:2014:4603,
author={Vesa Hirvisalo},
title={On Static Timing Analysis of GPU Kernels},
booktitle={14th International Workshop on Worst-Case Execution Time Analysis},
pages={43--52},
series={OpenAccess Series in Informatics (OASIcs)},
ISBN={978-3-939897-69-9},
ISSN={2190-6807},
year={2014},
volume={39},
editor={Heiko Falk},
publisher={Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},
address={Dagstuhl, Germany},
URL={http://drops.dagstuhl.de/opus/volltexte/2014/4603},
annote={Keywords: Parallelism, WCET}
}
We study static timing analysis of programs running on GPU accelerators. Such programs follow a data-parallel programming model that allows massive parallelism on manycore processors. Data-parallel programming and the use of GPUs as accelerators have become widespread in recent years. Timing analysis of programs running on single-core machines is well understood and is also applied in practice. For multicore and manycore machines, however, timing analysis remains a significant problem that has not yet been properly solved. In this paper, we present static timing analysis of GPU kernels based on a method that we call abstract CTA simulation. Cooperative Thread Arrays (CTAs) are the basic execution structure of GPU devices, whose operation proceeds in thread groups called warps. Abstract CTA simulation is based on static analysis of thread divergence in warps and on the abstract scheduling of those warps.
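To illustrate why thread divergence matters for timing bounds, here is a minimal Python sketch of a toy cost model. It is not the paper's abstract CTA simulation. It only assumes the commonly described SIMT behaviour in which a warp that diverges on a branch executes both sides one after the other. The warp width of 32, the cycle costs, the function names, and the pessimistic assumption that warps of a CTA run strictly one after another are all hypothetical choices made for this example.

```python
from typing import List

WARP_SIZE = 32  # assumed warp width (hypothetical parameter for this sketch)


def split_into_warps(predicates: List[bool], warp_size: int = WARP_SIZE) -> List[List[bool]]:
    """Group per-thread branch predicates into consecutive warps."""
    return [predicates[i:i + warp_size] for i in range(0, len(predicates), warp_size)]


def warp_branch_cycles(warp: List[bool], cond_cost: int, then_cost: int, else_cost: int) -> int:
    """Cycles one warp spends on an if/else under the toy model.

    A path is paid for if at least one thread in the warp takes it,
    so a divergent warp pays for both paths.
    """
    cycles = cond_cost
    if any(warp):          # some thread takes the 'then' path
        cycles += then_cost
    if not all(warp):      # some thread takes the 'else' path
        cycles += else_cost
    return cycles


def cta_branch_bound(predicates: List[bool], cond_cost: int, then_cost: int,
                     else_cost: int, warp_size: int = WARP_SIZE) -> int:
    """Pessimistic bound for a CTA: warps are assumed to run back to back."""
    return sum(
        warp_branch_cycles(w, cond_cost, then_cost, else_cost)
        for w in split_into_warps(predicates, warp_size)
    )


# Example: 64 threads. The first warp is uniform (all take 'then'),
# the second warp diverges (even lanes take 'then', odd lanes take 'else').
predicates = [True] * 32 + [lane % 2 == 0 for lane in range(32)]
bound = cta_branch_bound(predicates, cond_cost=2, then_cost=10, else_cost=20)
print(bound)  # 12 cycles for the uniform warp + 32 for the divergent one = 44
```

In this toy model the uniform warp costs 12 cycles and the divergent warp costs 32, which shows how knowing statically which warps can diverge tightens or loosens the bound. A real analysis would also need to model how the hardware scheduler interleaves warps, which this sketch deliberately ignores.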
July 4, 2014 by hgpu