Microbenchmarks for GPU characteristics: the occupancy roofline and the pipeline model
Vrije Universiteit Brussel (VUB), Industrial Sciences (INDI) Dept., Pleinlaan 2, B-1050 Brussels, Belgium
24th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP), 2016
@inproceedings{lemeire2016microbenchmarks,
title={Microbenchmarks for GPU Characteristics: The Occupancy Roofline and the Pipeline Model},
author={Lemeire, Jan and Cornelis, Jan G and Segers, Laurent},
booktitle={2016 24th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP)},
pages={456--463},
year={2016},
organization={IEEE}
}
In this paper we present OpenCL microbenchmarks that measure the most important performance characteristics of GPUs. Each microbenchmark isolates an individual characteristic that influences performance. First, performance, in operations or bytes per second, is measured as a function of occupancy, which yields an occupancy roofline curve. The curve shows at which occupancy level peak performance is reached. Second, by considering the cycles per instruction of each compute unit, we measure the two most important characteristics of an instruction: its issue latency and its completion latency. This is based on modeling each compute unit as one pipeline for computations and one pipeline for memory accesses. We also measure some more specific characteristics: the influence of independent instructions within a kernel and of thread divergence. We argue that these are the most important characteristics for understanding and predicting performance. Results for several Nvidia and AMD GPUs are provided. A free Java application containing the microbenchmarks is available online.
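To make the link between the pipeline model and the occupancy roofline concrete, the following is a minimal Python sketch under a simplifying assumption that is not stated in the abstract: each warp executes a chain of dependent instructions, so a compute unit needs `completion_latency / concurrent_warps` cycles per instruction until the pipeline is full, after which it is limited by `issue_latency`. The function names, parameters, and latency values are hypothetical and chosen only for illustration; they are not measurements from the paper.

```python
import math


def cycles_per_instruction(concurrent_warps, issue_latency, completion_latency):
    """Cycles per instruction of one compute unit.

    Assumption (illustrative, not from the paper): every warp runs a chain
    of dependent instructions, so a warp can only issue its next instruction
    once the previous one has completed.
    """
    if concurrent_warps < 1:
        raise ValueError("at least one warp is required")
    # With few warps the pipeline waits on completion; with many it is issue-bound.
    return max(issue_latency, completion_latency / concurrent_warps)


def saturation_occupancy(issue_latency, completion_latency):
    """Smallest number of concurrent warps that keeps the pipeline full."""
    return math.ceil(completion_latency / issue_latency)


def occupancy_roofline(max_warps, issue_latency, completion_latency):
    """Return (warps, instructions per cycle) pairs for 1..max_warps."""
    return [
        (w, 1.0 / cycles_per_instruction(w, issue_latency, completion_latency))
        for w in range(1, max_warps + 1)
    ]
```

For example, with a hypothetical issue latency of 2 cycles and completion latency of 16 cycles, `saturation_occupancy(2, 16)` gives 8 warps: under this model throughput grows linearly up to that occupancy and stays flat beyond it, which is the shape of a roofline curve.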
May 9, 2016 by hgpu