GPA: A GPU Performance Advisor Based on Instruction Sampling

Keren Zhou, Xiaozhu Meng, Ryuichi Sai, John Mellor-Crummey
Rice University, Houston, Texas, United States
arXiv:2009.04061 [cs.PF] (9 Sep 2020)

@misc{zhou2020gpa,
   title={GPA: A GPU Performance Advisor Based on Instruction Sampling},
   author={Keren Zhou and Xiaozhu Meng and Ryuichi Sai and John Mellor-Crummey},
   year={2020},
   eprint={2009.04061},
   archivePrefix={arXiv},
   primaryClass={cs.PF}
}


Developing efficient GPU kernels can be difficult because of the complexity of GPU architectures and programming models. Existing performance tools only provide coarse-grained suggestions at the kernel level, if any. In this paper, we describe GPA, a performance advisor for NVIDIA GPUs that suggests potential code optimization opportunities at a hierarchy of levels, including individual lines, loops, and functions. To relieve users of the burden of interpreting performance counters and analyzing bottlenecks, GPA uses data flow analysis to approximately attribute measured instruction stalls to their root causes and uses information about a program’s structure and the GPU to match inefficiency patterns with suggestions for optimization. To quantify each suggestion’s potential benefits, we developed PC sampling-based performance models to estimate its speedup. Our experiments with benchmarks and applications show that GPA provides an insightful report to guide performance optimization. Using GPA, we obtained speedups on a Volta V100 GPU ranging from 1.03x to 3.86x, with a geometric mean of 1.22x.
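
To make the idea of estimating a suggestion's benefit from PC samples concrete, here is a minimal sketch. It is not GPA's actual performance model, which the abstract does not detail. It assumes, as a simplification, that running time is proportional to the number of PC samples, so removing the stall samples attributed to one cause gives an Amdahl-style upper bound on speedup. The function names `estimate_speedup` and `geometric_mean` are hypothetical and chosen only for illustration.

```python
from math import prod


def estimate_speedup(total_samples, removable_stall_samples):
    """Upper-bound speedup if the given stall samples were eliminated.

    Assumption (not from the paper): running time is proportional to the
    number of PC samples, so removing samples shrinks time by the same ratio.
    """
    if total_samples <= 0:
        raise ValueError("total_samples must be positive")
    if not 0 <= removable_stall_samples < total_samples:
        raise ValueError("removable_stall_samples must be in [0, total_samples)")
    return total_samples / (total_samples - removable_stall_samples)


def geometric_mean(speedups):
    """Geometric mean of per-benchmark speedups, the summary statistic
    the abstract reports across its experiments."""
    if not speedups:
        raise ValueError("speedups must not be empty")
    return prod(speedups) ** (1.0 / len(speedups))
```

Under this assumption, a kernel with 1000 samples, 200 of which are stalls traced to a removable cause, has an estimated speedup of at most 1000 / 800 = 1.25x. A real model would also need to account for latency hiding and overlap between warps, which this sketch ignores.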
