Performance Degradation Analysis of GPU Kernels

Jinpeng Lv, Guodong Li, Alan Humphrey, Ganesh Gopalakrishnan
Electrical & Computer Engineering, University of Utah, Salt Lake City, UT
Computer Aided Verification (CAV 2011 – EC2 Workshop), 2011








Hardware accelerators (currently Graphics Processing Units, or GPUs) are an important component in many existing high-performance computing solutions [5]. Their growth in variety and usage is expected to skyrocket [1] for several reasons. First, GPUs offer impressive energy efficiencies [3]. Second, when properly programmed, they yield impressive speedups by allowing programmers to model their computation around many fine-grained threads whose focus can be rapidly switched during memory stalls. Unfortunately, achieving high memory access efficiency requires well-developed computational thinking: the programmer must decompose the problem domain so that memory accesses follow certain rules. Our work currently addresses the needs of the CUDA [5] approach to programming GPUs. Two important classes of such rules are bank conflict avoidance rules, which pertain to CUDA shared memory, and coalesced access rules, which pertain to global memory. The former require programmers to generate memory addresses from consecutive threads that fall within separate shared memory banks. The latter require programmers to generate memory addresses that permit coalesced fetches from global memory. In previous work [6], we addressed the former to some extent through SMT-based methods. Several other efforts also address bank conflicts [7, 8, 4]. In this work, we address the latter requirement: detecting when coalesced access rules are being violated.
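To make the two classes of rules concrete, the following is a minimal Python sketch that scores the addresses one warp generates. It is not the detection method of the paper. The constants (32 threads per warp, 128-byte global memory segments, 32 shared memory banks of 4-byte words) are assumptions chosen for illustration and differ across GPU generations, and the function names are hypothetical.

```python
# Illustrative model only; all constants below are assumptions, not
# values taken from the paper, and they vary by GPU architecture.
WARP_SIZE = 32        # assumed number of threads issuing addresses together
SEGMENT_BYTES = 128   # assumed size of one aligned global-memory transaction
NUM_BANKS = 32        # assumed number of shared-memory banks
WORD_BYTES = 4        # assumed width of one bank entry


def global_transactions(byte_addresses):
    """Count the distinct aligned segments one warp touches in global memory.

    A count of 1 models a fully coalesced access; larger counts model
    additional memory transactions for the same warp-wide request.
    """
    return len({addr // SEGMENT_BYTES for addr in byte_addresses})


def bank_conflict_degree(byte_addresses):
    """Return the largest number of distinct words requested from one bank.

    A result of 1 models a conflict-free access. Threads that read the
    same word count once, which models a broadcast.
    """
    words_per_bank = {}
    for addr in byte_addresses:
        word = addr // WORD_BYTES
        words_per_bank.setdefault(word % NUM_BANKS, set()).add(word)
    return max(len(words) for words in words_per_bank.values())


def strided_addresses(stride_words, base_word=0):
    """Byte addresses produced when thread i accesses word base + i * stride."""
    return [(base_word + tid * stride_words) * WORD_BYTES
            for tid in range(WARP_SIZE)]
```

Under these assumptions, a unit-stride pattern such as `strided_addresses(1)` falls in a single segment and uses each bank once, whereas `strided_addresses(32)` touches 32 separate segments and sends every thread to the same bank.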
