A Compiler for Throughput Optimization of Graph Algorithms on GPUs

Sreepathi Pai, Keshav Pingali
The University of Texas at Austin, USA
OOPSLA ’16, 2016








Writing high-performance GPU implementations of graph algorithms can be challenging. In this paper, we argue that three optimizations, called throughput optimizations, are key to high performance for this application class. These optimizations describe a large implementation space, making it unrealistic for programmers to implement them by hand. To address this problem, we have implemented these optimizations in a compiler that produces CUDA code from an intermediate-level program representation called IrGL. Compared to state-of-the-art handwritten CUDA implementations of eight graph applications, code generated by the IrGL compiler is up to 5.95x faster (median 1.4x) for five applications and never more than 30% slower for the others. Throughput optimizations contribute an improvement of up to 4.16x (median 1.4x) to the performance of unoptimized IrGL code.
