A Compiler for Throughput Optimization of Graph Algorithms on GPUs

Sreepathi Pai, Keshav Pingali
The University of Texas at Austin, USA
OOPSLA ’16, 2016

@article{pai2016compiler,
   title={A Compiler for Throughput Optimization of Graph Algorithms on GPUs},
   author={Pai, Sreepathi and Pingali, Keshav},
   year={2016}
}

Writing high-performance GPU implementations of graph algorithms can be challenging. In this paper, we argue that three optimizations, called throughput optimizations, are key to high performance for this application class. These optimizations describe a large implementation space, making it unrealistic for programmers to implement them by hand. To address this problem, we have implemented these optimizations in a compiler that produces CUDA code from an intermediate-level program representation called IrGL. Compared to state-of-the-art handwritten CUDA implementations of eight graph applications, code generated by the IrGL compiler is up to 5.95x faster (median 1.4x) for five applications and never more than 30% slower for the others. Throughput optimizations contribute an improvement of up to 4.16x (median 1.4x) to the performance of unoptimized IrGL code.