Improving OpenCL Performance by Specializing Compiler Phase Selection and Ordering

Ricardo Nobre, Luis Reis, Joao M. P. Cardoso
Faculty of Engineering, University of Porto, INESC TEC, Porto, Portugal
arXiv:1810.10496 [cs.PF], (24 Oct 2018)

@article{nobre2018improving,
   title={Improving OpenCL Performance by Specializing Compiler Phase Selection and Ordering},
   author={Nobre, Ricardo and Reis, Luis and Cardoso, Joao M. P.},
   year={2018},
   month={oct},
   archivePrefix={arXiv},
   primaryClass={cs.PF}
}

Automatic compiler phase selection and ordering has traditionally focused on CPUs and, to a lesser extent, FPGAs. We present experiments on specializing compiler phase orders for OpenCL kernels targeting a GPU. We use iterative exploration to specialize LLVM phase orders for 15 OpenCL benchmarks on an NVIDIA GPU. We analyze the NVIDIA PTX code generated for the various versions to identify the main causes of the most significant improvements, and we present a set of experiments that demonstrate the importance of using specific phase orders. Using specialized compiler phase orders, we achieved geometric mean improvements of 1.54x (up to 5.48x) over PTX generated by the NVIDIA CUDA compiler from CUDA versions of the same kernels, and of 1.65x (up to 5.7x) over execution of the OpenCL kernels compiled from source with the NVIDIA OpenCL driver. We also evaluate the use of code features of the OpenCL kernels. More specifically, we evaluate an approach that achieves geometric mean improvements of 1.49x and 1.56x over the same OpenCL baseline by using the compiler sequences of the 1 or 3 most similar benchmarks, respectively.
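
As a rough illustration of what iterative exploration of phase orders and a geometric mean summary can look like, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the search strategy, so plain random sampling is used as a stand-in, and the pass list, `random_search`, `toy_measure`, and `geometric_mean` are all hypothetical names introduced for this example. In a real setup the `measure` callback would compile the kernel with the given pass sequence and time it on the GPU.

```python
import math
import random

# Hypothetical pass names, used only for illustration.
PASSES = ["licm", "gvn", "instcombine", "loop-unroll", "sroa", "simplifycfg"]


def random_search(measure, iterations=200, max_len=8, seed=0):
    """Sample random pass sequences and keep the fastest one seen.

    `measure` maps a pass sequence (list of str) to a runtime; lower is better.
    The empty sequence serves as the starting baseline.
    """
    rng = random.Random(seed)
    best_seq, best_time = [], measure([])
    for _ in range(iterations):
        length = rng.randint(1, max_len)
        seq = [rng.choice(PASSES) for _ in range(length)]
        t = measure(seq)
        if t < best_time:
            best_seq, best_time = seq, t
    return best_seq, best_time


def geometric_mean(values):
    """Geometric mean of positive values, e.g. per-benchmark speedups."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def toy_measure(seq):
    """Stand-in cost model (an assumption, not measured data).

    Rewards running "licm" before "gvn" and charges a small cost per pass,
    so that both selection and ordering affect the result.
    """
    t = 10.0
    if "licm" in seq and "gvn" in seq and seq.index("licm") < seq.index("gvn"):
        t -= 4.0
    return t + 0.1 * len(seq)


if __name__ == "__main__":
    seq, t = random_search(toy_measure)
    print("best sequence:", seq, "time:", t)
    # Speedup of the found sequence over the empty-sequence baseline.
    print("speedup:", toy_measure([]) / t)
```

Per-benchmark speedups obtained this way would then be summarized with `geometric_mean`, which is the kind of aggregate the abstract reports.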
