gpucc: an open-source GPGPU compiler
Google, USA
The 2016 International Symposium on Code Generation and Optimization (CGO 2016), 2016
@inproceedings{wu2016gpucc,
title={gpucc: an open-source GPGPU compiler},
author={Wu, Jingyue and Belevich, Artem and Bendersky, Eli and Heffernan, Mark and Leary, Chris and Pienaar, Jacques and Roune, Bjarke and Springer, Rob and Weng, Xuetian and Hundt, Robert},
booktitle={Proceedings of the 2016 International Symposium on Code Generation and Optimization},
pages={105–116},
year={2016},
organization={ACM}
}
Graphics Processing Units have emerged as powerful accelerators for massively parallel, numerically intensive workloads. The two dominant software models for these devices are NVIDIA’s CUDA and the cross-platform OpenCL standard. Until now, there has not been a fully open-source compiler targeting the CUDA environment, hampering general compiler and architecture research and making deployment difficult in datacenter or supercomputer environments. In this paper, we present gpucc, an LLVM-based, fully open-source, CUDA compatible compiler for high performance computing. It performs various general and CUDA-specific optimizations to generate high performance code. The Clang-based frontend supports modern language features such as those in C++11 and C++14. Compile time is 8% faster than NVIDIA’s toolchain (nvcc) and it reduces compile time by up to 2.4x for pathological compilations (>100 secs), which tend to dominate build times in parallel build environments. Compared to nvcc, gpucc’s runtime performance is on par for several open-source benchmarks, such as Rodinia (0.8% faster), SHOC (0.5% slower), or Tensor (3.7% faster). It outperforms nvcc on internal large-scale end-to-end benchmarks by up to 51.0%, with a geometric mean of 22.9%.
April 6, 2016 by hgpu