Jit4OpenCL: a compiler from Python to OpenCL
Department of Computing Science, University of Alberta
University of Alberta, 2010
@article{xunhao2010jit4opencl,
title={Jit4OpenCL: a compiler from Python to OpenCL},
author={Xunhao, L.},
year={2010}
}
Heterogeneous computing platforms that use GPUs and CPUs in tandem for computation have become an important choice to build low-cost high-performance computing platforms. The computing ability of modern GPUs surpasses that of CPUs can offer for certain classes of applications. GPUs can deliver several Tera-Flops in peak performance. However, programmers must adopt a more complicated and more difficult new programming paradigm. To alleviate the burden of programming for heterogeneous systems, Garg and Amaral developed a Python compiling framework that combines an ahead-of-time compiler called unPython with a just-in-time compiler called jit4GPU. This compilation framework generates code for systems with AMD GPUs. We extend the framework to retarget it to generate OpenCL code, an industry standard that is implemented for most GPUs. Therefore, by generating OpenCL code, this new compiler, called jit4OpenCL, enables the execution of the same program in a wider selection of heterogeneous platforms. To further improve the target-code performance on nVidia GPUs, we developed an array-access analysis tool that helps to exploit the data reusability by utilizing the shared (local) memory space hierarchy in OpenCL. The thesis presents an experimental performance evaluation indicating that, in comparison with jit4GPU, jit4OpenCL has performance degradation because of the current performance of implementations of OpenCL, and also because of the extra time needed for the additional just-in-time compilation. However, the portable code generated by jit4OpenCL still have performance gains in some applications compared to highly optimized CPU code.
October 29, 2011 by hgpu