OCLoptimizer: An Iterative Optimization Tool for OpenCL
Computer Architecture Group, University of A Coruña, Spain
Proceedings of the International Conference on Computational Science (ICCS), 2013
@article{fabeiro2013ocloptimizer,
title={OCLoptimizer: An Iterative Optimization Tool for OpenCL},
author={Fabeiro, Jorge F and Andrade, Diego and Fraguela, Basilio B},
journal={Procedia Computer Science},
volume={18},
pages={1322--1331},
year={2013},
publisher={Elsevier}
}
Nowadays, computers include several computational devices with parallel capabilities, such as multicore processors and Graphics Processing Units (GPUs). OpenCL enables the programming of all these kinds of devices. An OpenCL program consists of host code, which discovers the computational devices available in the host system and queues up commands to them, and kernel code, which defines the core of the parallel computation executed on the devices. This work addresses two of the most important problems faced by an OpenCL programmer: (1) host codes are quite verbose, but they can be generated automatically if some parameters are known; (2) OpenCL codes that are hand-optimized for a given device do not necessarily achieve good performance on a different one. This paper presents a source-to-source iterative optimization tool, called OCLoptimizer, that aims to generate host codes automatically and to optimize OpenCL kernels, taking as inputs an annotated version of the original kernel and a configuration file. Iterative optimization is a well-known technique that optimizes a given code by systematically exploring different configuration parameters. For example, tiling can be applied to a loop, and the iterative optimizer selects the best tile size by exploring the space of possible tile sizes. The experimental results show that the tool can automatically optimize a set of OpenCL kernels for multicore processors.
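The tile-size example in the abstract can be illustrated with a minimal sketch. This is not OCLoptimizer's implementation: it uses plain Python instead of OpenCL, an exhaustive search over a small hand-picked candidate list, and hypothetical names (`tiled_matmul`, `pick_tile_size`) that do not come from the paper. It only shows the general shape of iterative optimization: run each configuration, measure it, and keep the fastest.

```python
import time


def tiled_matmul(a, b, n, tile):
    """Multiply two n x n matrices (lists of lists) using square tiles."""
    c = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, tile):
        for kk in range(0, n, tile):
            for jj in range(0, n, tile):
                # Work on one tile; min() handles sizes that do not divide n.
                for i in range(ii, min(ii + tile, n)):
                    row_c = c[i]
                    row_a = a[i]
                    for k in range(kk, min(kk + tile, n)):
                        a_ik = row_a[k]
                        row_b = b[k]
                        for j in range(jj, min(jj + tile, n)):
                            row_c[j] += a_ik * row_b[j]
    return c


def pick_tile_size(kernel, candidates, repeats=3):
    """Time kernel(tile) for each candidate; return (best_tile, timings)."""
    timings = {}
    for tile in candidates:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            kernel(tile)
            # Keep the fastest of several runs to reduce timing noise.
            best = min(best, time.perf_counter() - start)
        timings[tile] = best
    best_tile = min(timings, key=timings.get)
    return best_tile, timings


# Assumed toy problem: a 32 x 32 product and four candidate tile sizes.
n = 32
a = [[float(i + j) for j in range(n)] for i in range(n)]
b = [[float(i - j) for j in range(n)] for i in range(n)]
candidates = [4, 8, 16, 32]

best_tile, timings = pick_tile_size(
    lambda t: tiled_matmul(a, b, n, t), candidates
)
print("selected tile size:", best_tile)
```

The selected tile size depends on the machine and on timing noise, so it will differ between runs and devices. That variability is the reason the search is repeated per target device rather than fixed once by hand.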
June 10, 2013 by hgpu