Optimizing OpenCL Kernels for Iterative Statistical Applications on GPUs

Thilina Gunarathne, Bimalee Salpitikorala, Arun Chauhan, Geoffrey Fox
Indiana University, Bloomington, IN 47405, USA
Proceedings of the Second International Workshop on GPUs and Scientific Applications (GPUScA), PACT 2011, 2011








We present a study of three important kernels that occur frequently in iterative statistical applications: K-Means, Multi-Dimensional Scaling (MDS), and PageRank. We implemented each kernel in OpenCL and evaluated its performance on an NVIDIA Tesla GPGPU card. By examining the underlying algorithms and empirically measuring the performance of the various components of each kernel, we explored optimizing these kernels with four main techniques: (1) caching invariant data in GPU memory across iterations, (2) selectively placing data in different memory levels, (3) rearranging data in memory, and (4) dividing the work between the GPU and the CPU. The optimizations yielded performance improvements of up to 5X over naive OpenCL implementations. We believe that these categories of optimizations also apply to other, similar kernels. Finally, we draw several lessons that would be useful not only in implementing other similar kernels with OpenCL, but also in devising code generation strategies for compilers that target GPGPUs through OpenCL.
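To make technique (1) concrete, here is a minimal CPU-only sketch of the K-Means iteration in plain Python. It is not the authors' OpenCL code. The function names (`assign_clusters`, `update_centroids`, `kmeans`) are hypothetical, and the comments marking which data would stay resident on the device are an assumption drawn from the structure of the algorithm: the input points never change between iterations, while the centroids do.

```python
def assign_clusters(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    labels = []
    for p in points:
        best, best_dist = 0, float("inf")
        for k, c in enumerate(centroids):
            # Squared Euclidean distance; the square root is not needed to rank.
            dist = sum((pi - ci) ** 2 for pi, ci in zip(p, c))
            if dist < best_dist:
                best, best_dist = k, dist
        labels.append(best)
    return labels


def update_centroids(points, labels, centroids):
    """Recompute each centroid as the mean of the points assigned to it."""
    dims = len(centroids[0])
    updated = []
    for k, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == k]
        if not members:
            updated.append(list(old))  # keep the centroid of an empty cluster
            continue
        updated.append(
            [sum(m[d] for m in members) / len(members) for d in range(dims)]
        )
    return updated


def kmeans(points, centroids, max_iters=100):
    # Assumption: 'points' is loop-invariant, so a GPU version could copy it
    # to device memory once, before the loop, and reuse it in every iteration.
    labels = []
    for _ in range(max_iters):
        # Only the small 'centroids' array changes between iterations, so it
        # is the only input that would need to be re-sent each time.
        labels = assign_clusters(points, centroids)
        new_centroids = update_centroids(points, labels, centroids)
        if new_centroids == centroids:  # converged: nothing moved
            break
        centroids = new_centroids
    return centroids, labels
```

For example, `kmeans([(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)], [[0.0, 0.0], [10.0, 10.0]])` converges to the centroids `[[0.0, 0.5], [10.0, 10.5]]` with labels `[0, 0, 1, 1]`. In a GPU setting the assignment step is the natural candidate for a kernel, since each point is processed independently. Whether the comparatively small centroid update is better done on the GPU or the CPU is the sort of question technique (4) addresses.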