Accelerating Kernel Density Estimation on the GPU Using the CUDA Framework
Department of Balkan Studies, University of Western Macedonia, 3rd Km Florinas-Nikis National Road, Florina, 53100, Greece
Applied Mathematical Sciences, Vol. 7, no. 30, 1447 – 1476, 2013
@article{michailidis2013accelerating,
title={Accelerating Kernel Density Estimation on the GPU Using the CUDA Framework},
author={Michailidis, Panagiotis D and Margaritis, Konstantinos G},
journal={Applied Mathematical Sciences},
volume={7},
number={30},
pages={1447--1476},
year={2013}
}
The main problem with kernel density estimation methods is their heavy computational requirements, especially for large data sets. One way to accelerate these methods is parallel processing. Recent advances in parallel processing have focused on the use of Graphics Processing Units (GPUs) with the Compute Unified Device Architecture (CUDA) programming model. In this work we discuss a naive and two optimised CUDA algorithms for two kernel estimation methods: univariate and multivariate. The optimised algorithms are based on shared memory tiles and loop unrolling. We also present exploratory experimental results for the proposed CUDA algorithms across several parameter values, such as the number of threads per block, tile size, loop unroll level, number of variables and data (sample) size. The experimental results show significant performance gains of all proposed CUDA algorithms over the serial CPU version and small speed-ups of the two optimised CUDA algorithms over the naive GPU algorithms. Finally, based on the extended performance results, we draw general conclusions about all proposed CUDA algorithms for some parameters.
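To make the cost concrete, the following is a minimal CPU-side sketch of univariate kernel density estimation, not the paper's CUDA code. It uses the standard estimator f(x) = 1/(n h) * sum_i K((x - x_i) / h), and it assumes a Gaussian kernel and a fixed bandwidth `h`; the abstract names neither. The function names and the `tile_size` parameter are hypothetical. `kde_naive` is the serial baseline with one full pass over the sample per query point, and `kde_tiled` walks the sample in fixed-size chunks, which only mirrors the access pattern that a shared-memory tile would give a thread block on the GPU.

```python
import math


def gaussian_kernel(u):
    """Standard Gaussian kernel (an assumption; the abstract names no kernel)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kde_naive(samples, queries, h):
    """Serial baseline: O(n * m) kernel evaluations for n samples, m queries."""
    n = len(samples)
    estimates = []
    for x in queries:
        total = 0.0
        for xi in samples:
            total += gaussian_kernel((x - xi) / h)
        estimates.append(total / (n * h))
    return estimates


def kde_tiled(samples, queries, h, tile_size=4):
    """Same estimator, but the sample is consumed tile by tile.

    Each tile stands in for a block of sample values that a GPU thread
    block would first copy into shared memory and then reuse for all of
    its query points. On the CPU this changes only the loop order.
    """
    n = len(samples)
    sums = [0.0] * len(queries)
    for start in range(0, n, tile_size):
        tile = samples[start:start + tile_size]
        for j, x in enumerate(queries):
            for xi in tile:
                sums[j] += gaussian_kernel((x - xi) / h)
    return [s / (n * h) for s in sums]


if __name__ == "__main__":
    data = [0.1, 0.4, 0.35, 0.8, 1.2, 1.25, 2.0]
    grid = [0.0, 0.5, 1.0, 1.5, 2.0]
    print(kde_naive(data, grid, h=0.3))
    print(kde_tiled(data, grid, h=0.3, tile_size=3))
```

Both functions perform the same n * m kernel evaluations, which is why the work grows quickly with the sample size. In a GPU version, tiling would reduce global memory traffic rather than arithmetic, which is consistent with the small speed-ups over the naive GPU algorithm reported above.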
March 2, 2013 by hgpu