Parallel data mining on graphics processors
Department of Computer Science and Engineering, Hong Kong University of Science and Technology
Technical Report HKUST-CS08-07, Oct 2008
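@techreport{fangparallel,
title={Parallel Data Mining on Graphics Processors},
author={Fang, W. and Lau, K.K. and Lu, M. and Xiao, X. and Lam, C.K. and Yang, P.Y. and He, B. and Luo, Q. and Sander, P.V. and Yang, K.},
institution={Department of Computer Science and Engineering, Hong Kong University of Science and Technology},
number={HKUST-CS08-07},
month=oct,
year={2008}
}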
We introduce GPUMiner, a novel parallel data mining system that utilizes new-generation graphics processing units (GPUs). Our system relies on the massively multi-threaded SIMD (Single Instruction, Multiple Data) architecture provided by GPUs. As special-purpose co-processors, GPUs are highly optimized for graphics rendering and rely on the CPU for data input/output as well as complex program control. We therefore design GPUMiner with three components: (1) a CPU-based storage and buffer manager that handles I/O and data transfer between the CPU and the GPU, (2) a GPU-CPU co-processing parallel mining module, and (3) a GPU-based mining visualization module. We tailor the GPU-CPU co-processing scheme to the complexity and inherent parallelism of each mining algorithm. The visualization module lets users observe and interact with the mining process online. We have implemented the k-means clustering and the Apriori frequent pattern mining algorithms in GPUMiner. Our preliminary results show significant speedups over state-of-the-art CPU implementations on a PC with a G80 GPU and a quad-core CPU. We will demonstrate the mining process through our visualization module. Code and documentation of GPUMiner are available at http://code.google.com/p/gpuminer/.
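To make the "inherent parallelism" of k-means concrete, the following is a minimal sketch in plain Python. It is not GPUMiner code, and all function names here are hypothetical. It only illustrates why the assignment step suits a massively multi-threaded architecture: each point's nearest centroid can be computed independently of every other point, so on a GPU each point could plausibly be handled by its own thread. The centroid update is a reduction over the assignments; how GPUMiner actually divides these steps between the CPU and the GPU is not stated in the abstract.

```python
# Illustrative sketch only: plain-Python k-means steps, not GPUMiner's implementation.
# All names (nearest_centroid, assign_points, update_centroids) are hypothetical.
from typing import List, Sequence

Point = Sequence[float]


def nearest_centroid(point: Point, centroids: List[Point]) -> int:
    """Return the index of the centroid closest to `point` (squared Euclidean distance)."""
    best_index = 0
    best_dist = float("inf")
    for index, centroid in enumerate(centroids):
        dist = sum((p - c) ** 2 for p, c in zip(point, centroid))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def assign_points(points: List[Point], centroids: List[Point]) -> List[int]:
    """Assignment step: every point is processed independently of the others.

    This independence is what would let a GPU give each point its own thread.
    """
    return [nearest_centroid(point, centroids) for point in points]


def update_centroids(points: List[Point], labels: List[int],
                     old_centroids: List[Point]) -> List[List[float]]:
    """Update step: average the points assigned to each centroid (a reduction)."""
    k = len(old_centroids)
    dims = len(old_centroids[0])
    sums = [[0.0] * dims for _ in range(k)]
    counts = [0] * k
    for point, label in zip(points, labels):
        counts[label] += 1
        for d in range(dims):
            sums[label][d] += point[d]
    # A centroid with no assigned points keeps its previous position.
    return [
        [s / counts[j] for s in sums[j]] if counts[j] else list(old_centroids[j])
        for j in range(k)
    ]


if __name__ == "__main__":
    points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
    centroids = [(0.0, 0.0), (10.0, 10.0)]
    labels = assign_points(points, centroids)
    print(labels)
    print(update_centroids(points, labels, centroids))
```

Repeating the two steps until the labels stop changing gives the standard k-means iteration; only the assignment step is trivially data-parallel, while the update needs per-cluster accumulation.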
March 28, 2011 by hgpu