Accelerating BIRCH for Clustering Large Scale Streaming Data Using CUDA Dynamic Parallelism

Jianqiang Dong, Fei Wang, Bo Yuan
Intelligent Computing Lab, Division of Informatics, Graduate School at Shenzhen, Tsinghua University, Shenzhen 518055, P.R. China
14th International Conference on Intelligent Data Engineering and Automated Learning, 2013








In the big data era, the ability to mine and analyze large-scale datasets is imperative. As data become more abundant than ever, data-driven methods are playing a critical role in areas such as decision support and business intelligence. In this paper, we demonstrate how state-of-the-art GPUs and the Dynamic Parallelism feature of the latest CUDA platform can bring significant benefits to BIRCH, one of the best-known clustering techniques for streaming data. Experimental results show that, on a number of benchmark problems, the GPU-accelerated BIRCH can be made up to 154 times faster than the CPU version, with good scalability and high accuracy. Our work suggests that massively parallel GPU computing is a promising and effective solution to the challenges of big data.
