CUDA-accelerated Hierarchical K-means
National Taiwan University
National Taiwan University, 2012
@article{lee2012cuda,
title={CUDA-accelerated Hierarchical K-means},
author={Lee, C.C. and Chu, K.Y.},
year={2012}
}
In 2011 alone, more than 350 billion photos were generated. Managing data at this scale makes statistical tools such as clustering indispensable. K-Means is one of the most widely used clustering methods because it is easy to implement. However, K-Means slows down as the number of clusters K grows. Hierarchical K-Means (HKM) addresses this and is much faster than K-Means. Instead of computing K centroids directly, HKM divides the task into L layers. In each layer, it finds K’ << K centroids within each cluster found by the layer above, so that the bottom layer contains (K’)^L ~ K centroids. However, HKM is still slow on large data sets. In this project, we further speed up HKM using CUDA. In our experiments, the GPU version achieves a speed-up of about 700 times over the CPU version. We also show a simple application that uses HKM to help manage photos.
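The layered scheme described in the abstract can be illustrated with a minimal CPU-only sketch in pure Python. This is not the authors' CUDA implementation; it only shows the recursion, assuming plain Lloyd's iterations inside each layer. The function names (`kmeans`, `hierarchical_kmeans`) and the parameters `branch` (standing in for K’) and `levels` (standing in for L) are hypothetical names chosen for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def assign(points, centroids):
    """Group each point with its nearest centroid."""
    groups = [[] for _ in centroids]
    for p in points:
        j = min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
        groups[j].append(p)
    return groups


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's algorithm (assumed here); returns (centroids, groups)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k of the input points
    for _ in range(iters):
        groups = assign(points, centroids)
        # An empty group keeps its previous centroid.
        centroids = [mean(g) if g else centroids[i] for i, g in enumerate(groups)]
    return centroids, assign(points, centroids)


def hierarchical_kmeans(points, branch, levels, seed=0):
    """Return the bottom-layer centroids: at most branch**levels of them."""
    if levels == 0 or len(points) <= 1:
        return [mean(points)]
    k = min(branch, len(points))
    centroids, groups = kmeans(points, k, seed=seed)
    if levels == 1:
        return centroids
    leaves = []
    for g in groups:
        if g:  # skip empty clusters instead of recursing into them
            leaves.extend(hierarchical_kmeans(g, branch, levels - 1, seed))
    return leaves
```

Calling `hierarchical_kmeans(points, branch=2, levels=2)` on a list of 2-D tuples returns at most four leaf centroids. Each layer clusters only the points inside one parent cluster with a small `branch`, so every point is compared against `branch` centroids per layer rather than against all K at once.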
July 1, 2012 by hgpu