Efficient Computation of k-Nearest Neighbour Graphs for Large High-Dimensional Data Sets on GPU Clusters
Department of Mechanical Engineering, Complex Systems Simulation Lab, University of Wisconsin-Milwaukee, Milwaukee, Wisconsin, United States of America
PLoS ONE 8(9): e74113. doi:10.1371/journal.pone.0074113, 2013
@article{dashti2013efficient,
title={Efficient Computation of k-Nearest Neighbour Graphs for Large High-Dimensional Data Sets on GPU Clusters},
author={Dashti, Ali and Komarov, Ivan and D’Souza, Roshan M},
journal={PLOS ONE},
volume={8},
number={9},
pages={e74113},
year={2013},
publisher={Public Library of Science}
}
This paper presents an implementation of brute-force exact k-Nearest Neighbor Graph (k-NNG) construction for ultra-large, high-dimensional data clouds. The proposed method uses Graphics Processing Units (GPUs) and scales across multiple levels of parallelism (between nodes of a cluster, between different GPUs on a single node, and within a GPU). The method is applicable to homogeneous computing clusters with a varying number of nodes and GPUs per node. We achieve a 6-fold speedup in data processing compared with an optimized method running on a cluster of CPUs, and bring k-NNG generation for a dataset of twenty million images with 15k dimensions, hitherto impossible, into the realm of practical possibility.
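To make the abstract's term "brute-force exact k-NNG" concrete, here is a minimal CPU-only sketch in plain Python. It is not the authors' GPU implementation. The function names, the `tile` parameter, and the use of squared Euclidean distance are assumptions made for illustration. The tile loop only hints at how the candidate set could be split into independent blocks (for example one block per GPU or node) whose partial top-k lists are then merged.

```python
import heapq
from typing import List, Sequence, Tuple

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance (an assumed metric, for illustration)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_graph_brute_force(
    points: List[Point], k: int, tile: int = 2
) -> List[List[Tuple[float, int]]]:
    """Exact k-NN graph by comparing every point with every other point.

    Candidates are visited in blocks of `tile` points (hypothetical
    parameter). Each block yields partial results that are merged into
    a running top-k list per query point.
    """
    n = len(points)
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < len(points)")

    # best[i] holds the k closest (distance, index) pairs seen so far.
    best: List[List[Tuple[float, int]]] = [[] for _ in range(n)]

    for start in range(0, n, tile):
        block = range(start, min(start + tile, n))
        for i in range(n):
            # Distances from point i to every candidate in this block,
            # skipping the point itself.
            candidates = [
                (squared_distance(points[i], points[j]), j)
                for j in block
                if j != i
            ]
            # Merge the block's candidates into the running top-k.
            best[i] = heapq.nsmallest(k, best[i] + candidates)

    return best


if __name__ == "__main__":
    pts = [(0, 0), (1, 0), (0, 2), (5, 5)]
    for i, neighbours in enumerate(knn_graph_brute_force(pts, k=2)):
        print(i, [j for _, j in neighbours])
```

Because each block contributes only a partial top-k list that is merged afterwards, the result does not depend on the block size. That independence is the property that makes a brute-force search straightforward to distribute, although the cost remains quadratic in the number of points.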
September 28, 2013 by hgpu