Oct-tree Method on GPU
Department of Computer Science and Engineering, University of Aizu, Aizu-Wakamatsu, Fukushima 965-0815, Japan
arXiv:0909.0541v1 [astro-ph.IM] (2 Sep 2009)
@article{nakasato2009oct,
title={Oct-tree Method on GPU},
author={Nakasato, N.},
journal={Arxiv preprint arXiv:0909.0541},
year={2009}
}
The kd-tree is a fundamental tool in computer science. Among others, an application of the kd-tree search (oct-tree method) to fast evaluation of particle interactions and neighbor search is highly important since computational complexity of these problems are reduced from O(N^2) with a brute force method to O(N log N) with the tree method where N is a number of particles. In this paper, we present a parallel implementation of the tree method running on a graphic processor unit (GPU). We successfully run a simulation of structure formation in the universe very efficiently. On our system, which costs roughly $900, the run with N ~ 2.87×10^6 particles took 5.79 hours and executed 1.2×10^13 force evaluations in total. We obtained the sustained computing speed of 21.8 Gflops and the cost per Gflops of 41.6/Gflops that is two and half times better than the previous record in 2006.
October 27, 2010 by hgpu