Scaling Fast Multipole Methods up to 4000 GPUs
King Abdullah University of Science and Technology, 4700 KAUST, Thuwal, 23955-6900, Saudi Arabia
ATIP – A*CRC Workshop on Accelerator Technologies for High Performance Computing
@article{yokota2012scaling,
title={Scaling Fast Multipole Methods up to 4000 GPUs},
author={Yokota, R. and Narumi, T. and Barba, L. and Yasuoka, K.},
year={2012}
}
The Fast Multipole Method (FMM) is a hierarchical N-body algorithm with linear complexity, high arithmetic intensity, high data locality, hierarchical communication patterns, and no global synchronization. Together, these features allow the FMM to scale well on large GPU-based systems and to use their compute capability effectively. We present a 1 PFlop/s calculation of isotropic turbulence with 64 billion vortex particles using 4096 GPUs on the TSUBAME 2.0 system.
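To put the headline numbers in perspective, the following is a rough back-of-the-envelope sketch of what they imply per GPU. It assumes decimal units (1 PFlop/s = 10^15 flop/s, 1 billion = 10^9) and a perfectly even split of work and particles across all 4096 GPUs; neither assumption is stated in the abstract, and the variable and function names are illustrative only.

```python
# Per-GPU figures derived from the abstract's headline numbers.
# Assumptions (not from the source): decimal units and an even
# distribution of work and particles across GPUs.

total_flops = 1e15        # 1 PFlop/s sustained, as reported
total_particles = 64e9    # 64 billion vortex particles, as reported
num_gpus = 4096           # GPUs used on TSUBAME 2.0, as reported


def per_gpu(total, gpus):
    """Return the share of `total` handled by one GPU under an even split."""
    return total / gpus


flops_per_gpu = per_gpu(total_flops, num_gpus)
particles_per_gpu = per_gpu(total_particles, num_gpus)

print(f"{flops_per_gpu / 1e9:.1f} GFlop/s per GPU")                # 244.1
print(f"{particles_per_gpu / 1e6:.1f} million particles per GPU")  # 15.6
```

Under these assumptions, each GPU sustains roughly 244 GFlop/s while handling about 15.6 million particles.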
June 9, 2012 by hgpu