Scaling Fast Multipole Methods up to 4000 GPUs

Rio Yokota, Lorena Barba, Tetsu Narumi, Kenji Yasuoka
King Abdullah University of Science and Technology, 4700 KAUST, Thuwal, 23955-6900, Saudi Arabia
ATIP – A*CRC Workshop on Accelerator Technologies for High Performance Computing








The Fast Multipole Method (FMM) is a hierarchical N-body algorithm with linear complexity, high arithmetic intensity, high data locality, hierarchical communication patterns, and no global synchronization. The combination of these features allows the FMM to scale well on large GPU-based systems and to use their compute capability effectively. We present a 1 PFlop/s calculation of isotropic turbulence with 64 billion vortex particles using 4096 GPUs on the TSUBAME 2.0 system.
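To put the quoted figures in perspective, the short Python sketch below works out the per-GPU load and contrasts it with a naive all-pairs sum. It takes the abstract's rounded numbers at face value and assumes an even split of particles and flops across GPUs; the function names and the cost of one flop per pair interaction are illustrative assumptions, not values from the paper.

```python
# Figures quoted in the abstract, taken at face value (they are rounded there).
N_PARTICLES = 64e9        # "64 billion vortex particles"
N_GPUS = 4096             # "4096 GPUs"
SUSTAINED_FLOPS = 1e15    # "1 PFlop/s"


def per_gpu_load(n_particles, n_gpus, flops):
    """Return (particles per GPU, flop/s per GPU), assuming an even split."""
    return n_particles / n_gpus, flops / n_gpus


def direct_sum_seconds(n_particles, flops, flops_per_pair=1.0):
    """Rough lower bound on the time for one all-pairs O(N^2) evaluation.

    flops_per_pair=1.0 is a deliberately optimistic, hypothetical cost;
    a real pairwise kernel needs many more operations per interaction.
    """
    return n_particles ** 2 * flops_per_pair / flops


particles_per_gpu, flops_per_gpu = per_gpu_load(
    N_PARTICLES, N_GPUS, SUSTAINED_FLOPS
)
print(f"particles per GPU: {particles_per_gpu:.3e}")
print(f"GFlop/s per GPU:   {flops_per_gpu / 1e9:.1f}")

days = direct_sum_seconds(N_PARTICLES, SUSTAINED_FLOPS) / 86400
print(f"direct-sum days per evaluation (optimistic): {days:.1f}")
```

Under these assumptions each GPU handles roughly 15.6 million particles at about 244 GFlop/s, while a single direct all-pairs evaluation at the same aggregate rate would take weeks even with the optimistic per-pair cost, which is why the linear complexity of the FMM matters at this scale.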