A Tuned and Scalable Fast Multipole Method as a Preeminent Algorithm for Exascale Systems

Rio Yokota, Lorena Barba
Mechanical Engineering Department, Boston University, Boston MA 02215
arXiv:1106.2176v1 [cs.NA] (10 Jun 2011)


Keywords: Computer Science – Numerical Analysis, 70F10, D.1.3, G.1.0, G.1.2





Achieving computing at the exascale means accelerating today’s applications by a factor of one thousand. Clearly, this cannot be accomplished by hardware alone, at least not in the short time frame expected for reaching this performance milestone. Thus, a lively discussion has begun in the last couple of years about the programming models, software components and tools, and algorithms that will facilitate exascale computing. Among the algorithms likely to play a preeminent role in the new world of computing, the fast multipole method (FMM) appears as a rising star. Due to its hierarchical nature and the techniques used to access the data via a tree structure, it is not a locality-sensitive application. It also enjoys favorable synchronization patterns, again of a hierarchical nature, where many operations can happen simultaneously at each level of the hierarchy. In this paper, we present a discussion of the features of the FMM that make it a particularly favorable algorithm for the emerging heterogeneous, massively parallel architectural landscape. We back this up with results from a campaign of performance tuning and scalability studies using multi-core and GPU hardware.
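To make the remark about level-wise concurrency concrete, here is a minimal sketch of a tree-based upward pass. It is not the authors' implementation. It assumes a 1D domain [0, 1), a uniform binary tree, and monopole (order-zero) moments only, and the function name `upward_pass` is hypothetical. A full FMM also needs higher-order expansions and the multipole-to-local and local-to-local stages, which are omitted here.

```python
def upward_pass(positions, charges, depth):
    """Return monopole moments per tree level, root first.

    Illustrative assumptions: 1D domain [0, 1), uniform binary tree,
    order-zero (total charge) moments only.
    """
    n_leaves = 2 ** depth

    # Leaf stage: each particle adds its charge to the leaf cell containing it.
    leaves = [0.0] * n_leaves
    for x, q in zip(positions, charges):
        idx = min(int(x * n_leaves), n_leaves - 1)
        leaves[idx] += q

    # Upward stage: a parent's moment is the sum of its two children.
    # Every parent on a given level depends only on the level below,
    # so all cells of one level could be processed concurrently.
    levels = [leaves]
    while len(levels[-1]) > 1:
        child = levels[-1]
        parent = [child[2 * i] + child[2 * i + 1] for i in range(len(child) // 2)]
        levels.append(parent)

    levels.reverse()  # root level first
    return levels


levels = upward_pass([0.1, 0.3, 0.6, 0.9], [1.0, 2.0, 3.0, 4.0], depth=2)
```

In this sketch the only synchronization point is between consecutive levels, which is the hierarchical pattern the abstract refers to; within a level, the list comprehension could be replaced by any parallel map.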
