A Performance Model for the Communication in Fast Multipole Methods on HPC Platforms


Huda Ibeid, Rio Yokota, David Keyes

Division of Computer, Electrical and Mathematical Sciences and Engineering, King Abdullah University of Science and Technology, Saudi Arabia

arXiv:1405.6362 [cs.DC], (25 May 2014)

Exascale systems are predicted to have approximately one billion cores, assuming gigahertz cores. Limitations on affordable network topologies for distributed memory systems of such massive scale bring new challenges to the current parallel programming model. Currently, there are many efforts to evaluate the hardware and software bottlenecks of exascale designs. There is therefore an urgent need to model application performance and to understand what changes need to be made to ensure extrapolated scalability. The fast multipole method (FMM) was originally developed for accelerating N-body problems in astrophysics and molecular dynamics, but has recently been extended to a wider range of problems, including preconditioners for sparse linear solvers. Its high arithmetic intensity, combined with its linear complexity and asynchronous communication patterns, makes it a promising algorithm for exascale systems. In this paper, we discuss the challenges for FMM on current parallel computers and future exascale architectures, with a focus on inter-node communication. We develop a performance model that considers the communication patterns of the FMM, and observe a good match between our model and the actual communication time when latency, bandwidth, network topology, and multi-core penalties are all taken into account. To our knowledge, this is the first formal characterization of inter-node communication in FMM that validates the model against actual measurements of communication time.
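The abstract names four ingredients of the communication model: latency, bandwidth, network topology, and multi-core penalties. As a rough illustration only, the sketch below shows how such terms are commonly combined in a generic latency-bandwidth ("alpha-beta") cost model. It is not the model from the paper: the function names, the `hops` factor standing in for topology, and the `sharing` factor standing in for a multi-core penalty are all assumptions made for this example.

```python
def message_time(n_bytes, alpha, beta, hops=1, sharing=1):
    """Estimated time in seconds to send one message.

    alpha   -- per-hop latency in seconds
    beta    -- transfer cost in seconds per byte (1 / bandwidth)
    hops    -- ASSUMED number of network hops (crude stand-in for topology)
    sharing -- ASSUMED number of cores sharing one network link
               (crude stand-in for a multi-core penalty)
    """
    return hops * alpha + sharing * beta * n_bytes


def total_comm_time(message_sizes, alpha, beta, hops=1, sharing=1):
    """Sum the estimated times of a sequence of messages sent one after another."""
    return sum(message_time(n, alpha, beta, hops, sharing) for n in message_sizes)


if __name__ == "__main__":
    # Hypothetical values: 1 microsecond latency, 1 GB/s bandwidth.
    sizes = [1000, 2000]
    print(total_comm_time(sizes, alpha=1e-6, beta=1e-9))
    print(total_comm_time(sizes, alpha=1e-6, beta=1e-9, hops=2, sharing=4))
```

In a sketch like this, small messages are dominated by the latency term and large messages by the bandwidth term, which is why a model that fits measured communication time generally needs both.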

Tags: Algorithms, Astrophysics, Computer science, CUDA, Fast multipole method, Molecular dynamics, N-body simulation, nVidia, Physics, Tesla K20

June 1, 2014 by hgpu
