Petascale elliptic solvers for anisotropic PDEs on GPU clusters

Eike Hermann Muller, Robert Scheichl, Benson Muite, Eero Vainikko

Department of Mathematical Sciences, University of Bath, Bath BA2 7AY, United Kingdom

arXiv:1402.3545 [cs.DC], (14 Feb 2014)

Memory-bound applications such as solvers for large sparse systems of equations remain a challenge for GPUs. Fast solvers should be based on numerically efficient algorithms and implemented such that global memory access is minimised. To solve systems with up to one trillion (10^12) unknowns, the code has to make efficient use of several million individual processor cores on large GPU clusters. We describe the multi-GPU implementation of two algorithmically optimal iterative solvers for anisotropic elliptic PDEs. For this we parallelise the solvers and adapt them to the specific features of GPU architectures, paying particular attention to efficient global memory access. With a Conjugate Gradient (CG) solver we achieve a performance of up to 0.78 PFLOPs when solving an equation with 0.55×10^12 unknowns on 16384 GPUs; this corresponds to about 3% of the theoretical peak performance of the machine and uses more than 40% of the peak memory bandwidth. Although the other solver, a geometric multigrid algorithm, achieves slightly lower performance in terms of FLOPs per second, it is faster overall because it needs fewer iterations to converge; the multigrid algorithm can solve a linear PDE with half a trillion unknowns in about one second.
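To make the CG solver mentioned in the abstract more concrete, here is a minimal single-process sketch in plain Python. It is not the authors' multi-GPU implementation. It assumes a hypothetical 2D model problem, -eps*u_xx - u_yy = f on an n x n grid with zero Dirichlet boundaries, where a small eps makes the operator anisotropic. The function names (apply_operator, conjugate_gradient) and all parameters are illustrative choices, not taken from the paper. The operator is applied matrix-free, so no sparse matrix is stored and each iteration streams through a handful of vectors, which is the property that makes such solvers memory bound.

```python
def apply_operator(u, n, eps):
    """Apply the 5-point stencil for -eps*u_xx - u_yy (assumed model problem).

    u is a flat list of n*n interior values in row-major order; values
    outside the grid are treated as zero (homogeneous Dirichlet boundary).
    The grid spacing is absorbed into the scaling of the right-hand side.
    """
    out = [0.0] * (n * n)
    for i in range(n):
        for j in range(n):
            k = i * n + j
            v = (2.0 * eps + 2.0) * u[k]
            if i > 0:
                v -= eps * u[k - n]   # weakly coupled direction
            if i < n - 1:
                v -= eps * u[k + n]
            if j > 0:
                v -= u[k - 1]         # strongly coupled direction
            if j < n - 1:
                v -= u[k + 1]
            out[k] = v
    return out


def dot(a, b):
    """Euclidean inner product of two equally sized lists."""
    return sum(x * y for x, y in zip(a, b))


def conjugate_gradient(b, n, eps, tol=1e-10, max_iter=1000):
    """Unpreconditioned CG for A x = b, with A given by apply_operator.

    Stops when the residual norm drops below tol times the norm of b.
    Returns the approximate solution and the number of iterations used.
    """
    x = [0.0] * len(b)
    r = list(b)            # residual b - A x for the zero initial guess
    p = list(r)            # first search direction
    rr = dot(r, r)
    b_norm2 = rr
    if b_norm2 == 0.0:
        return x, 0
    for it in range(1, max_iter + 1):
        Ap = apply_operator(p, n, eps)
        alpha = rr / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rr_new = dot(r, r)
        if rr_new <= tol * tol * b_norm2:
            return x, it
        beta = rr_new / rr
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x, max_iter


if __name__ == "__main__":
    n, eps = 8, 0.01
    rhs = [1.0] * (n * n)
    solution, iterations = conjugate_gradient(rhs, n, eps)
    print("iterations:", iterations)
```

Each CG iteration costs one stencil application, two inner products, and three vector updates, so the arithmetic per byte moved is low. A multigrid solver performs more work per iteration but typically needs far fewer of them, which matches the trade-off described in the abstract.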

Tags: Algorithms, Computer science, CUDA, GPU cluster, Numerical Analysis, nVidia, Partial differential equations, PDEs, Tesla K20, Tesla M2090

February 17, 2014 by hgpu
