Towards accelerating Smoothed Particle Hydrodynamics simulations for free-surface flows on multi-GPU clusters

Daniel Valdez-Balderas, Jose M. Dominguez, Benedict D. Rogers, Alejandro J.C. Crespo

Modelling and Simulation Centre (MaSC), School of Mechanical, Aerospace & Civil Engineering, University of Manchester, Manchester, M13 9PL, UK

arXiv:1210.1017 [physics.comp-ph] (3 Oct 2012)

Starting from the single graphics processing unit (GPU) version of the Smoothed Particle Hydrodynamics (SPH) code DualSPHysics, a multi-GPU SPH program is developed for free-surface flows. The approach is based on a spatial decomposition technique, whereby different portions (sub-domains) of the physical system under study are assigned to different GPUs. Communication between devices is achieved using Message Passing Interface (MPI) routines. The use of the radix sort algorithm for inter-GPU particle migration and sub-domain halo building (which enables interaction between SPH particles of different sub-domains) is described in detail. With the resulting scheme it is possible, on the one hand, to run simulations that would fit on a single GPU faster than on one such device alone. On the other hand, accelerated simulations can be performed with up to 32 million particles on the current architecture, which is beyond the reach of a single GPU due to memory constraints. A study of weak and strong scaling behaviour, speedups and efficiency of the resulting program is presented, including an investigation to elucidate the computational bottlenecks. Finally, possibilities for reducing the effects of overhead on computational efficiency in future versions of our scheme are discussed.
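The role of sorting in migration and halo building can be illustrated with a minimal CPU-side sketch in Python. It is not the DualSPHysics implementation: the one-dimensional slab decomposition, the five category codes, the `halo_width` parameter and all function names are assumptions made for illustration only. Each particle of one sub-domain receives a small integer key describing where it belongs, and a stable radix sort on that key makes every category contiguous in memory, so that each group could be sent to a neighbouring device in a single message.

```python
from typing import Dict, List, Tuple

# Hypothetical category codes (not taken from the paper).
INTERIOR, EDGE_LEFT, EDGE_RIGHT, LEAVE_LEFT, LEAVE_RIGHT = range(5)


def classify(x: float, x_lo: float, x_hi: float, halo_width: float) -> int:
    """Assign a category key to a particle at position x.

    Assumes the sub-domain [x_lo, x_hi) is wider than 2 * halo_width,
    so a particle cannot lie in both edge regions at once.
    """
    if x < x_lo:
        return LEAVE_LEFT       # migrates to the left neighbour
    if x >= x_hi:
        return LEAVE_RIGHT      # migrates to the right neighbour
    if x < x_lo + halo_width:
        return EDGE_LEFT        # stays, but is copied into the left neighbour's halo
    if x >= x_hi - halo_width:
        return EDGE_RIGHT       # stays, but is copied into the right neighbour's halo
    return INTERIOR


def radix_sort_by_key(keys: List[int], values: List[int],
                      bits: int) -> Tuple[List[int], List[int]]:
    """Stable least-significant-bit radix sort of (key, value) pairs."""
    for shift in range(bits):
        zeros_k, zeros_v, ones_k, ones_v = [], [], [], []
        for k, v in zip(keys, values):
            if (k >> shift) & 1:
                ones_k.append(k)
                ones_v.append(v)
            else:
                zeros_k.append(k)
                zeros_v.append(v)
        # Concatenating zeros before ones preserves the order of earlier passes.
        keys, values = zeros_k + ones_k, zeros_v + ones_v
    return keys, values


def partition_particles(xs: List[float], x_lo: float, x_hi: float,
                        halo_width: float) -> Dict[int, List[int]]:
    """Group particle indices by category after sorting on the category key."""
    keys = [classify(x, x_lo, x_hi, halo_width) for x in xs]
    ids = list(range(len(xs)))
    sorted_keys, sorted_ids = radix_sort_by_key(keys, ids, bits=3)  # 5 codes fit in 3 bits
    groups: Dict[int, List[int]] = {c: [] for c in range(5)}
    for k, i in zip(sorted_keys, sorted_ids):
        groups[k].append(i)
    return groups
```

Calling `partition_particles(xs, 0.0, 1.0, 0.1)` on a list of x-coordinates returns, for each category, the indices of the particles that would remain, be copied into a neighbour's halo, or be handed over to a neighbour. On a GPU the same idea would typically rely on a parallel key-value radix sort rather than the serial loop shown here.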

Tags: Algorithms, Computational Physics, Condensed matter, CUDA, Fluid dynamics, GPU cluster, Instrumentation and Methods for Astrophysics, Molecular dynamics, MPI, nVidia, nVidia GeForce GTX 480, Physics, Sorting, Tesla M2050

October 4, 2012 by hgpu
