A GPU cluster optimized multigrid scheme for computing unsteady incompressible fluid flow


Gyorgy Tegze, Gyula I. Toth

Institute for Solid State Physics and Optics, Wigner Research Centre for Physics, P.O. Box 49, H-1525 Budapest, Hungary

arXiv:1309.7128 [math.NA], (27 Sep 2013)


A multigrid scheme is proposed that allows efficient implementation on modern CPUs, many integrated core devices (MICs), and graphics processing units (GPUs). It is shown that wide single instruction multiple data (SIMD) processing engines are used efficiently when a deep 2h grid hierarchy is replaced with a two-level scheme using 16h-32h restriction. The restriction length can be fitted to the SIMD width to fully utilize the capabilities of modern CPUs and GPUs. This also ensures optimal memory transfer, since no strided memory access is required. The number of expensive restriction steps is greatly reduced, and these steps are executed on bigger chunks of data, which allows optimal caching strategies. A higher-order interpolated stencil was developed to improve the convergence rate by minimizing spurious interference between the coarse-scale and fine-scale solutions. The method is demonstrated by solving the pressure equation for 2D incompressible fluid flow: the benchmark setups cover shear-driven laminar flow in a cavity and direct numerical simulation (DNS) of a turbulent jet. We show that the scheme also allows efficient use of distributed-memory computer clusters by decreasing the number of memory transfers between host and compute devices, and among cluster nodes. The actual implementation uses hybrid OpenCL/MPI-based parallelization.
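To make the idea of a wide restriction step concrete, the following is a minimal one-dimensional sketch in plain Python. It is not the authors' OpenCL/MPI implementation and does not reproduce their higher-order interpolated stencil. It assumes a simple block average as the restriction operator and piecewise-constant copying as the prolongation, and the names `restrict_blocks` and `prolong_blocks` are hypothetical. It shows only that a 16h restriction reads contiguous blocks of 16 values, so no strided access is needed.

```python
def restrict_blocks(fine, width=16):
    """Average each contiguous block of `width` fine-grid values.

    Assumption: plain block averaging stands in for the restriction
    operator; the paper's actual operator may differ.
    """
    if len(fine) % width != 0:
        raise ValueError("fine grid length must be a multiple of width")
    # Each slice is a contiguous run of `width` values (no stride).
    return [sum(fine[i:i + width]) / width
            for i in range(0, len(fine), width)]


def prolong_blocks(coarse, width=16):
    """Copy each coarse value back onto its block of `width` fine points.

    Assumption: piecewise-constant prolongation, chosen for brevity;
    the paper describes a higher-order interpolated stencil instead.
    """
    return [value for value in coarse for _ in range(width)]


# Usage: a 32-point fine grid collapses to 2 coarse values in one step.
fine = [float(i) for i in range(32)]
coarse = restrict_blocks(fine, width=16)
back_on_fine = prolong_blocks(coarse, width=16)
```

With a conventional 2h hierarchy the same 32-point grid would need four successive coarsening steps to reach two points; here a single step with a block width matched to the SIMD width does the same job.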

Tags: ATI, ATI Radeon HD 7970, Fluid dynamics, GPU cluster, Mathematics, MPI, Numerical Analysis, Numerical simulation, nVidia, nVidia GeForce GTX 680, OpenCL

September 30, 2013 by hgpu


high performance computing on graphics processing units: hgpu.org
