high performance computing on graphics processing units: hgpu.org

Multi-level Parallelism with MPI and OpenACC for CFD Applications

Andrew J. McCall

Virginia Polytechnic Institute

Virginia Polytechnic Institute, 2017

@phdthesis{mccall2017multi,
  title={Multi-level Parallelism with MPI and OpenACC for CFD Applications},
  author={McCall, Andrew James},
  year={2017},
  school={Virginia Tech}
}

High-level parallel programming approaches, such as OpenACC, have recently become popular in complex fluid dynamics research because they are cross-platform and easy to implement. OpenACC is a directive-based programming model that, unlike low-level programming models, abstracts the details of the GPU implementation. Although OpenACC generally limits the performance attainable on the GPU, it significantly reduces the work required to port an existing code to any accelerator platform, including GPUs. The purpose of this research is twofold: to investigate the effectiveness of OpenACC in developing a portable and maintainable GPU-accelerated code, and to determine the capability of OpenACC to accelerate large, complex programs on the GPU. In both studies, the OpenACC implementation is optimized and extended to a multi-GPU implementation while maintaining a unified code base. OpenACC is shown to be a viable option for GPU computing with CFD problems. In the first study, a CFD code that solves incompressible cavity flows is accelerated using OpenACC. Overlapping communication with computation improves the performance of the multi-GPU implementation by up to 21%; the implementation runs up to 400 times faster than a single CPU and reaches 99% weak scalability efficiency with 32 GPUs. The second study ports a more complex CFD research code to the GPU using OpenACC and discusses the challenges of using OpenACC with modern Fortran. Three test cases are used to evaluate performance and scalability. The multi-GPU performance using 27 GPUs is up to 100 times faster than a single CPU and maintains a weak scalability efficiency of 95%.
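For readers unfamiliar with the weak scalability figures quoted above, the following is a minimal sketch of how such an efficiency is commonly computed, assuming the usual definition: the problem size per GPU is held fixed, so the ideal runtime stays constant as GPUs are added, and efficiency is the single-GPU runtime divided by the N-GPU runtime. The thesis may define the metric differently. The function name and the timings are hypothetical and are not taken from the thesis.

```python
def weak_scaling_efficiency(t_single: float, t_multi: float) -> float:
    """Return weak-scaling efficiency as t_single / t_multi.

    Assumes the work per device is the same in both runs, so an ideal
    multi-device run takes exactly as long as the single-device run.
    """
    if t_single <= 0 or t_multi <= 0:
        raise ValueError("runtimes must be positive")
    return t_single / t_multi


# Hypothetical wall-clock times in seconds (illustrative only,
# not measurements from the thesis).
t_1gpu = 10.0    # one GPU, one sub-domain
t_32gpu = 10.1   # 32 GPUs, 32 sub-domains of the same size

eff = weak_scaling_efficiency(t_1gpu, t_32gpu)
print(f"weak-scaling efficiency: {eff:.1%}")
```

Under this definition, an efficiency close to 100% means that adding GPUs (and proportionally more work) costs almost no extra wall-clock time, which is what hiding MPI communication behind computation is meant to achieve.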

Tags: cfd, Fluid dynamics, Fortran, MPI, nVidia, OpenACC, Tesla C2075, Tesla K80, Tesla M2050, Thesis

June 21, 2017 by hgpu
