high performance computing on graphics processing units: hgpu.org

OpenACC offloading of the MFC compressible multiphase flow solver on AMD and NVIDIA GPUs

Benjamin Wilfong, Anand Radhakrishnan, Henry A. Le Berre, Steve Abbott, Reuben D. Budiardja, Spencer H. Bryngelson

Computational Science and Eng., Georgia Institute of Technology, Atlanta, Georgia

arXiv:2409.10729 [physics.flu-dyn], (16 Sep 2024)

DOI:10.48550/arXiv.2409.10729

@misc{wilfong2024openaccoffloadingmfccompressible,
  title={OpenACC offloading of the MFC compressible multiphase flow solver on AMD and NVIDIA GPUs},
  author={Benjamin Wilfong and Anand Radhakrishnan and Henry A. Le Berre and Steve Abbott and Reuben D. Budiardja and Spencer H. Bryngelson},
  year={2024},
  eprint={2409.10729},
  archivePrefix={arXiv},
  primaryClass={physics.flu-dyn},
  url={https://arxiv.org/abs/2409.10729}
}

Source code package: MFC: Exascale simulation of multiphase/physics fluid dynamics

GPUs are the heart of the latest generations of supercomputers. We efficiently accelerate a compressible multiphase flow solver via OpenACC on NVIDIA and AMD Instinct GPUs. Optimization is accomplished by specifying the directive clauses ‘gang vector’ and ‘collapse’. Further speedups of six and ten times are achieved by packing user-defined types into coalesced multidimensional arrays and manual inlining via metaprogramming. Additional optimizations yield seven-times speedup in array packing and thirty-times speedup of select kernels on Frontier. Weak scaling efficiencies of 97% and 95% are observed when scaling to 50% of Summit and 95% of Frontier. Strong scaling efficiencies of 84% and 81% are observed when increasing the device count by a factor of 8 and 16 on V100 and MI250X hardware. The strong scaling efficiency of AMD’s MI250X increases to 92% when increasing the device count by a factor of 16 when GPU-aware MPI is used for communication.
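The scaling figures above can be turned into implied speedups with a little arithmetic. The sketch below assumes the common definitions (strong scaling efficiency = speedup / device-count factor; weak scaling efficiency = base wall time / scaled wall time), which the abstract does not state explicitly, and it reads the paired numbers in the order they are listed. The function names are illustrative and not part of MFC.

```python
def strong_scaling_speedup(efficiency: float, device_factor: int) -> float:
    """Speedup implied by a strong scaling efficiency.

    Assumes efficiency = speedup / device_factor (a common definition,
    not confirmed by the abstract).
    """
    return efficiency * device_factor


def weak_scaling_time_ratio(efficiency: float) -> float:
    """Wall-time growth implied by a weak scaling efficiency.

    Assumes efficiency = T_base / T_scaled (again an assumed definition).
    """
    return 1.0 / efficiency


# Values quoted in the abstract, paired in the order they are listed there.
REPORTED_STRONG = [
    ("V100", 0.84, 8),
    ("MI250X", 0.81, 16),
    ("MI250X with GPU-aware MPI", 0.92, 16),
]
REPORTED_WEAK = [
    ("50% of Summit", 0.97),
    ("95% of Frontier", 0.95),
]

for label, eff, factor in REPORTED_STRONG:
    speedup = strong_scaling_speedup(eff, factor)
    print(f"{label}: about {speedup:.2f}x faster on {factor}x the devices")

for label, eff in REPORTED_WEAK:
    ratio = weak_scaling_time_ratio(eff)
    print(f"{label}: wall time grows by about {(ratio - 1) * 100:.1f}%")
```

Under these assumptions, moving from 81% to 92% strong scaling efficiency on MI250X corresponds to the implied speedup at 16 times the devices rising from roughly 13 times to roughly 14.7 times.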

Tags: AMD Radeon Instinct MI250X, ATI, cfd, Fluid dynamics, MPI, nVidia, OpenACC, Package, Tesla V100

September 29, 2024 by hgpu
