high performance computing on graphics processing units: hgpu.org


GPU Offloading in ExaHyPE Through C++ Standard Algorithms

Uzmar Gomez, Gonzalo Brito Gadeschi, Tobias Weinzierl

Department of Computer Science, Durham University, United Kingdom

arXiv:2302.09005 [cs.MS], (17 Feb 2023)

DOI:10.48550/arXiv.2302.09005


The ISO C++17 standard introduces parallel algorithms, a parallel programming model promising portability across a wide variety of parallel hardware including multi-core CPUs, GPUs, and FPGAs. Since 2019, the NVIDIA HPC SDK compiler suite has supported this programming model for multi-core CPUs and GPUs. ExaHyPE is a solver engine for hyperbolic partial differential equations for complex wave phenomena. It supports multiple numerical methods including Finite Volumes and ADER-DG, and employs adaptive mesh refinement with dynamic load balancing via space-filling curves as well as task-based parallelism and offloading to GPUs. This study ports ExaHyPE's tasks over blocks of Finite Volumes to the ISO C++ parallel algorithms programming model, and compares its performance and usability against an OpenMP implementation that offloads via OpenMP target directives. It shows that ISO C++ is a feasible programming model for non-trivial applications like our task-based AMR code. The realisation is free of vendor-specific or non-C++ extensions. However, it is slower than its OpenMP counterpart.
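To make the programming model concrete, the following is a minimal sketch of a Finite Volume update written with a C++17 parallel algorithm. It is not taken from ExaHyPE: the function name `upwindStep`, the 1D periodic patch, and the first-order upwind scheme for linear advection are all assumptions chosen for illustration. The loop over cells is expressed as `std::for_each` with an execution policy, and the lambda captures raw pointers by value, a pattern commonly used when such code is meant to be offloaded.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <execution>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical example, not ExaHyPE code: one explicit first-order upwind
// Finite Volume step for u_t + a u_x = 0 (a > 0) on a periodic 1D patch.
template <class Policy>
std::vector<double> upwindStep(Policy&& policy,
                               const std::vector<double>& u,
                               double a, double dt, double dx) {
  assert(dx > 0.0 && "cell width must be positive");

  const std::size_t n = u.size();
  std::vector<double> uNew(n);

  // C++17 has no index range, so materialise the cell indices explicitly.
  std::vector<std::size_t> cells(n);
  std::iota(cells.begin(), cells.end(), std::size_t{0});

  // Capture plain pointers and scalars by value inside the lambda.
  const double* in = u.data();
  double* out = uNew.data();
  const double c = a * dt / dx;  // Courant number

  std::for_each(std::forward<Policy>(policy), cells.begin(), cells.end(),
                [=](std::size_t i) {
                  // Periodic left neighbour of cell i.
                  const std::size_t left = (i == 0) ? n - 1 : i - 1;
                  out[i] = in[i] - c * (in[i] - in[left]);
                });
  return uNew;
}
```

A call such as `upwindStep(std::execution::par_unseq, u, 1.0, 0.5, 1.0)` requests parallel, vectorised execution. With the NVIDIA HPC SDK, compiling with `nvc++ -stdpar=gpu` is what maps such a call onto the GPU, while GCC typically needs a TBB installation for the parallel policies. Passing `std::execution::seq` runs the same source sequentially, which is convenient for checking results.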

Tags: Algorithms, Computer science, CUDA, Differential equations, Mathematical Software, nVidia, nVidia GeForce RTX 2080 Ti, OpenMP, Partial differential equations, PDEs

February 26, 2023 by hgpu
