high performance computing on graphics processing units: hgpu.org

Daino: A High-level Framework for Parallel and Efficient AMR on GPUs

Mohamed Wahib, Naoya Maruyama, Takayuki Aoki

RIKEN Advanced Institute for Computational Science, Kobe, Japan

SC16: The International Conference for High Performance Computing, Networking, Storage and Analysis 2016, Salt Lake City, UT

Adaptive Mesh Refinement (AMR) methods reduce the computational requirements of a problem by increasing resolution only in areas of interest. In practice, however, efficient AMR implementations are difficult because the mesh hierarchy management must be optimized for the underlying hardware. The architectural complexity of GPUs can make efficient AMR particularly challenging on GPU-accelerated supercomputers. This paper presents a compiler-based high-level framework that automatically transforms serial uniform mesh code annotated by the user into parallel adaptive mesh code optimized for GPU-accelerated supercomputers. We also present a method for empirical analysis of a uniform mesh to project an upper bound on the achievable speedup of a GPU-optimized AMR code. We show experimental results on three production applications. The speedups of code generated by our framework are comparable to those of hand-written AMR code, while achieving good strong and weak scaling up to 1000 GPUs.
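To make the idea in the abstract concrete, the following is a minimal sketch of how refining only areas of interest reduces the cell count, and how a uniform mesh can be inspected to estimate an idealized ceiling on speedup. It is not Daino's algorithm, API, or analysis method: the function names, the gradient-based refinement criterion, the 1D mesh, and the fixed refinement ratio are all assumptions chosen for illustration, and the estimate ignores every mesh-management and communication cost.

```python
# Illustrative sketch only: NOT the method or API from the Daino paper.
# All names, the 1D setting, and the refinement criterion are hypothetical.


def flag_cells(values, threshold):
    """Flag a cell when the jump to a neighboring cell exceeds threshold."""
    flags = [False] * len(values)
    for i in range(len(values) - 1):
        if abs(values[i + 1] - values[i]) > threshold:
            flags[i] = True
            flags[i + 1] = True
    return flags


def amr_cell_count(flags, ratio):
    """Count cells after coarsening.

    Each block of `ratio` fine cells stays fine if any cell in it is
    flagged; otherwise the whole block is merged into one coarse cell.
    """
    count = 0
    for start in range(0, len(flags), ratio):
        block = flags[start:start + ratio]
        count += len(block) if any(block) else 1
    return count


def ideal_speedup_bound(flags, ratio):
    """Uniform fine-cell count divided by the AMR cell count.

    This assumes cost is proportional to cell count and that managing
    the mesh hierarchy is free, so real speedups would be lower.
    """
    return len(flags) / amr_cell_count(flags, ratio)


# A step profile on 64 uniform cells: flat except for one sharp front.
values = [0.0] * 30 + [1.0] * 34
flags = flag_cells(values, threshold=0.5)
cells = amr_cell_count(flags, ratio=4)
bound = ideal_speedup_bound(flags, ratio=4)
print(f"uniform cells: {len(values)}, AMR cells: {cells}, bound: {bound:.2f}x")
```

In this toy setup only the block containing the front stays at fine resolution, so the cell count drops sharply. The gap between such an idealized ratio and measured performance is where hardware-specific mesh management, the difficulty the abstract highlights, comes in.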

Tags: Adaptive Mesh Refinement, CUDA

August 5, 2016 by wahibium

