high performance computing on graphics processing units: hgpu.org

A stencil-based implementation of Parareal in the C++ domain specific embedded language STELLA

Andrea Arteaga, Daniel Ruprecht, Rolf Krause

Center for Climate System and Modeling, ETH Zurich, Universitätstrasse 16, CH-8092 Zurich, Switzerland

arXiv:1409.8563 [cs.DC], (30 Sep 2014)

In view of the rapid rise in the number of cores in modern supercomputers, time-parallel methods that introduce concurrency along the temporal axis are becoming increasingly popular. For the solution of time-dependent partial differential equations, these methods can add another direction of concurrency on top of spatial parallelization. The paper presents an implementation of the time-parallel Parareal method in a C++ domain-specific language for stencil computations (STELLA). STELLA provides both an OpenMP and a CUDA backend for shared-memory parallelization, using the CPU or GPU inside a node for the spatial stencils. Here, we intertwine this node-wise spatial parallelism with the time-parallel Parareal method by adding an MPI-based implementation of Parareal, which allows us to parallelize in time across nodes. The performance of Parareal with both backends is analyzed in terms of speedup, parallel efficiency, and energy-to-solution for an advection-diffusion problem with a time-dependent diffusion coefficient.
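For readers unfamiliar with Parareal, the iteration mentioned in the abstract can be illustrated with a minimal serial sketch. This is not the paper's STELLA or MPI code: the scalar model problem `y' = -nu(t) * y`, the explicit Euler propagators, and all names (`nu`, `propagate`, `coarse`, `fine`, `fine_serial`, `parareal`) are assumptions chosen for illustration. The loop marked as independent is the part that a real implementation would distribute across time slices, for example one slice per MPI rank.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical model problem (not from the paper): y'(t) = -nu(t) * y(t),
// a scalar stand-in for diffusion with a time-dependent coefficient.
double nu(double t) { return 1.0 + 0.5 * std::sin(t); }

// Integrate from t0 to t1 using `steps` explicit Euler steps.
double propagate(double y, double t0, double t1, int steps) {
    assert(steps > 0);
    const double dt = (t1 - t0) / steps;
    for (int i = 0; i < steps; ++i) {
        y += dt * (-nu(t0 + i * dt) * y);
    }
    return y;
}

// Cheap, inaccurate propagator G and expensive, accurate propagator F.
double coarse(double y, double t0, double t1) { return propagate(y, t0, t1, 1); }
double fine(double y, double t0, double t1) { return propagate(y, t0, t1, 100); }

// Reference: run the fine propagator serially over all time slices.
std::vector<double> fine_serial(double y0, double t_end, int slices) {
    assert(slices > 0);
    const double dT = t_end / slices;
    std::vector<double> u(slices + 1, y0);
    for (int n = 0; n < slices; ++n) {
        u[n + 1] = fine(u[n], n * dT, (n + 1) * dT);
    }
    return u;
}

// Parareal: U_{n+1}^{k+1} = G(U_n^{k+1}) + F(U_n^k) - G(U_n^k).
std::vector<double> parareal(double y0, double t_end, int slices, int iterations) {
    assert(slices > 0 && iterations >= 0);
    const double dT = t_end / slices;
    std::vector<double> u(slices + 1, y0);

    // Initial guess: one serial sweep with the coarse propagator.
    for (int n = 0; n < slices; ++n) {
        u[n + 1] = coarse(u[n], n * dT, (n + 1) * dT);
    }

    std::vector<double> f(slices), g_old(slices);
    for (int k = 0; k < iterations; ++k) {
        // Independent across slices: this is the time-parallel part.
        for (int n = 0; n < slices; ++n) {
            f[n] = fine(u[n], n * dT, (n + 1) * dT);
            g_old[n] = coarse(u[n], n * dT, (n + 1) * dT);
        }
        // Serial correction sweep; u[n] already holds the new iterate here.
        for (int n = 0; n < slices; ++n) {
            u[n + 1] = coarse(u[n], n * dT, (n + 1) * dT) + f[n] - g_old[n];
        }
    }
    return u;
}
```

Calling `parareal(1.0, 2.0, 4, 4)` reproduces the values of `fine_serial(1.0, 2.0, 4)` at the slice boundaries up to rounding, because Parareal recovers the serial fine solution once the iteration count equals the number of slices. Any speedup in practice depends on stopping after far fewer iterations than slices.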

Tags: Computer science, CUDA, Differential equations, MPI, nVidia, Partial differential equations, PDEs, Tesla K20

October 3, 2014 by hgpu
