EPSILOD: efficient parallel skeleton for generic iterative stencil computations in distributed GPUs

Manuel de Castro, Inmaculada Santamaria-Valenzuela, Yuri Torres, Arturo Gonzalez-Escribano, Diego R. Llanos

Departamento de Informática, Escuela de Ingeniería Informática, Universidad de Valladolid, Campus Miguel Delibes s/n, 47011 Valladolid, Spain

The Journal of Supercomputing, 286, 2023

DOI:10.1007/s11227-022-05040-y

Source codes

Package:

Controllers: a library written in C99 that simplifies programming applications that exploit heterogeneous computational platforms, including accelerators and/or multi-core CPUs

Iterative stencil computations are widely used in numerical simulations. They present a high degree of parallelism, high locality, and mostly coalesced memory access patterns, which makes GPUs good candidates to speed them up. However, developing stencil programs that can work with huge grids in distributed systems with multiple GPUs is not straightforward: it requires partitioning the grid across nodes and devices, and handling synchronization and data movement across remote GPUs. In this work, we present EPSILOD, a high-productivity parallel programming skeleton for iterative stencil computations on distributed multi-GPU systems, with devices from the same or different vendors. It supports any type of n-dimensional geometric stencil of any order. It uses an abstract specification of the stencil pattern (neighbors and weights) to internally derive the data partition, synchronizations, and communications. Computation is split to better overlap with communications. This paper describes the underlying architecture of EPSILOD and its main components, and presents an experimental evaluation of the approach, including a comparison with another state-of-the-art solution. The experimental results show that EPSILOD is faster than that solution and shows good strong and weak scalability on platforms with both homogeneous and heterogeneous types of GPU.
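To make the idea of an abstract stencil specification concrete, here is a minimal single-process sketch in Python. It is not EPSILOD's API (the Controllers library is written in C99). The names `halo_widths`, `apply_stencil_2d`, and `JACOBI_4PT` are hypothetical, and the choice to leave border cells unchanged is an assumption made only for illustration. The sketch represents a stencil as a mapping from neighbor offsets to weights and derives from it how many boundary cells each dimension depends on, which is the kind of information a skeleton could use to size halo exchanges.

```python
# Illustrative sketch only: hypothetical names, not EPSILOD's actual interface.

def halo_widths(stencil):
    """Return per-dimension (low, high) halo widths implied by the neighbor offsets."""
    ndim = len(next(iter(stencil)))
    widths = []
    for d in range(ndim):
        low = max(0, -min(off[d] for off in stencil))   # reach toward lower indices
        high = max(0, max(off[d] for off in stencil))   # reach toward higher indices
        widths.append((low, high))
    return widths


def apply_stencil_2d(grid, stencil):
    """Apply one stencil iteration to the interior of a 2D grid.

    Assumption: cells whose neighbors would fall outside the grid are left
    unchanged (a fixed-boundary condition chosen for this example).
    """
    (lo_r, hi_r), (lo_c, hi_c) = halo_widths(stencil)
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]  # copy so reads always see the old values
    for i in range(lo_r, rows - hi_r):
        for j in range(lo_c, cols - hi_c):
            out[i][j] = sum(w * grid[i + di][j + dj]
                            for (di, dj), w in stencil.items())
    return out


# A 4-point Jacobi-style stencil: each cell becomes the mean of its 4 neighbors.
JACOBI_4PT = {(-1, 0): 0.25, (1, 0): 0.25, (0, -1): 0.25, (0, 1): 0.25}

if __name__ == "__main__":
    grid = [[1.0, 1.0, 1.0],
            [1.0, 0.0, 1.0],
            [1.0, 1.0, 1.0]]
    print(halo_widths(JACOBI_4PT))
    print(apply_stencil_2d(grid, JACOBI_4PT))
```

In a distributed setting, the same widths would indicate which cells of a local block need data from a neighboring block. Cells farther from the block edge than the halo width depend only on local data, so one common way to overlap computation with communication is to update those cells while the halo exchange is in flight and update the remaining edge cells afterwards. Whether EPSILOD splits the computation exactly this way is not stated in the abstract.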

Tags: AMD FirePro W9100, ATI, Computer science, Heterogeneous systems, Numerical simulation, nVidia, OpenCL, Package, Stencil computation, Tesla V100

February 12, 2023 by hgpu
