high performance computing on graphics processing units: hgpu.org

Performance Analysis of a High-level Abstractions-based Hydrocode on Future Computing Systems

G.R. Mudalige, I.Z. Reguly, M.B. Giles, A.C. Mallinson, W.P. Gaudin, J.A. Herdman

Oxford e-Research Centre, University of Oxford, 7, Keble Road Oxford OX1 3QG

5th International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS14), 2014

In this paper we present research on applying a domain-specific high-level abstractions (HLA) development strategy with the aim of "future-proofing" a key class of high performance computing (HPC) applications that simulate hydrodynamics at AWE plc. We build on an existing high-level abstraction framework, OPS, which is being developed at the University of Oxford for the solution of multi-block structured mesh-based applications. OPS uses an "active library" approach in which a single application code written using the OPS API can be transformed into different highly optimized parallel implementations, which can then be linked against the appropriate parallel library, enabling execution on different back-end hardware platforms. The target application in this work is the CloverLeaf mini-app from Sandia National Laboratories' Mantevo suite of codes, which consists of algorithms of interest from hydrodynamics workloads. Specifically, we present (1) the lessons learnt in re-engineering an industrially representative hydrodynamics application to utilize the OPS high-level framework and subsequent code generation to obtain a range of parallel implementations, and (2) the performance of the auto-generated OPS versions of CloverLeaf compared with that of the hand-coded original CloverLeaf implementations on a range of platforms. Benchmarked systems include Intel multi-core CPUs and NVIDIA GPUs, the Archer (Cray XC30) CPU cluster and the Titan (Cray XK7) GPU cluster, with different parallelizations (OpenMP, OpenACC, CUDA, OpenCL and MPI). Our results show that developing parallel HPC applications using a high-level framework such as OPS is neither more time-consuming nor more difficult than writing a one-off parallel program targeting only a single parallel implementation. However, the OPS strategy pays off with a highly maintainable single application source, through which multiple parallelizations can be realized, without compromising performance portability on a range of parallel systems.
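The single-source idea described in the abstract can be illustrated with a small sketch. This is not the OPS API: every name here (`average_kernel`, `par_loop`, `BACKENDS`, the `"serial"` and `"threads"` back-ends) is hypothetical, and the sketch swaps OPS's source-to-source code generation for simple runtime dispatch. It only shows the separation the abstract relies on: the numerical kernel is written once, and the choice of parallelization is made elsewhere.

```python
from concurrent.futures import ThreadPoolExecutor


def average_kernel(u, i):
    """User kernel, written once: 3-point average at interior point i."""
    return (u[i - 1] + u[i] + u[i + 1]) / 3.0


def run_serial(kernel, u, lo, hi):
    """Hypothetical back-end 1: plain sequential loop over [lo, hi)."""
    return [kernel(u, i) for i in range(lo, hi)]


def run_threads(kernel, u, lo, hi, workers=4):
    """Hypothetical back-end 2: the same iteration space on a thread pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so results line up with indices.
        return list(pool.map(lambda i: kernel(u, i), range(lo, hi)))


# Assumed registry of back-ends; a real framework would generate these.
BACKENDS = {"serial": run_serial, "threads": run_threads}


def par_loop(kernel, u, backend="serial"):
    """Apply the kernel to all interior points; boundary values are copied."""
    interior = BACKENDS[backend](kernel, u, 1, len(u) - 1)
    return [u[0]] + interior + [u[-1]]


if __name__ == "__main__":
    field = [0.0, 3.0, 0.0, 3.0, 0.0]
    for name in BACKENDS:
        print(name, par_loop(average_kernel, field, backend=name))
```

Both back-ends read from the same unmodified input list and write to a new one, so they produce identical results; adding another back-end would not require touching `average_kernel`.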

Tags: Code generation, Computer science, CUDA, GPU cluster, MPI, nVidia, OpenACC, OpenCL, Tesla K20

December 1, 2014 by hgpu
