high performance computing on graphics processing units: hgpu.org

hgpu.org » Applications » Computer science » A portable C++ library for memory and compute abstraction on multi-core CPUs and GPUs

A portable C++ library for memory and compute abstraction on multi-core CPUs and GPUs

Pietro Incardona, Aryaman Gupta, Serhii Yaskovets, Ivo F. Sbalzarini

Faculty of Computer Science, Technische Universität Dresden, Dresden, Germany

Concurrency Computation Practice and Experience, e7870, 2023

DOI:10.1002/cpe.7870

BibTeX

Download (PDF)

View

Source

Source codes

Package:

OpenFPM: A portable C++ library for memory and compute abstraction on multi-core CPUs and GPUs

937

views

We present a C++ library for transparent memory and compute abstraction across CPU and GPU architectures. Our library combines generic data structures like vectors, multi-dimensional arrays, maps, graphs, and sparse grids with basic generic algorithms like arbitrary-dimensional convolutions, copying, merging, sorting, prefix sum, reductions, neighbor search, and filtering. The memory layout of the data structures is adapted at compile time using C++ tuples with optional memory double-mapping between host and device and the capability of using memory managed by external libraries with no data copying. We combine this transparent memory layout with generic thread-parallel algorithms under two alternative common interfaces: a CUDA-like kernel interface and a lambda-function interface. We quantify the memory and compute performance and portability of our implementation using micro-benchmarks, showing that the abstractions introduce negligible performance overhead, and we compare performance against the current state of the art in a real-world scientific application from computational fluid mechanics.

Tags: AMD RX Vega 64, ATI, Computer science, CUDA, nVidia, nVidia A100, nVidia GeForce RTX 3090, OpenACC, OpenCL, OpenMP, Package, Performance, performance portability, SYCL

July 30, 2023 by hgpu

No votes yet.

Please wait...

hpcbench: A set of benchmarking utilities for biomolecular simulation tools

Engineering Supercomputing Platforms for Biomolecular Applications

high performance computing on graphics processing units: hgpu.org

A portable C++ library for memory and compute abstraction on multi-core CPUs and GPUs

Package:

Recent source codes

hpcbench: A set of benchmarking utilities for biomolecular simulation tools

HPCTransCompile: An AI Compiler Generated Dataset for High-Performance CUDA Transpilation and LLM Preliminary Exploration

chemtrain: Training Molecular Dynamics Potentials in JAX

microSYCL: SYCL micro-benchmarks repository

XaaS containers

CASS: Cuda-Amd aSSembly

Cluser of smartphones for edge computing application using TensorFlow

SYCL Container

Efficient Graph Embedding at Scale: Optimizing CPU-GPU-SSD Integration

Can Large Language Models Predict Parallel Code Performance?

Most viewed papers (last 30 days)

A portable C++ library for memory and compute abstraction on multi-core CPUs and GPUs

Package:

Share this:

Recent source codes

Most viewed papers (last 30 days)