high performance computing on graphics processing units: hgpu.org

Posts

Mar, 2

Performance and accuracy of hardware-oriented native-, emulated- and mixed-precision solvers in FEM simulations (Part 2: Double Precision GPUs)

In a previous publication, we examined the fundamental difference between computational precision and result accuracy in the context of the iterative solution of linear systems as they typically arise in the Finite Element discretization of Partial Differential Equations (PDEs) [1]. In particular, we evaluated mixed- and emulated-precision schemes on commodity graphics processors (GPUs), which […]
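Emulated precision of the kind mentioned above is commonly built by storing a value as an unevaluated sum of two single-precision floats. The following is a minimal sketch of that idea, not the authors' implementation: single precision is simulated on the CPU by rounding through `struct`, and the helper names (`f32`, `two_sum_f32`, `ds_add`, `accumulate`) are hypothetical. The addition routine is the simple variant that assumes operands of the same sign.

```python
import struct

def f32(x):
    """Round a Python float (double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def two_sum_f32(a, b):
    """Knuth's error-free addition: s is the rounded sum, e the rounding error."""
    s = f32(a + b)
    bb = f32(s - a)
    e = f32(f32(a - f32(s - bb)) + f32(b - bb))
    return s, e

def ds_add(x, y):
    """Add two (hi, lo) pairs; simple variant, intended for same-sign operands."""
    s, e = two_sum_f32(x[0], y[0])
    e = f32(e + f32(x[1] + y[1]))
    hi = f32(s + e)
    lo = f32(e - f32(hi - s))  # renormalise so that |lo| is small relative to hi
    return hi, lo

def accumulate(value, n):
    """Sum `value` n times in plain single precision and in the two-float format."""
    plain = 0.0
    hi = f32(value)
    term = (hi, f32(value - hi))  # split the double into a (hi, lo) pair
    ds = (0.0, 0.0)
    for _ in range(n):
        plain = f32(plain + hi)
        ds = ds_add(ds, term)
    return plain, ds[0] + ds[1]

if __name__ == "__main__":
    plain, emulated = accumulate(0.1, 10000)
    print("plain single   :", plain)
    print("two-float pair :", emulated)
```

Running the script prints both sums so the accumulated rounding error of plain single precision can be compared with the two-float result.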

CUDA

Mar, 2

Accelerating Double Precision FEM Simulations with GPUs

In visualization and computer graphics it has been shown that the numerical solution of PDE problems can be obtained much faster on graphics processors (GPUs) than on CPUs. However, GPUs are restricted to single precision floating point arithmetic, which is insufficient for most technical scientific computations. Since we do not expect double precision support natively […]

OpenGL

Mar, 2

Integrating GPUs as fast co-processors into the existing parallel FE package FEAST

We report on our experiences with integrating GPUs as fast, parallel floating-point coprocessors into the parallel FE package FEAST. Since a full re-implementation of such a package is not feasible, we identify the smoothing of an outer domain-decomposition multigrid solver as a natural entry-point for a minimally invasive integration of GPUs. We address the issue […]

Mar, 2

Mixed-Precision GPU-Multigrid Solvers with Strong Smoothers

In this chapter, we present efficient fine-grained parallelization techniques for robust multigrid solvers, in particular for numerically strong, inherently sequential smoothing operators. We apply them to sparse ill-conditioned linear systems of equations that arise from grid-based discretization techniques like finite differences, volumes and elements. Our exemplary results demonstrate both the numerical and runtime performance of […]

CUDA

Mar, 2

Hardware-Oriented Multigrid Finite Element Solvers on GPU-Accelerated Clusters

The accurate simulation of real-world phenomena in computational science is often based on an underlying mathematical model comprising a system of partial differential equations (PDEs). Important research fields that we pursue in this setting are computational solid mechanics and computational fluid dynamics (CSM and CFD, see Section 3). Practical applications range from material failure tests, […]

CUDA

Mar, 2

Finite Element Integration on GPUs

We present a novel finite element integration method for low-order elements on GPUs. We achieve more than 100 GFLOP/s for element integration on first-order discretizations of both the Laplacian and Elasticity operators.

CUDA

Mar, 1

GPU Computation Using Mathematica and CUDA webinar

The webinar will provide an overview and use cases for CUDA and OpenCL, as well as a tutorial on how to use CUDA from within Mathematica. Topics: Overview of GPU, CUDA, and OpenCL; Image Processing on the GPU; Programming the GPU Using Mathematica; GPU Programming Workflow within Mathematica.

Mar, 1

UCHPC – UnConventional High Performance Computing for Finite Element Simulations

Processor technology is still dramatically advancing and promises enormous improvements in processing data for the next decade. These improvements are driven by parallelisation and specialisation of resources, and “unconventional hardware” like GPUs or the Cell processor can be seen as forerunners of this development. At the same time, much smaller advances are expected in moving […]

Mar, 1

Performance and accuracy of hardware-oriented native-, emulated- and mixed-precision solvers in FEM simulations

In this survey paper, we compare native double precision solvers with emulated- and mixed-precision solvers of linear systems of equations as they typically arise in finite element discretisations. The emulation utilises two single float numbers to achieve higher precision, while the mixed precision iterative refinement computes residuals and updates the solution vector in double precision […]
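The mixed-precision iterative refinement described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' solver: residuals and solution updates are kept in double precision as the abstract states, while the correction is assumed to come from a few Jacobi sweeps in simulated single precision (the truncated abstract does not say which inner solver is used). The names `to_f32`, `jacobi_f32`, `residual_f64` and `mixed_precision_refinement` are hypothetical, as is the small test system.

```python
import struct

def to_f32(x):
    """Round a Python float (double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def jacobi_f32(A, r, sweeps):
    """Approximately solve A c = r with Jacobi sweeps, rounding each result to single."""
    n = len(r)
    c = [0.0] * n
    for _ in range(sweeps):
        new = []
        for i in range(n):
            s = to_f32(r[i])  # demote the double-precision residual
            for j in range(n):
                if j != i:
                    s = to_f32(s - to_f32(to_f32(A[i][j]) * c[j]))
            new.append(to_f32(s / to_f32(A[i][i])))
        c = new
    return c

def residual_f64(A, x, b):
    """Residual b - A x evaluated in double precision."""
    n = len(b)
    return [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]

def mixed_precision_refinement(A, b, tol=1e-12, max_outer=50, sweeps=20):
    """Outer loop in double precision, inner correction solve in single precision."""
    x = [0.0] * len(b)
    for it in range(max_outer):
        r = residual_f64(A, x, b)
        if max(abs(v) for v in r) <= tol:
            return x, it
        c = jacobi_f32(A, r, sweeps)
        x = [xi + ci for xi, ci in zip(x, c)]  # update accumulated in double
    return x, max_outer

if __name__ == "__main__":
    # Hypothetical diagonally dominant system with exact solution [1, 2, 3].
    A = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
    b = [2.0, 4.0, 10.0]
    x, outer = mixed_precision_refinement(A, b)
    print("solution:", x, "outer iterations:", outer)
```

The design point is that only the cheap outer loop needs double precision; each single-precision inner solve only has to reduce the current residual by a few digits.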

Mar, 1

Using GPUs to Improve Multigrid Solver Performance on a Cluster

This article explores the coupling of coarse- and fine-grained parallelism for Finite Element simulations based on efficient parallel multigrid solvers. The focus lies on both system performance and a minimally invasive integration of hardware acceleration into an existing software package, requiring no changes to application code. Because of their excellent price-performance ratio, we demonstrate […]

OpenGL

Mar, 1

Co-processor acceleration of an unmodified parallel solid mechanics code with FEASTGPU

We have previously presented an approach to include graphics processing units as co-processors in a parallel Finite Element multigrid solver called FEAST. In this paper we show that the acceleration transfers to real applications built on top of FEAST, without any modifications of the application code. The chosen solid mechanics code is well suited to […]

CUDA

Mar, 1

Performance of inverse atomistic scale fracture modeling on GPGPU architectures

The present work has been motivated by the continuous growth of General Purpose Graphic Processor Unit (GPGPU) technologies as well as the necessity of linking usability with multiscale materials processing and design. The inverse problem of determining the phenomenological interparticle Lennard-Jones potential governing the fracture dynamics of a two-dimensional structure under tension is used […]
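For reference, the Lennard-Jones potential named above is usually written in its 12-6 form, V(r) = 4ε[(σ/r)^12 − (σ/r)^6]. The sketch below assumes that standard form (the abstract does not say which variant the authors fit); the function names and the default parameters ε = σ = 1 are hypothetical.

```python
def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """Standard 12-6 Lennard-Jones pair potential V(r)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

def lennard_jones_force(r, epsilon=1.0, sigma=1.0):
    """Radial force F(r) = -dV/dr; positive values are repulsive."""
    sr6 = (sigma / r) ** 6
    return 24.0 * epsilon * (2.0 * sr6 * sr6 - sr6) / r

if __name__ == "__main__":
    r_min = 2.0 ** (1.0 / 6.0)  # location of the potential minimum for sigma = 1
    print("V(r_min) =", lennard_jones(r_min))
    print("F(r_min) =", lennard_jones_force(r_min))
```

An inverse problem of the kind described would adjust ε and σ until simulated behaviour matches observations; that fitting step is not shown here.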



Recent source codes

Efficient GPU Implementation of Multi-Precision Integer Division

ParEval: A Parallel Code Evaluation Benchmark

FlashSparse: Minimizing Computation Redundancy for Fast Sparse Matrix Multiplications on Tensor Cores

exa-AMD: Exascale Accelerated Materials Discovery

WiLLM: An Open Wireless LLM Communication System

Vcc: the Vulkan Clang Compiler

hpcbench: A set of benchmarking utilities for biomolecular simulation tools

HPCTransCompile: An AI Compiler Generated Dataset for High-Performance CUDA Transpilation and LLM Preliminary Exploration

chemtrain: Training Molecular Dynamics Potentials in JAX

microSYCL: SYCL micro-benchmarks repository
