high performance computing on graphics processing units: hgpu.org

Posts

Jul, 10

A hybrid Hermitian general eigenvalue solver

The adoption of hybrid GPU-CPU nodes in traditional supercomputing platforms opens acceleration opportunities for electronic structure calculations in materials science and chemistry applications, where medium-sized Hermitian generalized eigenvalue problems must be solved many times. The small size of the problems limits the scalability on a distributed memory system, hence they can benefit from the […]

CUDA

Jul, 9

Parallelising the Transfer-Matrix Method using Graphics Processors

We study the disorder-induced Anderson localisation of a d-dimensional solid, computing the localisation lengths using the Transfer-Matrix Method (TMM) and aiming to develop an efficient parallel implementation to run on Graphics Processing Units (GPUs). In the TMM, a quasi one-dimensional bar of length L >> M is split into slices of size M^(d-1). The Schrödinger […]

CUDA

Jul, 9

Performance models for CUDA streams on NVIDIA GeForce series

Graphics Processing Units (GPUs) have impressively arisen as general-purpose coprocessors in high performance computing applications since the launch of the Compute Unified Device Architecture (CUDA). However, they present an inherent performance bottleneck in the fact that communication between two separate address spaces (the main memory of the CPU and the memory of the GPU) is […]

CUDA

Jul, 9

Elastically Deformable Models based on the Finite Element Method Accelerated on Graphics Hardware using CUDA

Elastically deformable models have found applications in various areas ranging from mechanical sciences and engineering to computer graphics. The method of Finite Elements has been the tool of choice for solving the underlying PDE, when accuracy and stability of the computations are more important than, e.g., computation time. In this paper we show that the […]

CUDA

Jul, 9

Intensity model with blur effect on GPUs applied to large-scale star simulators

The intensity model with blur effect is widely employed to accurately simulate the imaging process of star simulators used for attitude determination and guiding systems. It places great demands on computing power for realistic domains, and modern Graphics Processing Units (GPUs) have proven to be powerful accelerators for this kind of computationally intensive simulation. This […]

CUDA

Jul, 9

Complete PISO and SIMPLE solvers on Graphics Processing Units

We implemented the pressure-implicit with splitting of operators (PISO) and semi-implicit method for pressure-linked equations (SIMPLE) solvers of the Navier-Stokes equations on Fermi-class graphics processing units (GPUs) using the CUDA technology. We also introduced a new format of sparse matrices optimized for performing elementary CFD operations, like gradient or divergence discretization, on GPUs. We verified […]

CUDA

Jul, 8

Fast GPU Garment Simulation and Collision Detection

This paper describes a technique for garment simulation and collision detection implemented on modern Graphics Processors (GPU). It exploits a mass-spring cloth model with velocity modification approach to overcome the super-elasticity. Our novel algorithms for cloth-body and cloth-cloth collision detection and response are based on image-space interference tests. For collision detection a 3D texture is […]

OpenGL

Jul, 8

Interactive BRDF Estimation for Mixed-Reality Applications

Recent methods in augmented reality allow simulating mutual light interactions between real and virtual objects. These methods are able to embed virtual objects in a more sophisticated way than previous methods. However, their main drawback is that they need a virtual representation of the real scene to be augmented in the form of geometry and […]

Jul, 8

Teaching Parallel Programming Models on a Shallow-Water Code

We present a software package that supports teaching different parallel programming models in a computational science and engineering context. It implements a Finite Volume solver for the shallow water equations, with application to tsunami simulation in mind. The numerical model is kept simple, using patches of Cartesian grids as computational domain, which can be connected […]

CUDA

Jul, 8

Utilizing GPGPU in Computer Emulation

The article deals with the idea of computer emulation using GPGPU technology in order to improve performance. Basic assumptions for using stream processing effectively in computer emulation are discussed, and the structure of an emulator, together with the emulation technique, is proposed. The emulator structure, in this case, is of distributed nature, so […]

OpenCL

Jul, 8

GPU-Optimized Molecular Dynamics Simulations

Protein and RNA biomolecular folding and assembly problems have important applications because misfolding events are associated with diseases like Alzheimer’s and Parkinson’s. However, simulating biologically relevant sized biomolecules on timescales that correspond to biological functions is an extraordinary challenge due to computational bottlenecks that are mainly involved in force calculations. We briefly review the molecular […]

CUDA

Jul, 7

Random Walks for Image Cosegmentation

We recast the Cosegmentation problem using Random Walker (RW) segmentation as the core segmentation algorithm, rather than the traditional MRF approach adopted in the literature so far. Our formulation is similar to previous approaches in the sense that it also permits Cosegmentation constraints (which impose consistency between the extracted objects from >= 2 images) using […]

CUDA

Recent source codes

HPCTransCompile: An AI Compiler Generated Dataset for High-Performance CUDA Transpilation and LLM Preliminary Exploration

chemtrain: Training Molecular Dynamics Potentials in JAX

microSYCL: SYCL micro-benchmarks repository

XaaS containers

SYCL Container

CASS: Cuda-Amd aSSembly

Cluster of smartphones for edge computing application using TensorFlow

CFAL-bench

Efficient Graph Embedding at Scale: Optimizing CPU-GPU-SSD Integration

Can Large Language Models Predict Parallel Code Performance?

Most viewed papers (last 30 days)