high performance computing on graphics processing units: hgpu.org

Posts

Apr, 29

Serpent encryption algorithm implementation on Compute Unified Device Architecture (CUDA)

CUDA is a platform developed by NVIDIA for general-purpose computing on graphics processing units (GPUs) that exploits their parallelism. Serpent encryption has the advantage of a high security margin; its disadvantage is low speed. We present a methodology for the transformation of a CPU-based implementation of the Serpent encryption algorithm (in […]

CUDA

Apr, 29

Finite temperature lattice QCD with GPUs

Graphics Processing Units (GPUs) are being used in many areas of physics, since their performance versus cost is very attractive. GPUs can be programmed with CUDA, NVIDIA's parallel computing architecture. It enables dramatic increases in computing performance by harnessing the power of the GPU. We present a performance comparison between the […]

CUDA

Apr, 28

Accelerating DNA analysis applications on GPU clusters

DNA analysis is an emerging application of high performance bioinformatics. Modern sequencing machinery is able to provide, in a few hours, large input streams of data which need to be matched against exponentially growing databases of known fragments. The ability to recognize these patterns effectively and quickly may allow extending the scale and the reach of […]

Apr, 28

Aeolian Sand Movement and Interacting with Vegetation: A GPU Based Simulation and Visualization Method

Simulation and visualization of aeolian sand movement and its interaction with vegetation are a challenging subject. In this work, we propose a physically and procedurally based modeling and simulation method that can be used to synthesize sandy terrain with vegetation cover. To realize a real-time simulation process, we implemented the method on the programming graphics […]

OpenGL

Apr, 28

GPU acceleration of linear systems for computational electromagnetic simulations

The use of Graphics Processing Units (GPUs) to perform computational electromagnetic simulations has been shown over the past several years to increase calculation speed. By examining the mathematical processes used in various techniques, appropriate algorithms for the GPU can be developed to speed up the simulations. Understanding how to map algorithms appropriately to the […]

Apr, 28

MATLAB graphical interface for GPU based FDTD method

Graphics processing units (GPUs) have recently been used for the implementation of the FDTD method for electromagnetics. These processors, found inside graphics cards, are able to execute numerical calculations at speeds many times faster than those of modern CPUs. Such speed gains allow for many common FDTD simulations to be performed in as little as […]

Apr, 28

GPU Ray Casting with Arbitrary Shaped Proxy

An improved volume rendering method is presented for interactive visualization of volume datasets. The key idea of our technique is to use arbitrarily shaped geometry as the GPU (graphics processing unit) ray-casting proxy instead of bounding-box geometry. We implement this technique on CT datasets. The results show that the improvement of rendering efficiency can be […]

Apr, 28

GPU implementation of the pixel purity index algorithm for hyperspectral image analysis

Hyperspectral imaging is a new technique in remote sensing that generates images with hundreds of spectral bands, at different wavelength channels, for the same area on the surface of the Earth. The price paid for such a wealth of spectral information is the enormous amount of data to be processed. In recent years, several efforts […]

CUDA

Apr, 28

Multi-Pass and Frame Parallel Algorithms of Motion Estimation in H.264/AVC for Generic GPU

In this paper, multi-pass and frame-parallel algorithms are proposed to accelerate various motion estimation (ME) tools in H.264 with the graphics processing unit (GPU). By using the multi-pass method to unroll and rearrange the multiple nested loops, integer-pel ME can be implemented as a two-pass process on the GPU. Moreover, fractional ME needs six passes for […]

Apr, 28

Petascale Direct Numerical Simulation of Blood Flow on 200K Cores and Heterogeneous Architectures

We present a fast, petaflop-scalable algorithm for Stokesian particulate flows. Our goal is the direct simulation of blood, which we model as a mixture of a Stokesian fluid (plasma) and red blood cells (RBCs). Directly simulating blood is a challenging multiscale, multiphysics problem. We report simulations with up to 200 million deformable RBCs. The largest […]

CUDA

Apr, 28

Three-Dimensional Image Warping on Programmable Graphics Hardware

Many image-based rendering systems are based on three-dimensional image warping (3D warping), which transforms pixels in a reference image to a destination view. However, the original 3D warping equation, proposed by McMillan and Bishop, is derived under one special coordinate system, which prevents its direct implementation on programmable graphics hardware. In this paper, we revisit the […]

OpenGL

Apr, 28

An 80-Fold Speedup, 15.0 TFlops Full GPU Acceleration of Non-Hydrostatic Weather Model ASUCA Production Code

Regional weather forecasting demands fast simulation over fine-grained grids, resulting in extremely memory-bottlenecked computation, a difficult problem on conventional supercomputers. Early work on accelerating the mainstream weather code WRF using GPUs with their high memory performance, however, resulted in only minor speedup due to partial GPU porting of the huge code. Our full CUDA porting […]

CUDA

Recent source codes

WiLLM: An Open Wireless LLM Communication System

Vcc: the Vulkan Clang Compiler

hpcbench: A set of benchmarking utilities for biomolecular simulation tools

HPCTransCompile: An AI Compiler Generated Dataset for High-Performance CUDA Transpilation and LLM Preliminary Exploration

chemtrain: Training Molecular Dynamics Potentials in JAX

microSYCL: SYCL micro-benchmarks repository

XaaS containers

CASS: Cuda-Amd aSSembly

Cluster of smartphones for edge computing application using TensorFlow

SYCL Container
