high performance computing on graphics processing units: hgpu.org

Posts

Jul, 28

GPU accelerated Monte Carlo simulations of lattice spin models

We consider Monte Carlo simulations of classical spin models of statistical mechanics using the massively parallel architecture provided by graphics processing units (GPUs). We discuss simulations of models with discrete and continuous variables, and using an array of algorithms ranging from single-spin flip Metropolis updates over cluster algorithms to multicanonical and Wang-Landau techniques to judge […]

CUDA

Jul, 28

A group theoretical toolbox for color image operators

In this paper we describe how to use the direct product of the dihedral group D(4) and the symmetric group S(3) to automatically derive low-level image processing filter systems for RGB images. For important classes of stochastic processes it can be shown that the resulting operators lead to a block-diagonalization of the correlation matrix. We […]

Jul, 28

A collision detection algorithm using adaptive particle sensor

We present an adaptive algorithm for collision detection between rigid and non-rigid polygonal objects that improves on the particle sensor-based method. That method handles object deformation more efficiently than bounding volume hierarchy-based methods, which must update their bounding representations frequently. However, a problem of the particle sensor-based method is that a number of particles on each […]

Jul, 28

Iterative layer-based raytracing on CUDA

A raytracer is an application capable of tracing rays from a point into a scene in order to determine the closest sphere intersection along the ray’s direction. Because of their recursive nature, raytracing algorithms are hard to implement on architectures which do not support recursion, such as the NVIDIA CUDA architecture. Even if the […]

CUDA

Jul, 28

Real-time Semi-Global Matching on the CPU

Among the top-performing stereo algorithms on the Middlebury Stereo Database, Semi-Global Matching (SGM) is commonly regarded as the most efficient algorithm. Consequently, real-time implementations of the algorithm for graphics hardware (GPU) and reconfigurable hardware (FPGA) exist. However, the computation time on general purpose PCs is still more than a second. In this paper, a real-time […]

Jul, 28

Towards automatic Digital Surface Model generation using a Graphics Processing Unit

Digital Surface Models (DSMs) are widely used in the earth sciences. They provide information for various geological studies and other applications. There are a number of methods for automatic DSM generation, each of which has its own strengths and weaknesses, and none of which is perfect. Even though there are plenty of algorithms to date which […]

Jul, 28

Navier-Stokes on programmable graphics hardware using SMAC

Modern programmable graphics hardware offers sufficient computing power to suggest the implementation of traditional algorithms on the graphics processor. This paper describes a complete implementation of a standard technique to solve the incompressible Navier-Stokes fluid equations running entirely on the GPU: the SMAC (simplified marker and cell) method. This method is widely used in engineering […]

OpenGL

Jul, 28

View-Dependent Real-Time Rendering of Large Outdoor Scenes

A new real-time point-based rendering method for large outdoor scenes is presented. Based on our interactive subdivide method, polygonal trees and other vegetation were converted to point-based models, and then different levels of detail of the trees and other vegetation were created using hierarchical clustering. Different levels of detail of the terrain were created using diamond tree […]

Jul, 28

A novel hardware acceleration technique for high performance parallel FDTD method

In this paper, we introduce a novel hardware acceleration technique based on a vector unit built into a regular CPU for high performance electromagnetic simulation. We investigate the performance of the parallel FDTD method on Intel and AMD processors accelerated by the Vector Arithmetic Logic Unit (VALU), high performance cluster, and Graphics Processing Unit (GPU). […]

Jul, 28

Simulation of real-time explosion smoke based on Simplex-Noise

Classical smoke simulation methods based on Computational Fluid Dynamics (CFD) are too expensive to be used in real-time systems. In this paper, a method based on Simplex noise is proposed to achieve highly realistic effects of explosion smoke. The colors of the smoke particles are disturbed by 3D Simplex noise and faded according to time, […]

Jul, 27

Image representation by blob and its application in CT reconstruction from few projections

The localized radially symmetric function, or blob, is an ideal alternative to the pixel basis for X-ray computed tomography (CT) image reconstruction. In this paper we develop image representation models using blobs, and propose reconstruction methods for few-projection data. The image is represented in a shift-invariant space generated by a Gaussian blob or […]

CUDA

Jul, 27

Computation of electron quantum transport in graphene nanoribbons using GPU

The performance potential for simulating quantum electron transport on graphics processing units (GPUs) is studied. Using graphene ribbons of realistic sizes as an example, it is shown that GPUs provide significant speed-ups in comparison to central processing units as the transverse dimension of the ribbon grows. The recursive Green’s function algorithm is employed and implementation […]

CUDA

Recent source codes

HPCTransCompile: An AI Compiler Generated Dataset for High-Performance CUDA Transpilation and LLM Preliminary Exploration

chemtrain: Training Molecular Dynamics Potentials in JAX

microSYCL: SYCL micro-benchmarks repository

XaaS containers

SYCL Container

CASS: Cuda-Amd aSSembly

Cluster of smartphones for edge computing application using TensorFlow

CFAL-bench

Efficient Graph Embedding at Scale: Optimizing CPU-GPU-SSD Integration

Can Large Language Models Predict Parallel Code Performance?
