high performance computing on graphics processing units: hgpu.org

Posts

Jan, 3

High Speed Articulated Object Tracking Using GPUs: A Particle Filter Approach

This paper presents a novel application of the GPU processing power to a very computationally demanding articulated human body tracking problem in a view-based approach. This work includes some optimizations at the algorithmic level as well as some tricks at the implementation level using OpenGL and shader programming. An underlying particle filter framework is combined […]

OpenGL

Jan, 3

FPGA-based acceleration of CHARMM-potential minimization

Energy minimization is an important step in molecular modeling, with applications in molecular docking and in mapping binding sites. Minimization involves repeated evaluation of various bonded and non-bonded energies of a protein complex. It is a computationally expensive process, with runtimes typically being many hours on a desktop system. In the current article, we present […]

Jan, 3

Wave field synthesis for 3D audio: architectural prospectives

In this paper, we compare the architectural perspectives of the Wave Field Synthesis (WFS) 3D-audio algorithm mapped on three different platforms: a General Purpose Processor (GPP), a Graphics Processor Unit (GPU) and a Field Programmable Gate Array (FPGA). Previous related work reveals that, up to now, WFS sound systems are based on standard PCs. However, […]

CUDA

Jan, 3

Utilizing jump flooding in image-based soft shadows

This paper studies the usage of the GPU as a collection of groups of related processing units, where each group communicates in some way to complete a computation efficiently and effectively. In particular, we use the GPU to perform jump flooding to pass information among groups of processing units in the design of two simple […]
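
For readers unfamiliar with jump flooding, here is a minimal CPU sketch of the general technique, not the paper's shadow algorithms. It assumes a small 2D grid and a hypothetical helper named `jump_flood`; each cell repeatedly looks at neighbours a shrinking step away and keeps the closest seed it has seen. On a GPU, each cell update would run in its own thread.

```python
def jump_flood(width, height, seeds):
    """Approximate nearest-seed map via jump flooding (illustrative sketch).

    seeds is a list of (x, y) cells; the result satisfies
    grid[y][x] == index of a nearby (usually the nearest) seed.
    """
    # Each cell stores the index of the best seed seen so far, or None.
    grid = [[None] * width for _ in range(height)]
    for i, (sx, sy) in enumerate(seeds):
        grid[sy][sx] = i

    def dist2(x, y, i):
        sx, sy = seeds[i]
        return (x - sx) ** 2 + (y - sy) ** 2

    # Start at half the smallest power of two covering the grid, then halve.
    step = 1
    while step < max(width, height):
        step *= 2
    step //= 2

    while step >= 1:
        # Double buffering: read the old grid, write a new one, as a GPU pass would.
        new_grid = [row[:] for row in grid]
        for y in range(height):
            for x in range(width):
                best = grid[y][x]
                for dy in (-step, 0, step):
                    for dx in (-step, 0, step):
                        nx, ny = x + dx, y + dy
                        if not (0 <= nx < width and 0 <= ny < height):
                            continue
                        cand = grid[ny][nx]
                        if cand is None:
                            continue
                        if best is None or dist2(x, y, cand) < dist2(x, y, best):
                            best = cand
                new_grid[y][x] = best
        grid = new_grid
        step //= 2
    return grid
```

The number of passes grows with the logarithm of the grid size rather than with the distance information has to travel, which is what makes the pattern attractive on parallel hardware. The result is an approximation: in rare configurations a cell can end up with a seed that is not the true nearest one.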

Jan, 3

Achieving O(1) IP lookup on GPU-based software routers

IP address lookup is a challenging problem due to increasing routing table sizes and higher line rates. This paper investigates a new way to build an efficient IP lookup scheme using graphics processor units (GPUs). Our contribution here is to design a basic architecture for a high-performance IP lookup engine with a GPU, and to develop efficient […]

CUDA

Jan, 3

SBArt4 – Breeding abstract animations in realtime

SBART was developed in the early 1990s as one of the derivatives of Artificial Evolution by Karl Sims. It can create a movie from a bred image through post-processing. Advances in graphics processing units (GPUs) in recent years have made computation fast enough to breed animations in real time […]

OpenGL

Jan, 3

Cooperative Multitasking for GPU-Accelerated Grid Systems

Exploiting the graphics processing unit (GPU) is useful for obtaining higher performance with fewer host machines in grid systems. One problem in GPU-accelerated grid systems is the lack of efficient multitasking mechanisms. In this paper, we propose a cooperative multitasking method capable of simultaneous execution of a graphics application and a CUDA-based […]

CUDA

Jan, 2

An Accelerated 3D Navier-Stokes Solver for Flows in Turbomachines

A new three-dimensional Navier-Stokes solver for flows in turbomachines has been developed. The new solver is based on the latest version of the Denton codes but has been implemented to run on graphics processing units (GPUs) instead of the traditional central processing unit. The change in processor enables an order-of-magnitude reduction in run-time due to […]

CUDA

Jan, 2

Generation of Random Numbers on Graphics Processors: Forced Indentation In Silico of the Bacteriophage HK97

The use of graphics processing units (GPUs) in simulation applications offers a significant speed gain as compared to computations on central processing units (CPUs). Many simulation methods require a large number of independent random variables generated at each step. We present two approaches for implementation of random number generators (RNGs) on a GPU. In the […]

CUDA

Jan, 2

Multiresolution MIP Rendering of Large Volumetric Data Accelerated on Graphics Hardware

This paper is concerned with a multiresolution representation for maximum intensity projection (MIP) volume rendering based on morphological pyramids which allows progressive refinement. We consider two algorithms for progressive rendering from the morphological pyramid: one which projects detail coefficients level by level, and a second one, called streaming MIP, which resorts the detail coefficients of […]

OpenGL
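
As a rough illustration of why a maximum-based pyramid suits MIP rendering, the sketch below computes a maximum intensity projection along the z axis and shows that it commutes with coarsening by block maxima. The function names are hypothetical, a plain 2x2x2 maximum stands in for the paper's morphological pyramid, and the detail coefficients used for progressive refinement are not modelled.

```python
def mip_along_z(volume):
    """Maximum intensity projection: image[y][x] = max over z of volume[z][y][x]."""
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    return [[max(volume[z][y][x] for z in range(depth))
             for x in range(width)]
            for y in range(height)]


def max_pool_volume(volume):
    """Coarsen a volume by taking the maximum of each 2x2x2 block (even sizes assumed)."""
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    return [[[max(volume[2 * z + dz][2 * y + dy][2 * x + dx]
                  for dz in (0, 1) for dy in (0, 1) for dx in (0, 1))
              for x in range(width // 2)]
             for y in range(height // 2)]
            for z in range(depth // 2)]


def max_pool_image(image):
    """Coarsen an image by taking the maximum of each 2x2 block (even sizes assumed)."""
    height, width = len(image), len(image[0])
    return [[max(image[2 * y + dy][2 * x + dx] for dy in (0, 1) for dx in (0, 1))
             for x in range(width // 2)]
            for y in range(height // 2)]
```

Because a maximum of maxima is still a maximum, projecting the coarsened volume gives the same image as coarsening the full-resolution projection. A renderer can therefore show a cheap low-resolution MIP first and refine it later without the coarse image ever being inconsistent with the final one.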

Jan, 2

Towards chip-on-chip neuroscience: fast mining of neuronal spike streams using graphics hardware

Computational neuroscience is being revolutionized with the advent of multi-electrode arrays that provide real-time, dynamic perspectives into brain function. Mining neuronal spike streams from these chips is critical to understand the firing patterns of neurons and gain insight into the underlying cellular activity. To address this need, we present a solution that uses a massively […]

CUDA

Jan, 2

Accelerating Euler Equations Numerical Solver on Graphics Processing Units

Finite volume numerical methods have been widely studied, implemented and parallelized on multiprocessor systems or on clusters. Modern graphics processing units (GPUs) provide architectures and new programming models that make it possible to harness their large processing power and to design computational fluid dynamics simulations offering both high performance and low cost. We report on solving the […]

CUDA


