high performance computing on graphics processing units: hgpu.org

Posts

Dec, 15

Acceleration of cardiac tissue simulation with graphic processing units

In this technical note we show the promise of using graphic processing units (GPUs) to accelerate simulations of electrical wave propagation in cardiac tissue, one of the more demanding computational problems in cardiology. We have found that the computational speed of two-dimensional (2D) tissue simulations with a single commercially available GPU is about 30 times […]

CUDA

Dec, 15

Optimizing Monte Carlo radiosity on graphics hardware

The radiosity method is usually employed for the rendering of highly realistic synthetic images. In this paper we present an implementation of the Monte Carlo radiosity algorithm on the GPU using CUDA. Our proposal is based on the partition of the scene into sub-scenes to be processed in parallel to exploit the graphics card structure. […]

CUDA

Dec, 15

Scalable and highly parallel implementation of Smith-Waterman on graphics processing unit using CUDA

Program development environments have enabled graphics processing units (GPUs) to become an attractive high performance computing platform for the scientific community. A commonly posed problem in computational biology is protein database searching for functional similarities. The most accurate algorithm for sequence alignments is Smith-Waterman (SW). However, due to its computational complexity and rapidly increasing database […]

CUDA

Dec, 15

OpenMP to GPGPU: a compiler framework for automatic translation and optimization

GPGPUs have recently emerged as powerful vehicles for general-purpose high-performance computing. Although a new Compute Unified Device Architecture (CUDA) programming model from NVIDIA offers improved programmability for general computing, programming GPGPUs is still complex and error-prone. This paper presents a compiler framework for automatic source-to-source translation of standard OpenMP applications into CUDA-based GPGPU applications. The […]

CUDA

Dec, 15

Compiler support for general-purpose computation on GPUs

In recent years, the GPU (graphics processing unit) has evolved into an extremely powerful and flexible processor and now represents an attractive platform for general-purpose computation. Moreover, changes to the design and programmability of GPUs provide the opportunity to perform general-purpose computation on a GPU (GPGPU). Even though many programming languages, software tools, and […]

OpenGL

Dec, 15

A Parallel Mediated Reality Platform

Realtime image processing provides a general framework for robust mediated reality problems. This paper presents a realtime mediated reality system that is built upon realtime image processing algorithms. It has been shown that the graphics processing unit (GPU) is capable of efficiently performing image processing tasks. The system presented uses a parallel GPU architecture for […]

OpenGL

Dec, 15

A scalable GPU-based approach to shading and shadowing for photorealistic real-time augmented reality

Visually realistic Augmented Reality (AR) entails addressing several difficult problems. The most difficult problem is that of rendering the virtual objects with illumination which is consistent with the illumination of the real scene. The paper describes a complete AR rendering system centered around the use of High Dynamic Range environment maps for representing the real […]

Dec, 15

Accelerating a three-dimensional finite-difference wave propagation code using GPU graphics cards

We accelerate a 3-D finite-difference in the time domain wave propagation code by a factor between about 20 and 60 compared to a serial implementation using graphics processing unit computing on NVIDIA graphics cards with the CUDA programming language. We describe the implementation of the code in CUDA to simulate the propagation of seismic waves […]

CUDA

Dec, 15

Distributed GPU Volume Rendering of ASKAP Spectral Data Cubes

The Australian SKA Pathfinder (ASKAP) will be producing 2.2 terabyte HI spectral-line cubes for each 8 hours of observation by 2013. Global views of spectral data cubes are vital for the detection of instrumentation errors, the identification of data artefacts and noise characteristics, and the discovery of strange phenomena, unexpected relations, or unknown patterns. We […]

Dec, 14

gProximity: Hierarchical GPU-based Operations for Collision and Distance Queries

We present novel parallel algorithms for collision detection and separation distance computation for rigid and deformable models that exploit the computational capabilities of many-core GPUs. Our approach uses thread and data parallelism to perform fast hierarchy construction, updating, and traversal using tight-fitting bounding volumes such as oriented bounding boxes (OBB) and rectangular swept spheres (RSS). […]

CUDA

Dec, 14

Fast Ray Sorting and Breadth-First Packet Traversal for GPU Ray Tracing

We present a novel approach to ray tracing execution on commodity graphics hardware using CUDA. We decompose a standard ray tracing algorithm into several data-parallel stages that are mapped efficiently to the massively parallel architecture of modern GPUs. These stages include: ray sorting into coherent packets, creation of frustums for packets, breadth-first frustum traversal through […]

CUDA

Dec, 14

GPU-Based Spherical Light Field Rendering with Per-Fragment Depth Correction

Image-based rendering techniques are a powerful alternative to traditional polygon-based computer graphics. This paper presents a novel light field rendering technique which performs per-pixel depth correction of rays for high-quality reconstruction. Our technique stores combined RGB and depth values in a parabolic 2D texture for every light field sample acquired at discrete positions on a […]

* * *

Recent source codes

UniCoder: Unified Visual-to-Code Generation via Symbolic Rewards and Reference-Guided Code Optimization

CuFuzz: An API-Knowledge-Graph Coverage-Driven Fuzzing Framework for CUDA Libraries

AutoPass: Evidence-Guided LLM Agents for Compiler Performance Tuning

Probe-and-Refine Tuning of Repository Guidance for AI Coding Agents

CUDAnalyst (CUDA + Analyst)

CodegenBench

KernelBenchX: A Comprehensive Benchmark for Evaluating LLM-Generated GPU Kernels

CUDA Kernel Fusion Benchmarks

IntelliKit: Agent-first tooling for AMD hardware

DITRON: Distributed Compiler based on Triton for Parallel Systems
