Posts
Feb, 21
Acceleration of Binomial Options Pricing via Parallelizing along time-axis on a GPU
Since the introduction of organized trading of options for commodities and equities, computing fair prices for options has been an important problem in financial engineering. A variety of numerical methods, including Monte Carlo methods, binomial trees, and numerical solution of stochastic differential equations, are used to compute fair prices. Traders and brokerage firms constantly strive […]
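The excerpt does not show the paper's time-axis parallelization, but the sequential baseline it targets is the standard binomial tree. A minimal Cox-Ross-Rubinstein sketch for a European call (illustrative only; parameter names are assumptions):

```python
import math

def binomial_call_price(s0, strike, rate, sigma, t, steps):
    """Price a European call with the Cox-Ross-Rubinstein binomial tree.

    Sequential baseline: each backward-induction level depends on the
    one after it, which is the dependence a time-axis parallelization
    must break.
    """
    dt = t / steps
    u = math.exp(sigma * math.sqrt(dt))      # up factor
    d = 1.0 / u                              # down factor
    p = (math.exp(rate * dt) - d) / (u - d)  # risk-neutral up probability
    disc = math.exp(-rate * dt)

    # Payoffs at expiry, one per terminal node (j = number of up moves).
    values = [max(s0 * u**j * d**(steps - j) - strike, 0.0)
              for j in range(steps + 1)]

    # Backward induction: discount expected value, one time level at a time.
    for _ in range(steps):
        values = [disc * (p * values[j + 1] + (1 - p) * values[j])
                  for j in range(len(values) - 1)]
    return values[0]
```

Within one time level the nodes are independent (easy data parallelism), but the levels themselves form a serial chain; parallelizing along that time axis is the harder problem the paper addresses.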
Feb, 21
A massively parallel framework using P systems and GPUs
Since the CUDA programming model opened GPUs to general-purpose computation, developers have been able to harness the power of GPUs (Graphics Processing Units) across many computational domains. Among these domains, P systems or membrane systems provide a high-level computational modeling framework that allows, in theory, polynomial-time solutions to NP-complete problems by […]
Feb, 21
GPU Acceleration of Equations Assembly in Finite Elements Method – Preliminary Results
The finite element method (FEM) is widely used for the numerical solution of partial differential equations. Two computationally expensive tasks have to be performed in FEM – equations assembly and solution of the system of equations. We present a mapping of the equations assembly problem for the St. Venant-Kirchhoff material to the GPU computation model and show results of its […]
Feb, 21
Data parallel loop statement extension to CUDA: GpuC
In recent years, Graphics Processing Units (GPUs) have emerged as powerful accelerators for general-purpose computations. GPUs ship as graphics accelerators alongside virtually every modern desktop and laptop host CPU. They have over a hundred cores and offer abundant parallelism. Initially, they were used only for graphics applications such as image processing and video games. […]
Feb, 21
APEnet+: high bandwidth 3D torus direct network for petaflops scale commodity clusters
We describe herein the APElink+ board, a PCIe interconnect adapter featuring the latest advances in wire speed and interface technology plus hardware support for an RDMA programming model and experimental acceleration of GPU networking; this design allows us to build a low-latency, high-bandwidth PC cluster, the APEnet+ network, the new generation of our […]
Feb, 20
Final Project Implementing Extremely Randomized Trees in CUDA
In this paper, we present an implementation of extremely randomized trees (ERT), a supervised machine learning algorithm utilizing decision tree ensembles, in CUDA, NVIDIA’s GPU parallel programming extensions for C/C++. We describe the CUDA programming model and NVIDIA GPU architectures and explain the design tradeoffs that we made to exploit various forms of parallelism available […]
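The excerpt does not include the authors' CUDA code, but the distinguishing rule of extremely randomized trees is simple enough to sketch on the CPU: instead of exhaustively searching thresholds as classic CART does, each candidate split draws its threshold uniformly at random. The following is a minimal illustrative sketch, not the paper's implementation; `ert_random_split` and its scoring are assumptions for illustration.

```python
import random

def ert_random_split(samples, labels, n_candidates=5, rng=random):
    """Pick a split the Extra-Trees way: draw one uniformly random
    threshold per candidate feature and keep the candidate with the
    lowest weighted Gini impurity (no exhaustive threshold search)."""
    n_features = len(samples[0])

    def gini(idx):
        if not idx:
            return 0.0
        counts = {}
        for i in idx:
            counts[labels[i]] = counts.get(labels[i], 0) + 1
        total = len(idx)
        return 1.0 - sum((c / total) ** 2 for c in counts.values())

    best = None
    for _ in range(n_candidates):
        f = rng.randrange(n_features)
        lo = min(s[f] for s in samples)
        hi = max(s[f] for s in samples)
        if lo == hi:
            continue  # constant feature, no usable split
        thr = rng.uniform(lo, hi)
        left = [i for i, s in enumerate(samples) if s[f] < thr]
        right = [i for i, s in enumerate(samples) if s[f] >= thr]
        n = len(samples)
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if best is None or score < best[0]:
            best = (score, f, thr)
    return best  # (weighted impurity, feature index, threshold), or None
```

Because every candidate split is scored independently, an ensemble of such trees exposes parallelism at the tree, node, and candidate level, which is what makes the algorithm a natural fit for a GPU.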
Feb, 20
Architecting graphics processors for non-graphics compute acceleration
This paper discusses the emergence of graphics processing units (GPUs) that contain architecture features for accelerating non-graphics (or GPGPU) applications. It provides an introduction for those interested in undertaking research at the intersection of manycore computing and GPU architecture. First, the motivation for using GPUs for non-graphics processing rather than developing specialized hardware is outlined. […]
Feb, 20
Design Space Exploration for GPU-Based Architecture
Recent advances in Graphics Processing Units (GPUs) provide opportunities to exploit GPUs for non-graphics applications. Scientific computation is inherently parallel, making it a good candidate for utilizing the computing power of GPUs. This report investigates QR factorization, an important building block of scientific computation. We analyze different mapping methods of QR factorization on […]
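For readers unfamiliar with the building block being mapped: a QR factorization writes a matrix A as the product of an orthonormal Q and an upper-triangular R. As a reference point (not the report's GPU mapping), here is a plain modified Gram-Schmidt sketch using only Python lists:

```python
def qr_mgs(a):
    """QR factorization by modified Gram-Schmidt (plain lists, no NumPy).

    The per-column inner loops (normalization and projection updates)
    are elementwise-independent, which is one source of parallelism a
    GPU mapping can exploit.
    """
    m, n = len(a), len(a[0])
    q = [[a[i][j] for j in range(n)] for i in range(m)]  # working copy of A
    r = [[0.0] * n for _ in range(n)]
    for k in range(n):
        # r[k][k] is the norm of (the updated) column k; normalize it.
        r[k][k] = sum(q[i][k] ** 2 for i in range(m)) ** 0.5
        for i in range(m):
            q[i][k] /= r[k][k]
        # Remove column k's component from every later column.
        for j in range(k + 1, n):
            r[k][j] = sum(q[i][k] * q[i][j] for i in range(m))
            for i in range(m):
                q[i][j] -= r[k][j] * q[i][k]
    return q, r
```

The sequential dependence between columns, versus the independence within a column, is exactly the kind of tradeoff a design-space exploration of GPU mappings has to weigh.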
Feb, 20
Fast Exact String Matching on the GPU
We present a string-matching program that runs on the GPU. Our program, Cmatch, achieves a speedup of as much as 35x on a recent GPU over the equivalent CPU-bound version. String matching has a long history in computational biology with roots in finding similar proteins and gene sequences in a database of known sequences. The […]
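The excerpt does not show how Cmatch itself works, but the baseline it speeds up is easy to state: exact matching checks the pattern at every alignment of the text. A minimal sketch (hypothetical function name, not Cmatch's algorithm):

```python
def match_positions(text, pattern):
    """Exact string matching by testing every alignment independently.

    Each starting offset is an independent comparison - the
    embarrassingly parallel structure that lets a GPU kernel assign
    one thread per candidate offset.
    """
    m = len(pattern)
    return [i for i in range(len(text) - m + 1)
            if text[i:i + m] == pattern]
```

For biological sequence databases the text is long and the queries are many, so this per-offset independence, multiplied across queries, is where a GPU's throughput pays off.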
Feb, 20
Program Optimization Study on a 128-Core GPU
The newest generations of graphics processing unit (GPU) architecture, such as the NVIDIA GeForce 8-series, feature new interfaces that improve programmability and generality over previous GPU generations. Using NVIDIA’s Compute Unified Device Architecture (CUDA), the GPU is presented to developers as a flexible parallel architecture. This flexibility introduces the opportunity to perform a wide variety […]
Feb, 20
How GPUs Can Improve the Quality of Magnetic Resonance Imaging
In magnetic resonance imaging (MRI), non-Cartesian scan trajectories are advantageous in a wide variety of emerging applications. Advanced reconstruction algorithms that operate directly on non-Cartesian scan data using optimality criteria such as least-squares (LS) can produce significantly better images than conventional algorithms that apply a fast Fourier transform (FFT) after interpolating the scan data onto […]
Feb, 20
MCUDA: An Efficient Implementation of CUDA Kernels on Multi-cores
The CUDA programming model, which is based on an extended ANSI C language and a runtime environment, allows the programmer to explicitly specify data-parallel computation. NVIDIA developed CUDA to open the architecture of their graphics accelerators to more general applications, but did not provide an efficient mapping for executing the programming model on any […]
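MCUDA itself is a source-to-source C translator (with loop fission around synchronization points, which the sketch below ignores), but its core transformation is easy to illustrate: serialize the CUDA thread grid into ordinary nested loops over block and thread indices. A minimal sketch with hypothetical names, assuming a 1D grid and no `__syncthreads`:

```python
def run_kernel_serial(kernel, grid_dim, block_dim, *args):
    """Execute a CUDA-style kernel on one CPU core by iterating over
    every (block, thread) index pair - the thread-loop transformation
    at the heart of MCUDA-style translators (sync handling omitted)."""
    for block in range(grid_dim):
        for thread in range(block_dim):
            kernel(block, thread, block_dim, *args)

def saxpy_kernel(block_idx, thread_idx, block_dim, n, a, x, y, out):
    """Per-thread body: compute one element of out = a*x + y."""
    i = block_idx * block_dim + thread_idx  # global thread index
    if i < n:  # bounds guard, as in the CUDA original
        out[i] = a * x[i] + y[i]
```

Because the kernel body only depends on its computed global index, the serial loop produces the same result as the parallel launch; a multicore mapping can then distribute the outer block loop across CPU threads.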