high performance computing on graphics processing units: hgpu.org

Posts

Oct, 29

SIMD-Based Large-Scale Transient Stability Simulation on the Graphics Processing Unit

This paper presents a single-instruction-multiple-data (SIMD) based implementation of the transient stability simulation on the Graphics Processing Unit (GPU). Two programming models to implement the standard method of the transient stability simulation are proposed and implemented on a single GPU. In the first model the CPU is responsible for part of the simulation, while the […]

Oct, 29

CUSA and CUDE: GPU-accelerated methods for estimating solvent accessible surface area and desolvation

It is well-established that a linear correlation exists between accessible surface areas and experimentally measured solvation energies. Combining this knowledge with an analytic formula for calculation of solvent accessible surfaces, we derive a simple model of desolvation energy as a differentiable function of atomic positions. Additionally, we find that this algorithm is particularly well suited […]

CUDA

Oct, 29

Performance study of interference on GPU and CPU resources with multiple applications

In recent years, the performance and capabilities of Graphics Processing Units (GPUs) have improved drastically, mostly due to the demands of the entertainment market, with consumers and companies alike pushing for improvements in the level of visual fidelity, which is only achieved with high-performing GPU solutions. Besides the entertainment market, there is an ongoing […]

Oct, 29

Current performance gains from utilizing the GPU or the ASIC MDGRAPE-3 within an enhanced Poisson Boltzmann approach

Scientific applications frequently suffer from limited compute performance. In this article, we investigate the suitability of specialized computer chips to overcome this limitation. An enhanced Poisson Boltzmann program is ported to the graphics processing unit and the application-specific integrated circuit MDGRAPE-3, and the resulting execution times are compared to the conventional performance obtained on […]

Oct, 29

GPU-based image manipulation and enhancement techniques for dynamic volumetric medical image visualization

An important part of an image-guided surgical system is the display component, and seamless interactivity is critical to its successful application in a clinical environment. In this paper, we present several novel techniques for 4D medical image manipulation and enhancement that employ a graphics processing unit (GPU) to accelerate image processing. We describe three types […]

Oct, 29

An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness

GPU architectures are increasingly important in the multi-core era due to their high number of parallel processors. Programming thousands of massively parallel threads is a big challenge for software engineers, but understanding the performance bottlenecks of those parallel programs on GPU architectures to improve application performance is even more difficult. Current approaches rely on programmers […]

CUDA

Oct, 29

GPU-based Video Feature Tracking and Matching

This paper describes novel implementations of the KLT feature tracking and SIFT feature extraction algorithms that run on the graphics processing unit (GPU) and are suitable for video analysis in real-time vision systems. While significant acceleration over standard CPU implementations is obtained by exploiting parallelism provided by modern programmable graphics hardware, the CPU is […]

OpenGL

Oct, 28

GPU acceleration of cutoff pair potentials for molecular modeling applications

The advent of systems biology requires the simulation of ever-larger biomolecular systems, demanding a commensurate growth in computational power. This paper examines the use of the NVIDIA Tesla C870 graphics card programmed through the CUDA toolkit to accelerate the calculation of cutoff pair potentials, one of the most prevalent computations required by many different molecular […]

CUDA

Oct, 28

Real-time eye blink detection with GPU-based SIFT tracking

This paper reports on the implementation of a GPU-based, real-time eye blink detector on very low contrast images acquired under near-infrared illumination. This detector is part of a multi-sensor data acquisition and analysis system for driver performance assessment and training. Eye blinks are detected inside regions of interest that are aligned with the subject’s eyes […]

Oct, 28

AUTO-GC: Automatic translation of data mining applications to GPU clusters

Because of the very favorable price to performance ratio of the GPUs, a popular parallel programming configuration today is a cluster of GPUs. However, extracting performance on such a configuration would typically require programming in both MPI and CUDA, thus requiring a high degree of expertise and effort. It is clearly desirable to be able […]

CUDA

Oct, 28

GPU-based streaming architectures for fast cone-beam CT image reconstruction and demons deformable registration

This paper shows how to significantly accelerate cone-beam CT reconstruction and 3D deformable image registration using the stream-processing model. We describe data-parallel designs for the Feldkamp, Davis and Kress (FDK) reconstruction algorithm, and the demons deformable registration algorithm, suitable for use on a commodity graphics processing unit. The streaming versions of these algorithms are implemented […]

Oct, 28

GPU Gems: Programming Techniques, Tips and Tricks for Real-Time Graphics

GPU Gems has won a prestigious Front Line Award from Game Developer Magazine. The Front Line Awards recognize products that enable faster and more efficient game development, advancing the state of the art. FULL COLOR THROUGHOUT! “This collection of articles is particularly impressive for its depth and breadth. The book includes product-oriented case studies, previously unpublished […]

* * *

Recent source codes

SYCL Container

CASS: Cuda-Amd aSSembly

Cluster of smartphones for edge computing application using TensorFlow

CFAL-bench

Efficient Graph Embedding at Scale: Optimizing CPU-GPU-SSD Integration

Can Large Language Models Predict Parallel Code Performance?

PELSI: Power-Efficient Layer-Switched Inference

Ouroboros: Virtualized Queues for dynamic memory management

MSCCL++: A GPU-driven communication stack for scalable AI applications

Benchmark compute shader of Unity against InteropUnityCUDA
