high performance computing on graphics processing units: hgpu.org

Posts

Jan, 13

GPU-PIV

Digital Particle Image Velocimetry (PIV) is an optical technique used to measure the velocity of seeded particles in real flow. A CCD camera captures the flow field twice under exposure to a short duration laser flash. Recorded image pairs are cross-correlated to extract velocity information from these records. Time resolved PIV technology can capture images […]

OpenGL
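
The cross-correlation step mentioned in the GPU-PIV excerpt can be illustrated with a minimal CPU sketch. It is not the paper's GPU implementation: the function names, the tiny binary frames, and the brute-force search over integer shifts are assumptions made for illustration only.

```python
def make_frame(size, particles):
    """Build a size x size frame with unit intensity at each (row, col) particle."""
    frame = [[0.0] * size for _ in range(size)]
    for y, x in particles:
        frame[y][x] = 1.0
    return frame


def cross_correlate_shift(frame_a, frame_b, max_shift):
    """Return the integer (dy, dx) shift that maximizes the cross-correlation."""
    rows, cols = len(frame_a), len(frame_a[0])
    best_score, best_shift = float("-inf"), (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            score = 0.0
            for y in range(rows):
                for x in range(cols):
                    yb, xb = y + dy, x + dx
                    # Pixels shifted outside the frame contribute nothing.
                    if 0 <= yb < rows and 0 <= xb < cols:
                        score += frame_a[y][x] * frame_b[yb][xb]
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift


# Two particles that move by one row and two columns between exposures.
first = make_frame(8, [(2, 2), (4, 5)])
second = make_frame(8, [(3, 4), (5, 7)])
shift = cross_correlate_shift(first, second, max_shift=3)
dt = 0.5  # hypothetical time between laser flashes
velocity = (shift[0] / dt, shift[1] / dt)  # pixels per time unit
```

Each candidate shift is scored independently of the others, which is what makes this kind of search a natural fit for data-parallel hardware.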

Jan, 13

A Scalable and Reconfigurable Shared-Memory Graphics Cluster Architecture

If the computational demands of an interactive graphics rendering application cannot be met by a single commodity Graphics Processing Unit (GPU), multiple graphics accelerators may be utilised on multi-GPU based systems such as SLI [1] or Crossfire [2] or by a cluster of PCs in conjunction with a software infrastructure. Typically these PC cluster solutions […]

OpenGL

Jan, 13

Speeding up Mutual Information Computation Using NVIDIA CUDA Hardware

We present an efficient method for mutual information (MI) computation between images (2D or 3D) for NVIDIA’s “compute unified device architecture” (CUDA) compatible devices. Efficient parallelization of MI is particularly challenging on a “graphics processor unit” (GPU) due to the need for histogram-based calculation of joint and marginal probability mass functions (pmfs) with large number […]

CUDA
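
As a rough illustration of the histogram-based calculation described in the mutual information excerpt, the sketch below builds joint and marginal histograms on the CPU and sums the MI terms. The function name, the bin count, and the flat-list image layout are assumptions; the paper's CUDA parallelization is not reproduced here.

```python
import math
from collections import Counter


def mutual_information(image_a, image_b, bins=4, max_value=256):
    """Mutual information in bits between two equally sized flat intensity lists."""
    if len(image_a) != len(image_b) or not image_a:
        raise ValueError("images must be non-empty and of equal size")
    n = len(image_a)
    width = max_value / bins

    def to_bin(value):
        return min(int(value / width), bins - 1)

    # Joint histogram: how often each pair of bins occurs at the same pixel.
    joint = Counter((to_bin(a), to_bin(b)) for a, b in zip(image_a, image_b))

    # Marginal histograms are row and column sums of the joint histogram.
    marginal_a, marginal_b = Counter(), Counter()
    for (i, j), count in joint.items():
        marginal_a[i] += count
        marginal_b[j] += count

    # MI = sum over bins of p(i,j) * log2(p(i,j) / (p(i) * p(j))).
    mi = 0.0
    for (i, j), count in joint.items():
        mi += (count / n) * math.log2(count * n / (marginal_a[i] * marginal_b[j]))
    return mi


identical = mutual_information([0, 0, 255, 255], [0, 0, 255, 255])
independent = mutual_information([0, 0, 255, 255], [0, 255, 0, 255])
```

On a GPU the histogram update is the difficult part, because many threads may increment the same bin at once; the sequential loop above sidesteps that contention entirely.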

Jan, 13

Adaptive sampling in three dimensions for volume rendering on GPUs

Direct volume rendering of large volumetric data sets on programmable graphics hardware is often limited by the amount of available graphics memory and the bandwidth from main memory to graphics memory. Therefore, several approaches to volume rendering from compact representations of volumetric data have been published that avoid most of the data transfer between main […]

OpenGL

Jan, 13

Low-cost, high-speed computer vision using NVIDIA’s CUDA architecture

In this paper, we introduce real-time image processing techniques using modern programmable Graphics Processing Units (GPUs). GPUs are SIMD (Single Instruction, Multiple Data) devices that are inherently data-parallel. By utilizing NVIDIA’s new GPU programming framework, “Compute Unified Device Architecture” (CUDA), as a computational resource, we realize significant acceleration in image processing algorithm computations. We […]

CUDA

Jan, 13

Efficient fault simulation on many-core processors

Fault simulation is essential in test generation, design for test and reliability assessment of integrated circuits. Reliability analysis and the simulation of self-test structures are particularly computationally expensive, as a large number of patterns have to be evaluated. In this work, we propose to map a fault simulation algorithm based on the parallel-pattern single-fault propagation […]

CUDA
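
The idea behind parallel-pattern single-fault propagation can be sketched with bit-packed words: bit i of every signal holds that signal's value under test pattern i, so one bitwise operation evaluates a gate for all patterns at once, and one stuck-at fault is injected per run. The netlist format, function names, and example circuit below are hypothetical, and the excerpt does not show how the paper maps this onto many-core hardware.

```python
def simulate(netlist, inputs, width, fault=None):
    """Evaluate all packed patterns at once; fault is (signal, stuck_value) or None."""
    mask = (1 << width) - 1
    values = {}

    def inject(name, word):
        # A stuck-at fault forces the signal to 0 or 1 in every pattern.
        if fault is not None and fault[0] == name:
            return mask if fault[1] else 0
        return word

    for name, word in inputs.items():
        values[name] = inject(name, word & mask)
    for name, op, a, b in netlist:  # gates listed in topological order
        if op == "AND":
            word = values[a] & values[b]
        elif op == "OR":
            word = values[a] | values[b]
        elif op == "XOR":
            word = values[a] ^ values[b]
        else:
            raise ValueError("unsupported gate: " + op)
        values[name] = inject(name, word)
    return values


def detecting_patterns(netlist, inputs, width, outputs, fault):
    """Return a word whose set bits mark the patterns that detect the fault."""
    good = simulate(netlist, inputs, width)
    bad = simulate(netlist, inputs, width, fault)
    detected = 0
    for out in outputs:
        detected |= good[out] ^ bad[out]
    return detected


# y = (a AND b) OR c, simulated for all 8 input patterns in one pass.
netlist = [("n1", "AND", "a", "b"), ("y", "OR", "n1", "c")]
inputs = {"a": 0xF0, "b": 0xCC, "c": 0xAA}
stuck_at_0 = detecting_patterns(netlist, inputs, 8, ["y"], ("n1", 0))
stuck_at_1 = detecting_patterns(netlist, inputs, 8, ["y"], ("n1", 1))
```

Here only pattern 6 (a=1, b=1, c=0) exposes the stuck-at-0 fault on `n1`, while patterns 0, 2 and 4 expose the stuck-at-1 fault.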

Jan, 13

GPU-Based 3D Texture Advection for the Visualization of Unsteady Flow Fields

We present an interactive visualization approach for the dense representation of unsteady 3D flow fields. The first part of this approach is a GPU-based 3D texture advection scheme that allows a slice of the 3D visual representation to be updated in a single rendering pass. In the second step, the result of the advection process […]

OpenGL

Jan, 12

K-Means on Commodity GPUs with CUDA

The K-means algorithm is one of the most famous unsupervised clustering algorithms. Many theoretical improvements to the performance of the original algorithm have been put forward, but almost all of them are based on single instruction single data (SISD) architecture processors (CPUs), which partly ignore the inherent parallelism of the algorithm. In this paper, a novel […]

CUDA
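
For reference, the standard K-means iteration that the excerpt alludes to can be written as a short sequential sketch. This is plain Lloyd's algorithm with hypothetical names, not the CUDA method proposed in the paper.

```python
def kmeans(points, centroids, iterations=10):
    """Run Lloyd's algorithm from the given initial centroids and return the result."""
    for _ in range(iterations):
        # Assignment step: each point picks its nearest centroid independently.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centroids[k])),
            )
            clusters[nearest].append(p)

        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for k, members in enumerate(clusters):
            if members:
                dim = len(members[0])
                new_centroids.append(
                    tuple(sum(m[d] for m in members) / len(members) for d in range(dim))
                )
            else:
                new_centroids.append(centroids[k])  # keep an empty cluster in place

        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids


points = [(0, 0), (0, 2), (10, 10), (10, 12)]
result = kmeans(points, [(0, 0), (10, 10)])
```

The assignment step does the same distance computation for every point with no dependence between points, which is the kind of inherent parallelism the excerpt says sequential implementations leave unused.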

Jan, 12

Sketch Based Facial Expression Recognition Using Graphics Hardware

In this paper, a novel system is proposed to recognize facial expressions based on a face sketch, which is produced by programmable graphics hardware, the GPU (Graphics Processing Unit). Firstly, an expression subspace is set up from a corpus of images consisting of seven basic expressions. Secondly, by applying a GPU based edge detection algorithm, the real-time facial expression […]

Jan, 12

Practical logarithmic rasterization for low-error shadow maps

Logarithmic shadow maps can deliver the same quality as competing shadow map algorithms with substantially less storage and bandwidth. We show how current GPU architectures can be modified incrementally to support rendering of logarithmic shadow maps at current GPU fill rates. Specifically, we modify the rasterizer to support rendering to a nonuniform grid with the […]

OpenGL

Jan, 12

Compensated Visual Hull for Defective Segmentation and Occlusion

We propose an advanced visual hull technique to compensate for outliers using the reliabilities of the silhouettes. The proposed method consists of a foreground extraction technique based on the Generalized Gaussian Family model and a compensated shape-from-silhouette algorithm. They are connected by the intra-/inter-silhouette reliabilities to compensate for carving errors from defective segmentation or partial […]

Jan, 12

Automatic Hepatic Vessel Segmentation Using Graphics Hardware

The accurate segmentation of liver vessels is an important prerequisite for creating oncologic surgery planning tools as well as medical visualization applications. In this paper, a fully automatic approach is presented to quickly enhance and extract the vascular system of the liver from CT datasets. Our framework consists of three basic modules: vessel enhancement on […]

* * *

Recent source codes

SYCL Container

CASS: Cuda-Amd aSSembly

Cluster of smartphones for edge computing application using TensorFlow

CFAL-bench

Efficient Graph Embedding at Scale: Optimizing CPU-GPU-SSD Integration

Can Large Language Models Predict Parallel Code Performance?

PELSI: Power-Efficient Layer-Switched Inference

Ouroboros: Virtualized Queues for dynamic memory management

MSCCL++: A GPU-driven communication stack for scalable AI applications

Benchmark compute shader of Unity against InteropUnityCUDA
