high performance computing on graphics processing units: hgpu.org

Posts

Dec, 13

Embracing Heterogeneity: Parallel Programming for Changing Hardware

Computer systems are undergoing significant change: to improve performance and efficiency, architects are exposing more microarchitectural details directly to programmers. Software that exploits specialized accelerators, such as GPUs, and specialized processor features, such as software-controlled memory, exposes limitations in existing compiler and OS infrastructure. In this paper we propose a pragmatic approach, motivated by our […]

Dec, 13

Pseudo-random number generators for Monte Carlo simulations on ATI Graphics Processing Units

Basic uniform pseudo-random number generators are implemented on ATI Graphics Processing Units (GPUs). The performance of the implemented generators (multiplicative linear congruential (GGL), XOR-shift (XOR128), RANECU, RANMAR, RANLUX and Mersenne Twister (MT19937)) on CPU and GPU is discussed. The speed-up obtained over the CPU is a factor of hundreds. The RANLUX generator is […]
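
For readers unfamiliar with the XOR-shift family named in this abstract, here is a minimal CPU-side sketch of Marsaglia's 128-bit xorshift generator. It is not the paper's GPU implementation. The function names `xor128_stream` and `uniform01` are hypothetical, and the seed values are an assumption: they are the defaults commonly quoted from Marsaglia's xorshift paper, not values taken from this post.

```python
MASK32 = 0xFFFFFFFF  # keep every intermediate value within 32 bits


def xor128_stream(x=123456789, y=362436069, z=521288629, w=88675123):
    """Yield 32-bit unsigned integers from a 128-bit xorshift state.

    The default seeds are an assumption (commonly quoted defaults),
    not values taken from the listed paper.
    """
    while True:
        t = (x ^ (x << 11)) & MASK32
        x, y, z = y, z, w  # rotate the state words
        w = (w ^ (w >> 19) ^ t ^ (t >> 8)) & MASK32
        yield w


def uniform01(stream):
    """Map the next 32-bit output to a float in [0, 1)."""
    return next(stream) / 4294967296.0


if __name__ == "__main__":
    gen = xor128_stream()
    print(next(gen))       # first raw 32-bit output
    print(uniform01(gen))  # next output scaled to [0, 1)
```

The state is only four 32-bit words and the update uses shifts and XORs alone, which is one reason generators of this kind are often considered for per-thread use on GPUs.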

Dec, 13

Parallel N-Body Simulation using GPUs

We present a novel parallel implementation of N-body gravitational simulation. Our algorithm uses graphics hardware to accelerate local computation, and is optimized to account for low bandwidth between the CPU and the graphics card, as well as low bandwidth across the network. The number of bodies that can be simulated with our implementation is limited […]

Dec, 13

Prototyping flexible touch screen devices using collocated haptic-graphic elastic-object deformation on the GPU

Rapid advances in flexible display technologies and the benefits that they provide are promising enough to consider them for futuristic mobile devices. Current prototyping methods lack facilities to simulate such flexible touch screen displays and the interaction with them. In this paper, we present a technique that provides product developers a tool to interactively simulate […]

Dec, 13

A generic library for structured real-time computations: GPU implementation applied to retinal and cortical vision processes

Most graphics cards in standard personal computers are now equipped with several pixel pipelines running shader programs. Taking advantage of this technology by transferring parallel computations from the CPU side to the GPU side increases the overall computational power even in non-graphical applications by freeing the main processor from heavy work. A generic library […]

OpenGL

Dec, 13

GPU-Based Ray-Casting of Spherical Functions Applied to High Angular Resolution Diffusion Imaging

Any sufficiently smooth, positive, real-valued function $\psi: S^2 \rightarrow \mathbb{R}^+$ on a sphere $S^2$ can be expanded by a Laplace expansion into a sum of spherical harmonics. Given the Laplace expansion coefficients, we provide a CPU and GPU-based algorithm that renders the radial graph of $\psi$ in a fast and efficient way by ray-casting the […]

Dec, 13

Wavefront raycasting using larger filter kernels for on-the-fly GPU gradient reconstruction

The quality of images generated by volume rendering strongly depends on the accuracy of gradient estimation. However, the most commonly used techniques for on-the-fly gradient reconstruction are still very simple, such as central differences; they generally gather only limited neighbourhood information and thus ultimately produce rather poor quality images. While there are many higher-order reconstruction […]

CUDA
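
The wavefront raycasting abstract above mentions central differences as the most common on-the-fly gradient estimator. As a point of reference only, the following sketch shows that baseline on the CPU. The function name `central_difference_gradient`, the `volume[z][y][x]` nested-list layout, and the `spacing` parameter are all assumptions made for illustration; the paper's larger filter kernels are not reproduced here.

```python
def central_difference_gradient(volume, x, y, z, spacing=1.0):
    """Estimate the gradient at interior voxel (x, y, z).

    `volume` is assumed to be a nested list indexed as volume[z][y][x],
    and (x, y, z) must not lie on the volume boundary.
    """
    step = 2.0 * spacing
    gx = (volume[z][y][x + 1] - volume[z][y][x - 1]) / step
    gy = (volume[z][y + 1][x] - volume[z][y - 1][x]) / step
    gz = (volume[z + 1][y][x] - volume[z - 1][y][x]) / step
    return (gx, gy, gz)


if __name__ == "__main__":
    # A linear test field f = 2x + 3y + 5z has a constant gradient.
    vol = [[[2 * x + 3 * y + 5 * z for x in range(3)]
            for y in range(3)]
           for z in range(3)]
    print(central_difference_gradient(vol, 1, 1, 1))
```

Each component reads only the two axis-aligned neighbours of the voxel, which is the limited neighbourhood information the abstract refers to.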

Dec, 13

Orders-of-magnitude performance increases in GPU-accelerated correlation of images from the International Space Station

We implement image correlation, a fundamental component of many real-time imaging and tracking systems, on a graphics processing unit (GPU) using NVIDIA’s CUDA platform. We use our code to analyze images of liquid-gas phase separation in a model colloid-polymer system, photographed in the absence of gravity aboard the International Space Station (ISS). Our GPU code […]

CUDA

Dec, 13

Development of a GPU-based multithreaded software application to calculate digitally reconstructed radiographs for radiotherapy

To provide faster calculation of digitally reconstructed radiographs (DRRs) in patient-positioning verification, we developed and evaluated a graphics processing unit (GPU)-based DRR software application and compared it with a central processing unit (CPU)-based application. The evaluation metrics were calculation speed and image quality for various slice thicknesses. The results showed that the GPU-based DRR computation […]

CUDA

Dec, 13

Seeded ND medical image segmentation by cellular automaton on GPU

PURPOSE: We present a GPU-based framework to perform organ segmentation in N-dimensional (ND) medical image datasets by computation of weighted distances using the Ford-Bellman algorithm (FBA). Our GPU implementation of FBA gives an alternative and optimized solution to other graph-based segmentation techniques. METHODS: Given a number of K labelled-seeds, the segmentation algorithm evolves and segments […]

OpenGL
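
The segmentation abstract above relies on weighted distances computed with the Ford-Bellman algorithm. A minimal sequential sketch of that algorithm on a generic edge list is given below, purely as background. The function name `bellman_ford` and the `(u, v, weight)` edge format are assumptions; the paper's GPU formulation, its seeding, and its image-derived weights are not shown in this listing and are not reproduced here.

```python
def bellman_ford(num_nodes, edges, source):
    """Return shortest weighted distances from `source` to every node.

    `edges` is a list of directed (u, v, weight) tuples. The sketch
    assumes the graph contains no negative-weight cycles.
    """
    dist = [float("inf")] * num_nodes
    dist[source] = 0.0
    # At most num_nodes - 1 relaxation passes are needed.
    for _ in range(num_nodes - 1):
        changed = False
        for u, v, weight in edges:
            if dist[u] + weight < dist[v]:
                dist[v] = dist[u] + weight
                changed = True
        if not changed:  # stop early once distances have settled
            break
    return dist


if __name__ == "__main__":
    edges = [(0, 1, 4.0), (0, 2, 1.0), (2, 1, 2.0), (1, 3, 1.0)]
    print(bellman_ford(4, edges, 0))
```

Every edge is relaxed independently within a pass, which is what makes the algorithm a natural candidate for data-parallel hardware.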

Dec, 13

Real-time anomaly detection in hyperspectral images using multivariate normal mixture models and GPU processing

Hyperspectral imaging, which records a detailed spectrum of light arriving in each pixel, has many potential uses in remote sensing as well as other application areas. Practical applications will typically require real-time processing of large data volumes recorded by a hyperspectral imager. This paper investigates the use of graphics processing units (GPU) for such real-time […]

CUDA

Dec, 13

Fast and automatic object pose estimation for range images on the GPU

We present a pose estimation method for rigid objects from single range images. Using 3D models of the objects, many pose hypotheses are compared in a data-parallel version of the downhill simplex algorithm with an image-based error function. The pose hypothesis with the lowest error value yields the pose estimation (location and orientation), which is […]

CUDA
