Posts
Dec, 12
A Lattice-Preserving Multigrid Method for Solving the Inhomogeneous Poisson Equations Used in Image Analysis
The inhomogeneous Poisson (Laplace) equation with internal Dirichlet boundary conditions has recently appeared in several applications ranging from image segmentation [1, 2, 3] to image colorization [4], digital photo matting [5, 6] and image filtering [7, 8]. In addition, the problem we address may also be considered as the generalized eigenvector problem associated with Normalized […]
Dec, 12
Exploring Parallel Algorithms for Volumetric Mass-Spring-Damper Models in CUDA
Since the advent of programmable graphics processors (GPUs), their computational power has been utilized for general-purpose computation, initially by “exploiting” graphics APIs and more recently through dedicated parallel computation frameworks such as the Compute Unified Device Architecture (CUDA) from NVIDIA. This paper investigates multiple implementations of volumetric Mass-Spring-Damper systems in CUDA. The obtained performance is […]
Dec, 12
Map-reduce as a Programming Model for Custom Computing Machines
The map-reduce model requires users to express their problem in terms of a map function that processes single records in a stream, and a reduce function that merges all mapped outputs to produce a final result. By exposing structural similarity in this way, a number of key issues associated with the design of custom computing […]
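As a rough illustration of the model described in this excerpt, the following minimal sketch applies a map function to each record independently and then merges the mapped outputs with a reduce function. It is not taken from the paper: the names `map_reduce`, `square`, and `add`, and the sum-of-squares example, are hypothetical and chosen only for illustration.

```python
from functools import reduce


def map_reduce(records, map_fn, reduce_fn, initial):
    """Map each record on its own, then fold the mapped outputs into one result."""
    # The map step sees a single record at a time, so records are independent.
    mapped = (map_fn(record) for record in records)
    # The reduce step merges all mapped outputs, starting from `initial`.
    return reduce(reduce_fn, mapped, initial)


def square(x):
    # Hypothetical map function: processes one record.
    return x * x


def add(a, b):
    # Hypothetical reduce function: merges two partial results.
    return a + b


if __name__ == "__main__":
    print(map_reduce([1, 2, 3, 4], square, add, 0))  # prints 30
```

Because the map step never looks at more than one record, each call can in principle run in parallel; only the reduce step combines results.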
Dec, 12
A decompression pipeline for accelerating out-of-core volume rendering of time-varying data
This paper presents a decompression pipeline capable of accelerating out-of-core volume rendering of time-varying scalar data. Our pipeline is based on a two-stage compression method that cooperatively uses the CPU and the graphics processing unit (GPU) to transfer compressed data entirely from the storage device to the video memory. This method combines two different compression […]
Dec, 12
The visible ear surgery simulator
This paper presents a real-time computer simulation of surgical procedures in the ear, in which a surgeon drills into the temporal bone to gain access to the middle or inner ear. The purpose of this simulator is to support development of anatomical insight and training of drilling skills for both medical students and experienced otologists. […]
Dec, 12
Using Mixed Precision for Sparse Matrix Computations to Enhance the Performance while Achieving 64-bit Accuracy
By using a combination of 32-bit and 64-bit floating point arithmetic, the performance of many sparse linear algebra algorithms can be significantly enhanced while maintaining the 64-bit accuracy of the resulting solution. These ideas can be applied to sparse multifrontal and supernodal direct techniques and sparse iterative techniques such as Krylov subspace methods. The approach […]
Dec, 12
Parallel algorithms for approximation of distance maps on parametric surfaces
We present an efficient O(n) numerical algorithm for first-order approximation of geodesic distances on geometry images, where n is the number of points on the surface. The structure of our algorithm allows efficient implementation on parallel architectures. Two implementations, one on a SIMD processor and one on a GPU, are discussed. Numerical results demonstrate up […]
Dec, 12
Stream Processing of Integral Images for Real-Time Object Detection
This paper presents the design and evaluation of a stream processing implementation of the Integral Image algorithm. The Integral Image is a key component of many image processing algorithms, in particular Haar-like feature-based systems. Modern GPUs provide a large number of processors with a peak floating-point performance that is significantly higher than […]
Dec, 12
Real-time digital holographic microscopy using the graphic processing unit
Digital holographic microscopy (DHM) is a well-known, powerful method that allows both the amplitude and phase of a specimen to be observed simultaneously. To obtain a reconstructed image from a hologram, numerous Fresnel diffraction calculations are required. Fresnel diffraction can be accelerated by the FFT (Fast Fourier Transform) algorithm. However, real-time […]
Dec, 12
A compiler framework for optimization of affine loop nests for GPGPUs
GPUs are a class of specialized parallel architectures with tremendous computational power. The new Compute Unified Device Architecture (CUDA) programming model from NVIDIA facilitates programming of general-purpose applications on its GPUs. However, manual development of high-performance parallel code for GPUs is still very challenging. In this paper, a number of issues are addressed towards […]
Dec, 12
Two-electron integral evaluation on the graphics processor unit
We propose an algorithm to evaluate the Coulomb potential in ab initio density functional calculations on the graphics processor unit (GPU). The numerical accuracy required for the algorithm is investigated in detail. It is shown that the GPU, which natively supports only single-precision floating-point numbers, can take part in the major computational tasks. Because […]
Dec, 12
Deformable model collision detection using A-buffer
This paper presents a new image-space algorithm for real-time collision detection, where the GPU computes the potentially colliding sets (PCSs), and the CPU performs the standard triangle/triangle intersection test. When the bounding boxes of two objects intersect, the intersection is passed to the GPU. By rendering the objects in the intersection region, the GPU saves […]