Posts
Apr, 14
Massively parallel implementation of cyclic LDPC codes on a general purpose graphics processing unit
Simulation of low-density parity-check (LDPC) codes frequently takes several days, so the use of general purpose graphics processing units (GPGPUs) is very promising. However, GPGPUs are designed for compute-intensive applications, and they are not optimized for data caching or control management. In LDPC decoding, the parity check matrix H needs to be accessed at every […]
Apr, 14
Count Sort for GPU Computing
Counting sort is a simple, stable and efficient sorting algorithm with linear running time, and it is a fundamental building block for many applications. This paper describes the design issues of a data parallel implementation of the count sort algorithm on a commodity multiprocessor GPU using the Compute Unified Device Architecture (CUDA) platform, both from NVIDIA […]
Apr, 14
GPU-based high-speed and high-precision visual tracking
This paper presents a method for implementing the ESM visual tracker proposed by Malis et al. on a GPU to realize fast and accurate visual tracking. The ESM tracker is especially effective for images in which feature points are difficult to obtain, since it uses all the pixels of the target image region. Although […]
Apr, 14
A high-performance fault-tolerant software framework for memory on commodity GPUs
As GPUs are increasingly used to accelerate HPC applications by allowing more flexibility and programmability, their fault tolerance is becoming much more important than before when they were used only for graphics. The current generation of GPUs, however, does not have standard error detection and correction capabilities, such as SEC-DED ECC for DRAM, which is […]
Apr, 14
GPU acceleration of method of moments matrix assembly using Rao-Wilton-Glisson basis functions
In this paper, a GPU accelerated implementation of the matrix assembly phase of the method of moments is presented. The modelling of PEC structures using the electric field integral equation and the Rao-Wilton-Glisson basis functions is considered. NVIDIA CUDA is used to do the GPU development and the double precision support offered by […]
Apr, 14
Accelerating spatial clustering detection of epidemic disease with graphics processing unit
The statistics of disease clustering are of interest to epidemiologists. In order to detect spatial clustering of disease in all the regions of China, we adopted a likelihood ratio based method which utilizes Monte Carlo simulation and spatial exploration to analyze the data stored in a database and updated in real time. However, a large number of random tests […]
Apr, 14
Scaleable Sparse Matrix-Vector Multiplication with Functional Memory and GPUs
Sparse matrix-vector multiplication on GPUs faces a serious problem when the vector is too large to be stored in the GPU's device memory. To solve this problem, we propose a novel software-hardware hybrid method for a heterogeneous system with GPUs and functional memory modules connected by PCI Express. The functional memory contains huge capacity […]
Apr, 14
Efficient design and implementation of visual computing algorithms on the GPU
In this paper, we explore the key factors in the design and implementation of visual computing (image processing and computer vision) algorithms on massively parallel GPUs (graphics processing units). The goal of the exploration is to provide a common perspective and guidelines for using GPUs in visual computing applications. We have selected three nontrivial applications […]
Apr, 14
Implicit Feature-Based Alignment System for Radiotherapy
In this paper we present a robust alignment algorithm for correcting the effects of out-of-plane rotation, to be used for automatic alignment of Computed Tomography (CT) volumes and the generally low quality fluoroscopic images in radiotherapy applications. Analyzing not only in-plane but also out-of-plane rotation effects on the Digitally Reconstructed Radiograph (DRR) images, we […]
Apr, 14
OpenCL/OpenGL approach for studying active Brownian motion
This work presents a methodology for studying active Brownian dynamics on ratchet potentials using interoperating OpenCL and OpenGL frameworks. Programming details along with optimization issues are discussed, followed by a comparison of performance on different devices. Visualization time using an OpenGL buffer shared with OpenCL has been tested against another technique which, while using OpenGL, […]
Apr, 13
23rd International Conference on Parallel Computational Fluid Dynamics 2011, ParCFD 2011
ParCFD is the annual international conference devoted to the discussion of recent developments and applications of parallel computing in the field of CFD and related disciplines. Since the establishment of the ParCFD conference series, parallel computers have become the dominant form of large-scale computing. The emergence of multi-core and heterogeneous architectures in parallel computers has created new […]
Apr, 13
Hardware-Efficient Belief Propagation
Loopy belief propagation (BP) is an effective solution for assigning labels to the nodes of a graphical model such as the Markov random field (MRF), but it requires high memory, bandwidth, and computational costs. Furthermore, the iterative, pixel-wise, and sequential operations of BP make it difficult to parallelize the computation. In this paper, we propose […]