Posts
Mar, 4
Acceleration of Medical Image Registration using Graphics Process Units in Computing Normalized Mutual Information
This paper presents a computational performance analysis of accelerated medical image registration using Graphics Processing Units (GPUs). In our previous work, a multi-resolution approach using normalized mutual information (NMI) proved useful for medical image registration. In this paper, we propose accelerating the NMI procedure with a GPU implementation because of […]
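For context, NMI is computed from the joint intensity histogram of the fixed and moving images as NMI(A,B) = (H(A) + H(B)) / H(A,B), and the histogram accumulation is the step that benefits most from GPU parallelism. Below is a minimal CUDA sketch of that step, assuming 8-bit images and an illustrative bin count; none of these names come from the paper.

#include <cuda_runtime.h>
#include <math.h>

#define BINS 64   // illustrative number of intensity bins per image

// One thread per voxel: accumulate the joint intensity histogram with atomics.
__global__ void jointHistogram(const unsigned char *fixedImg,
                               const unsigned char *movingImg,
                               unsigned int *hist, int nVoxels)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < nVoxels) {
        int a = fixedImg[i]  * BINS / 256;
        int b = movingImg[i] * BINS / 256;
        atomicAdd(&hist[a * BINS + b], 1u);
    }
}

// Host-side reduction (after copying hist back): marginal and joint entropies,
// then NMI = (H(A) + H(B)) / H(A, B).
double nmiFromHistogram(const unsigned int *hist, int nVoxels)
{
    double ha = 0.0, hb = 0.0, hab = 0.0;
    for (int a = 0; a < BINS; ++a) {
        double pa = 0.0;
        for (int b = 0; b < BINS; ++b) {
            double p = (double)hist[a * BINS + b] / nVoxels;
            pa += p;
            if (p > 0.0) hab -= p * log(p);
        }
        if (pa > 0.0) ha -= pa * log(pa);
    }
    for (int b = 0; b < BINS; ++b) {
        double pb = 0.0;
        for (int a = 0; a < BINS; ++a)
            pb += (double)hist[a * BINS + b] / nVoxels;
        if (pb > 0.0) hb -= pb * log(pb);
    }
    return (ha + hb) / hab;
}

In a registration loop the host would launch jointHistogram once per candidate transform, copy the histogram back, and evaluate nmiFromHistogram inside the optimizer.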
Mar, 4
Understanding GPU Programming for Statistical Computation: Studies in Massively Parallel Massive Mixtures
We describe advances in statistical computation for large-scale data analysis in structured Bayesian mixture models via GPU (graphics processing unit) programming. The developments are partly motivated by computational challenges arising in increasingly prevalent biological studies using high-throughput flow cytometry methods, generating many, very large data sets and requiring increasingly high-dimensional mixture models with large numbers […]
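The most data-parallel step in fitting such mixtures is evaluating every component's density at every observation (the E-step responsibilities). Here is a hedged CUDA sketch, assuming diagonal-covariance Gaussian components for simplicity; the models in the paper are considerably richer.

#include <cuda_runtime.h>
#include <math.h>

#define PI_F 3.14159265358979f

// One thread per observation: evaluate K diagonal-covariance Gaussian
// components at a D-dimensional point and normalize to responsibilities.
__global__ void responsibilities(const float *x,      // n x D observations
                                 const float *mean,   // K x D component means
                                 const float *var,    // K x D component variances
                                 const float *weight, // K mixture weights
                                 float *resp,         // n x K output
                                 int n, int D, int K)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    float total = 0.0f;
    for (int k = 0; k < K; ++k) {
        float logp = logf(weight[k]);
        for (int d = 0; d < D; ++d) {
            float diff = x[i * D + d] - mean[k * D + d];
            logp -= 0.5f * (logf(2.0f * PI_F * var[k * D + d])
                            + diff * diff / var[k * D + d]);
        }
        float p = expf(logp);
        resp[i * K + k] = p;
        total += p;
    }
    for (int k = 0; k < K; ++k)   // normalize so each row sums to one
        resp[i * K + k] /= total;
}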
Mar, 4
Architecture-Aware Optimization Targeting Multithreaded Stream Computing
Optimizing program execution targeted for Graphics Processing Units (GPUs) can be very challenging. Efficiently mapping serial code to a GPU or stream-processing platform is a time-consuming task and is greatly hampered by a lack of detail about the underlying hardware. Programmers are left to rely on trial and error to produce […]
Mar, 4
Redesigning combustion modeling algorithms for the Graphics Processing Unit (GPU): Chemical kinetic rate evaluation and ordinary differential equation integration
Detailed modeling of complex combustion kinetics remains challenging and often intractable, due to prohibitive computational costs incurred when solving the associated large kinetic mechanisms. The Graphics Processing Unit (GPU), originally designed for graphics rendering on computer and gaming systems, has recently emerged as a powerful, cost-effective supplement to the Central Processing Unit (CPU) for dramatically […]
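For the rate-evaluation part, each reaction's forward rate constant follows the modified Arrhenius form k_f = A T^b exp(-E_a / (R T)), and the same expression must be evaluated for every reaction in every grid cell, an embarrassingly parallel workload. A minimal CUDA sketch follows, with an illustrative array layout rather than the paper's actual code.

#include <cuda_runtime.h>
#include <math.h>

#define R_UNIV 8.314462618   // universal gas constant, J/(mol K)

// One thread per cell: evaluate the modified-Arrhenius forward rate constant
// k_f = A * T^b * exp(-Ea / (R * T)) for every reaction at that cell's temperature.
__global__ void arrheniusRates(const double *T,    // temperature per cell
                               const double *A,    // pre-exponential factor per reaction
                               const double *beta, // temperature exponent per reaction
                               const double *Ea,   // activation energy per reaction, J/mol
                               double *kf,         // nCells x nReactions output
                               int nCells, int nReactions)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= nCells) return;

    double Ti = T[i];
    for (int r = 0; r < nReactions; ++r)
        kf[i * nReactions + r] = A[r] * pow(Ti, beta[r]) * exp(-Ea[r] / (R_UNIV * Ti));
}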
Mar, 4
Multi-GPU Performance of Incompressible Flow Computation by Lattice Boltzmann Method on GPU Cluster
GPGPU has drawn much attention for accelerating non-graphics applications. A simulation using the D3Q19 model of the Lattice Boltzmann method was executed successfully on a multi-node GPU cluster with CUDA programming and the MPI library. The GPU code runs on TSUBAME, the multi-node GPU cluster of the Tokyo Institute of Technology, in which a total of 680 GPUs of NVIDIA Tesla […]
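The usual structure of such a multi-GPU lattice Boltzmann code is a 1-D slab decomposition of the domain across MPI ranks, one GPU per rank, with ghost-layer exchange between neighbouring ranks each time step. The following CUDA+MPI sketch shows only that structure; the kernel body is a placeholder for the actual D3Q19 collision and streaming, and all sizes and names are illustrative.

#include <mpi.h>
#include <cuda_runtime.h>
#include <stdlib.h>

#define NX 128
#define NY 128
#define NZ 128
#define Q  19   // D3Q19: 19 distribution functions per lattice node

// Placeholder for the D3Q19 collision + streaming kernel: the real update
// relaxes each node's 19 distributions toward equilibrium and propagates
// them to neighbouring nodes; here we only copy, to keep the sketch short.
__global__ void lbmStep(const double *f, double *fNew, size_t n)
{
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < n) fNew[i] = f[i];
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int nDev = 1;
    cudaGetDeviceCount(&nDev);
    cudaSetDevice(rank % nDev);                 // one GPU per MPI rank

    int nzLocal  = NZ / size;                   // 1-D slab decomposition (assumes size divides NZ)
    size_t layer = (size_t)NX * NY * Q;         // distributions in one z-layer
    size_t total = layer * (nzLocal + 2);       // local slab plus two ghost layers

    double *f, *fNew;
    cudaMalloc(&f,    total * sizeof(double));
    cudaMalloc(&fNew, total * sizeof(double));
    cudaMemset(f, 0, total * sizeof(double));

    double *sendLo = (double *)malloc(layer * sizeof(double));
    double *sendHi = (double *)malloc(layer * sizeof(double));
    double *recvLo = (double *)malloc(layer * sizeof(double));
    double *recvHi = (double *)malloc(layer * sizeof(double));

    int up   = (rank + 1) % size;               // periodic neighbours in z
    int down = (rank - 1 + size) % size;

    for (int step = 0; step < 1000; ++step) {
        int threads = 256;
        int blocks  = (int)((total + threads - 1) / threads);
        lbmStep<<<blocks, threads>>>(f, fNew, total);
        cudaDeviceSynchronize();

        // Exchange the outermost computed layers with the two neighbours
        // (staged through the host; CUDA-aware MPI could send device pointers).
        cudaMemcpy(sendLo, fNew + layer,           layer * sizeof(double), cudaMemcpyDeviceToHost);
        cudaMemcpy(sendHi, fNew + layer * nzLocal, layer * sizeof(double), cudaMemcpyDeviceToHost);
        MPI_Sendrecv(sendLo, (int)layer, MPI_DOUBLE, down, 0,
                     recvHi, (int)layer, MPI_DOUBLE, up,   0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Sendrecv(sendHi, (int)layer, MPI_DOUBLE, up,   1,
                     recvLo, (int)layer, MPI_DOUBLE, down, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        cudaMemcpy(fNew,                         recvLo, layer * sizeof(double), cudaMemcpyHostToDevice);
        cudaMemcpy(fNew + layer * (nzLocal + 1), recvHi, layer * sizeof(double), cudaMemcpyHostToDevice);

        double *tmp = f; f = fNew; fNew = tmp;  // swap time levels
    }

    cudaFree(f); cudaFree(fNew);
    free(sendLo); free(sendHi); free(recvLo); free(recvHi);
    MPI_Finalize();
    return 0;
}

Staging the halo through host buffers, as here, is the simplest option; overlapping the exchange with interior computation (or using CUDA-aware MPI) is what lets such codes scale on clusters like TSUBAME.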
Mar, 3
Real-Time Multiprocessor Systems with GPUs
Graphics processing units (GPUs) are powerful processors that can offer significant performance advantages over traditional CPUs. The last decade has seen rapid advancement in GPU computational power and generality. Recent technologies make it possible to use GPUs as co-processors to the CPU. The performance advantages of GPUs can be great, often outperforming traditional CPUs by […]
Mar, 3
Smooth Mixed-Resolution GPU Volume Rendering
We propose a mixed-resolution volume ray-casting approach that enables more flexibility in the choice of downsampling positions and filter kernels, allows freely mixing volume bricks of different resolutions during rendering, and does not require modifying the original sample values. A C^0-continuous function is obtained everywhere with hardware-native filtering at full speed by simply warping texture […]
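For orientation only, the inner loop of any GPU volume ray-caster is front-to-back alpha compositing of classified samples along each ray; the paper's contribution lies in where and how those samples are fetched (mixed-resolution bricks addressed through warped texture coordinates), which is not reproduced in this hedged sketch.

#include <cuda_runtime.h>

// Generic front-to-back compositing of already-classified RGBA samples along
// one ray, with early ray termination once opacity saturates.
__device__ float4 compositeRay(const float4 *samples, int nSamples)
{
    float4 acc = make_float4(0.f, 0.f, 0.f, 0.f);
    for (int i = 0; i < nSamples && acc.w < 0.99f; ++i) {
        float4 s = samples[i];
        float t = 1.0f - acc.w;        // remaining transparency
        acc.x += t * s.w * s.x;
        acc.y += t * s.w * s.y;
        acc.z += t * s.w * s.z;
        acc.w += t * s.w;
    }
    return acc;
}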
Mar, 3
The sparse matrix vector product on GPUs
The sparse matrix-vector product (SpMV) is a paramount operation in engineering and scientific computing and has therefore long been a subject of intense research. The irregular computations involved in SpMV make its optimization challenging. Consequently, enormous effort has been devoted to devising data formats to store the sparse matrix with the ultimate aim […]
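The baseline most such studies start from is a CSR (compressed sparse row) kernel with one thread per matrix row. A minimal CUDA sketch, not tied to any particular paper's implementation:

#include <cuda_runtime.h>

// CSR SpMV, y = A*x: one thread per row of the sparse matrix A.
__global__ void spmvCsr(int nRows,
                        const int *rowPtr, const int *colIdx,
                        const double *val, const double *x, double *y)
{
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < nRows) {
        double sum = 0.0;
        for (int j = rowPtr[row]; j < rowPtr[row + 1]; ++j)
            sum += val[j] * x[colIdx[j]];
        y[row] = sum;
    }
}

One thread per row leaves memory accesses uncoalesced for long rows, which is precisely why alternative formats (ELL, COO, HYB, blocked variants) and warp-per-row kernels have received so much attention.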
Mar, 3
Unified – A Sharp Turn in the Latest Era of Graphic Processors
The need for high performance and realism has increased greatly over the last few decades, especially in gaming, 3D graphics, and computationally demanding applications. This has compelled GPU vendors to put their best effort into improving instruction-level parallelism (ILP). As a result, the GPU has entered a […]
Mar, 3
Building Correlators with Many-Core Hardware
Radio telescopes typically consist of multiple receivers whose signals are cross-correlated to filter out noise. A recent trend is to correlate in software instead of custom-built hardware, taking advantage of the flexibility that software solutions offer. Examples include e-VLBI and LOFAR. However, the data rates are usually high and the processing requirements challenging. Many-core processors […]
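The core of a software correlator is, for every receiver pair (baseline) and frequency channel, accumulating the complex cross power over the samples of one integration. Below is a hedged CUDA sketch with an illustrative memory layout; production correlators tile receivers in registers and shared memory to stay within memory bandwidth.

#include <cuda_runtime.h>

// One thread per (baseline, channel): accumulate sum_t s_i(t,c) * conj(s_j(t,c)).
// Assumed layout: samples[r * nChannels * nTimes + c * nTimes + t], r = receiver.
__global__ void correlate(const float2 *samples, float2 *vis,
                          const int *bl_i, const int *bl_j,   // receiver pair per baseline
                          int nBaselines, int nChannels, int nTimes)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx >= nBaselines * nChannels) return;

    int b = idx / nChannels;
    int c = idx % nChannels;
    const float2 *si = samples + (size_t)bl_i[b] * nChannels * nTimes + (size_t)c * nTimes;
    const float2 *sj = samples + (size_t)bl_j[b] * nChannels * nTimes + (size_t)c * nTimes;

    float2 acc = make_float2(0.f, 0.f);
    for (int t = 0; t < nTimes; ++t) {
        float2 a = si[t], d = sj[t];
        acc.x += a.x * d.x + a.y * d.y;   // Re(a * conj(d))
        acc.y += a.y * d.x - a.x * d.y;   // Im(a * conj(d))
    }
    vis[idx] = acc;                       // one visibility per baseline and channel
}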
Mar, 3
RankBoost Acceleration on both NVIDIA CUDA and ATI Stream Platforms
NVIDIA CUDA and ATI Stream are the two major general-purpose GPU (GPGPU) computing technologies. We implemented RankBoost, a web relevance ranking algorithm, on both platforms to accelerate the algorithm and to illustrate the differences between the two technologies. The results show that the performance of GPU programs is highly dependent on the […]
Mar, 3
Parallel Cycle Based Logic Simulation Using Graphics Processing Units
Graphics Processing Units (GPUs) are gaining popularity for parallelizing general-purpose applications. GPUs are massively parallel processors that offer high performance in a small, readily available package. At the same time, the emergence of general-purpose programming environments for GPUs, such as CUDA, shortens the learning curve of GPU programming. We present a GPU-based […]
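In cycle-based simulation the netlist is levelized, and all gates within one topological level depend only on already-computed nets, so they can be evaluated concurrently. Here is a minimal CUDA sketch of one level's evaluation, using an illustrative gate encoding rather than anything from the paper.

#include <cuda_runtime.h>

// Gate types for a tiny combinational evaluator (illustrative encoding).
enum GateType { GATE_AND = 0, GATE_OR = 1, GATE_XOR = 2, GATE_NOT = 3 };

// One thread per gate of a single topological level: every gate in the level
// reads only nets computed by earlier levels, so no ordering is needed here.
__global__ void evalLevel(const int *gateType, const int *inA, const int *inB,
                          const int *outNet, unsigned char *net, int nGates)
{
    int g = blockIdx.x * blockDim.x + threadIdx.x;
    if (g >= nGates) return;

    unsigned char a = net[inA[g]];
    unsigned char b = net[inB[g]];
    unsigned char r;
    switch (gateType[g]) {
        case GATE_AND: r = a & b; break;
        case GATE_OR:  r = a | b; break;
        case GATE_XOR: r = a ^ b; break;
        default:       r = !a;    break;   // GATE_NOT ignores the second input
    }
    net[outNet[g]] = r;
}

The host loops over levels (and clock cycles), launching one such kernel per level; real simulators additionally pack many test patterns per machine word to multiply the parallelism.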