## Posts

Oct, 28

### GPU Ray Marching for Real-Time Rendering of Participating Media

This paper presents a GPU-based ray marching algorithm for real-time rendering of participating media. We fire a ray at each pixel being shaded on the cube surface, and then we find an intersection between the ray and the inner volume recorded by a 3D texture, using both linear and binary searches. At this intersection, the ray […]
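The linear-then-binary search mentioned in the excerpt can be sketched on the CPU as follows. This is only an illustration under stated assumptions, not the paper's shader: `sphere_density` is a hypothetical stand-in for the 3D texture lookup, and the step counts and threshold are arbitrary. The sketch also assumes the ray starts outside the volume.

```python
import math


def sphere_density(p, center=(0.5, 0.5, 0.5), radius=0.25):
    """Hypothetical stand-in for a 3D texture fetch: 1.0 inside a sphere, else 0.0."""
    return 1.0 if math.dist(p, center) <= radius else 0.0


def point_on_ray(origin, direction, t):
    """Position origin + t * direction."""
    return tuple(o + t * d for o, d in zip(origin, direction))


def find_hit(origin, direction, sample=sphere_density, t_max=1.0,
             linear_steps=32, binary_steps=16, threshold=0.5):
    """Return the ray parameter t of the first hit, or None if the ray misses."""
    dt = t_max / linear_steps
    t_prev = 0.0
    # Linear search: march in fixed steps until a sample falls inside the volume.
    for i in range(1, linear_steps + 1):
        t = i * dt
        if sample(point_on_ray(origin, direction, t)) >= threshold:
            # Binary search: refine the hit between the last outside
            # sample (lo) and the first inside sample (hi).
            lo, hi = t_prev, t
            for _ in range(binary_steps):
                mid = 0.5 * (lo + hi)
                if sample(point_on_ray(origin, direction, mid)) >= threshold:
                    hi = mid
                else:
                    lo = mid
            return hi
        t_prev = t
    return None


if __name__ == "__main__":
    print(find_hit((0.5, 0.5, 0.0), (0.0, 0.0, 1.0)))
```

The coarse linear pass keeps the number of samples small, while the binary pass recovers precision near the boundary; features thinner than one linear step can still be missed.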

Oct, 28

### Jump flooding in GPU with applications to Voronoi diagram and distance transform

This paper studies jump flooding as an algorithmic paradigm for general-purpose computation on the GPU. As an example application of jump flooding, the paper discusses a constant-time algorithm on the GPU to compute an approximation to the Voronoi diagram of a given set of seeds in a 2D grid. The errors due to the […]
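As a rough CPU illustration of the jump flooding idea, the sketch below labels each cell of a square grid with the index of a nearby seed. It assumes a power-of-two grid size and the common step sequence size/2, size/4, ..., 1; the function name `jump_flood` and the data layout are hypothetical and not taken from the paper. The number of passes depends on the grid resolution, not on the number of seeds, and the result is an approximation of the true Voronoi diagram.

```python
def jump_flood(size, seeds):
    """Approximate Voronoi labels on a size x size grid (size a power of two).

    seeds is a list of (x, y) cells; the result maps each cell to a seed index.
    """
    grid = [[None] * size for _ in range(size)]
    for idx, (sx, sy) in enumerate(seeds):
        grid[sy][sx] = idx

    def dist2(x, y, idx):
        sx, sy = seeds[idx]
        return (x - sx) ** 2 + (y - sy) ** 2

    step = size // 2
    while step >= 1:
        # Double buffering: read from grid, write to nxt, as a GPU pass would.
        nxt = [row[:] for row in grid]
        for y in range(size):
            for x in range(size):
                best = grid[y][x]
                # Inspect the cell itself and its 8 neighbours at distance `step`.
                for dy in (-step, 0, step):
                    for dx in (-step, 0, step):
                        nx, ny = x + dx, y + dy
                        if 0 <= nx < size and 0 <= ny < size:
                            cand = grid[ny][nx]
                            if cand is not None and (
                                best is None
                                or dist2(x, y, cand) < dist2(x, y, best)
                            ):
                                best = cand
                nxt[y][x] = best
        grid = nxt
        step //= 2
    return grid


if __name__ == "__main__":
    for row in jump_flood(8, [(0, 0), (7, 7), (6, 1)]):
        print(row)
```

On a GPU each pass would be one full-screen draw or kernel launch, with the inner two loops executed per pixel.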

Oct, 28

### String Matching on a Multicore GPU Using CUDA

Graphics processing units (GPUs) have evolved over the past few years from dedicated graphics rendering devices to powerful parallel processors, outperforming traditional central processing units (CPUs) in many areas of scientific computing. The use of GPUs as processing elements was very limited until recently, when the concept of general-purpose computing on graphics processing units (GPGPU) […]

Oct, 28

### Pseudo-random number generators for Monte Carlo simulations on Graphics Processing Units

Basic uniform pseudo-random number generators are implemented on ATI Graphics Processing Units (GPU). The performance results of the realized generators (multiplicative linear congruential (GGL), XOR-shift (XOR128), RANECU, RANMAR, RANLUX and Mersenne Twister (MT19937)) on CPU and GPU are discussed. The GPU implementations run hundreds of times faster than their CPU counterparts. The RANLUX generator is found […]
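For readers unfamiliar with the XOR-shift family, here is a minimal CPU sketch of a 128-bit xorshift generator in the style of Marsaglia's `xor128`, using shifts of 11, 8 and 19 and his commonly quoted default seed values. Whether the paper's XOR128 uses exactly these constants is an assumption; the class name `Xor128` and its methods are hypothetical.

```python
MASK32 = 0xFFFFFFFF  # keep every intermediate value within 32 bits


class Xor128:
    """Xorshift generator with four 32-bit state words (assumed constants)."""

    def __init__(self, x=123456789, y=362436069, z=521288629, w=88675123):
        self.x, self.y, self.z, self.w = x, y, z, w

    def next_u32(self):
        """Advance the state and return a 32-bit unsigned integer."""
        t = (self.x ^ (self.x << 11)) & MASK32
        self.x, self.y, self.z = self.y, self.z, self.w
        self.w = (self.w ^ (self.w >> 19) ^ t ^ (t >> 8)) & MASK32
        return self.w

    def next_float(self):
        """Return a float in [0, 1) built from one 32-bit output."""
        return self.next_u32() / 2**32


if __name__ == "__main__":
    rng = Xor128()
    print([rng.next_float() for _ in range(3)])
```

The update uses only shifts and XORs on a small state, which is one reason generators of this kind map well to per-thread execution; in a parallel setting each thread would need its own, differently seeded state.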

Oct, 28

### Compute Pairwise Manhattan Distance and Pearson Correlation Coefficient of Data Points with GPU

Graphics processing units (GPUs) are powerful computational devices tailored towards the needs of the 3-D gaming industry for high-performance, real-time graphics engines. Nvidia Corporation released a new generation of GPUs designed for general-purpose computing in 2006, and it released a GPU programming language called CUDA in 2007. The DNA microarray technology is a high throughput […]

Oct, 28

### Motion Compensation and Reconstruction of H.264/AVC Video Bitstreams using the GPU

Most modern computers are equipped with powerful yet cost-effective graphics processing units (GPUs) to accelerate graphics operations. Although programmable shaders on these GPUs were designed for the creation of 3-D rendering effects, they can also be used as generic processing units for vector data. This paper proposes a hardware renderer capable of executing motion compensation, […]

Oct, 28

### GPU-based object-order ray-casting for large datasets

We propose a GPU-based object-order ray-casting algorithm for the rendering of large volumetric datasets, such as the Visible Human CT datasets. A volumetric dataset is decomposed into small sub-volumes, which are then organized using a min-max octree structure. The small sub-volumes are stored in the leaf nodes of the min-max octree, which are also called […]

Oct, 28

### Accelerating Kirchhoff Migration by CPU and GPU Cooperation

We discuss the performance of Petrobras production Kirchhoff prestack seismic migration on a cluster of 64 GPUs and 256 CPU cores. Porting and optimization of the application hot spot (98.2% of a single CPU core execution time) to a single GPU reduces total execution time by a factor of 36 on a control run. We […]

Oct, 28

### Hybrid GPU-Based Single- and Double-Bounce SAR Simulation

In this paper, a new hybrid graphics-processing-unit (GPU)-based real-time synthetic aperture radar (SAR) simulation system is presented. Previous real-time SAR simulators only supported single-bounce simulation in real time. The new hybrid system uses the rasterization approach for real-time single-bounce simulation and a new image-based GPU ray-tracing approach for monostatic SAR double-bounce simulation. This approach provides […]

Oct, 28

### The Heisenberg spin glass model on GPU: myths and actual facts

We describe different implementations of the 3D Heisenberg spin glass model for Graphics Processing Units (GPU). The results show that the fast shared memory outperforms the slow global memory only if a multi-hit technique is used.

Oct, 28

### Accelerating astrophysical particle simulations with programmable hardware (FPGA and GPU)

In a previous paper we showed that direct gravitational N-body simulations in astrophysics scale very well for moderately parallel supercomputers (on the order of 10–100 nodes). The best balance between computation and communication is reached if the nodes are accelerated by special-purpose hardware; in this paper we describe the implementation of particle-based astrophysical simulation codes […]

Oct, 28

### Analyzing CUDA workloads using a detailed GPU simulator

Modern graphics processing units (GPUs) provide sufficiently flexible programming models that understanding their performance can provide insight into designing tomorrow’s manycore processors, whether those are GPUs or otherwise. The combination of multiple, multithreaded, SIMD cores makes studying these GPUs useful in understanding tradeoffs among memory, data, and thread-level parallelism. While modern GPUs offer orders […]