## Posts

Nov, 4

### Interactive 3D distance field computation using linear factorization

We present an interactive algorithm to compute discretized 3D Euclidean distance fields. Given a set of piecewise linear geometric primitives, our algorithm computes the distance field for each slice of a uniform spatial grid. We express the non-linear distance function of each primitive as a dot product of linear factors. The linear terms are efficiently […]

Nov, 4

### NBSymple, a double parallel, symplectic N-body code running on Graphic Processing Units

We present and discuss the characteristics and performance, in terms of both computational speed and precision, of a numerical code which integrates the equations of motion of N ‘particles’ interacting via Newtonian gravitation and moving in an external smooth galactic field. The force evaluation on every particle is done by means of direct summation […]

Nov, 3

### Brook for GPUs: Stream Computing on Graphics Hardware

In this paper, we present Brook for GPUs, a system for general-purpose computation on programmable graphics hardware. Brook extends C to include simple data-parallel constructs, enabling the use of the GPU as a streaming co-processor. We present a compiler and runtime system that abstracts and virtualizes many aspects of graphics hardware. In addition, we present […]

Nov, 3

### High performance discrete Fourier transforms on graphics processors

We present novel algorithms for computing discrete Fourier transforms with high performance on GPUs. We present hierarchical, mixed radix FFT algorithms for both power-of-two and non-power-of-two sizes. Our hierarchical FFT algorithms efficiently exploit shared memory on GPUs using a Stockham formulation. We reduce the memory transpose overheads in hierarchical algorithms by combining the transposes into […]

Nov, 3

### Streaming Algorithms for Biological Sequence Alignment on GPUs

Sequence alignment is a common and often repeated task in molecular biology. Typical alignment operations consist of finding similarities between a pair of sequences (pairwise sequence alignment) or a family of sequences (multiple sequence alignment). The need for speeding up this treatment comes from the rapid growth rate of biological sequence databases: every year their […]

Nov, 3

### Studying Thermal Management for Graphics-Processor Architectures

We have previously presented Qsilver, a flexible simulation system for graphics architectures. In this paper we describe our extensions to this system, which we use – instrumented with a power model and HotSpot – to analyze the application of standard CPU static and runtime thermal management techniques on the GPU. We describe experiments implementing clock […]

Nov, 3

### Porting a high-order finite-element earthquake modeling application to NVIDIA graphics cards using CUDA

We port a high-order finite-element application that performs the numerical simulation of seismic wave propagation resulting from earthquakes in the Earth on NVIDIA GeForce 8800 GTX and GTX 280 graphics cards using CUDA. This application runs in single precision and is therefore a good candidate for implementation on current GPU hardware, which either does not […]

Nov, 3

### Harvesting graphics power for MD simulations

We discuss an implementation of molecular dynamics (MD) simulations on a graphics processing unit (GPU) in the NVIDIA CUDA language. We tested our code on a modern GPU, the NVIDIA GeForce 8800 GTX. Results are presented for two MD algorithms, suitable for short-ranged and long-ranged interactions, and for a congruential shift random number generator. The performance […]

Nov, 3

### Real-time Visual Tracker by Stream Processing

In this work, we implement a real-time visual tracker that targets the position and 3D pose of objects in video sequences, specifically faces. The use of stream processors for the computations and efficient Sparse-Template-based particle filtering allows us to achieve real-time processing even when tracking multiple objects simultaneously in high-resolution video frames. Stream processing is […]

Nov, 3

### Parallel, stochastic measurement of molecular surface area

Biochemists often wish to compute surface areas of proteins. A variety of algorithms have been developed for this task, but they are designed for traditional single-processor architectures. The current trend in computer hardware is towards increasingly parallel architectures for which these algorithms are not well suited. We describe a parallel, stochastic algorithm for molecular surface […]

Nov, 3

### Maximum mipmaps for fast, accurate, and scalable dynamic height field rendering

This paper presents a GPU-based, fast, and accurate dynamic height field rendering technique that scales well to large-scale height fields. Current real-time rendering algorithms for dynamic height fields employ approximate ray-height field intersection methods, whereas accurate algorithms require pre-computation on the order of seconds to minutes and are thus not suitable for dynamic height […]

Nov, 3

### Neural Network Implementation Using CUDA and OpenMP

Many algorithms for image processing and pattern recognition have recently been implemented on GPUs (graphics processing units) for faster computation. However, GPU implementations encounter two problems. First, the programmer must master the fundamentals of graphics shading languages, which requires prior knowledge of computer graphics. Second, in a job which needs […]