Posts
Nov, 4
Artificial neural network computation on graphic process unit
Artificial neural networks (ANNs) are widely used in pattern recognition and related areas. In some cases the computational load is very heavy; in others, real-time processing is required. There is therefore a need to apply parallel algorithms, and ANN computation is usually inherently parallel. In this paper, graphic hardware […]
Nov, 4
Implementing sparse matrix-vector multiplication on throughput-oriented processors
Sparse matrix-vector multiplication (SpMV) is of singular importance in sparse linear algebra. In contrast to the uniform regularity of dense linear algebra, sparse operations encounter a broad spectrum of matrices ranging from the regular to the highly irregular. Harnessing the tremendous potential of throughput-oriented processors for sparse operations requires that we expose substantial fine-grained parallelism […]
Nov, 4
CuPP – A framework for easy CUDA integration
This paper reports on CuPP, our newly developed C++ framework designed to ease integration of NVIDIA's GPGPU system CUDA into existing C++ applications. CuPP provides interfaces to recurring tasks that are easier to use than the standard CUDA interfaces. In this paper we concentrate on memory management and related data structures. CuPP offers both a […]
Nov, 4
Interactive 3D distance field computation using linear factorization
We present an interactive algorithm to compute discretized 3D Euclidean distance fields. Given a set of piecewise linear geometric primitives, our algorithm computes the distance field for each slice of a uniform spatial grid. We express the non-linear distance function of each primitive as a dot product of linear factors. The linear terms are efficiently […]
Nov, 4
NBSymple, a double parallel, symplectic N-body code running on Graphic Processing Units
We present and discuss the characteristics and performance, in terms of both computational speed and precision, of a code that numerically integrates the equations of motion of N ‘particles’ interacting via Newtonian gravitation and moving in a smooth external galactic field. The force evaluation on every particle is done by means of direct summation […]
Nov, 3
Brook for GPUs: Stream Computing on Graphics Hardware
In this paper, we present Brook for GPUs, a system for general-purpose computation on programmable graphics hardware. Brook extends C to include simple data-parallel constructs, enabling the use of the GPU as a streaming co-processor. We present a compiler and runtime system that abstracts and virtualizes many aspects of graphics hardware. In addition, we present […]
Nov, 3
High performance discrete Fourier transforms on graphics processors
We present novel algorithms for computing discrete Fourier transforms with high performance on GPUs. We present hierarchical, mixed radix FFT algorithms for both power-of-two and non-power-of-two sizes. Our hierarchical FFT algorithms efficiently exploit shared memory on GPUs using a Stockham formulation. We reduce the memory transpose overheads in hierarchical algorithms by combining the transposes into […]
Nov, 3
Streaming Algorithms for Biological Sequence Alignment on GPUs
Sequence alignment is a common and often repeated task in molecular biology. Typical alignment operations consist of finding similarities between a pair of sequences (pairwise sequence alignment) or a family of sequences (multiple sequence alignment). The need to speed up this processing comes from the rapid growth rate of biological sequence databases: every year their […]
Nov, 3
Studying Thermal Management for Graphics-Processor Architectures
We have previously presented Qsilver, a flexible simulation system for graphics architectures. In this paper we describe our extensions to this system, which we use – instrumented with a power model and HotSpot – to analyze the application of standard CPU static and runtime thermal management techniques on the GPU. We describe experiments implementing clock […]
Nov, 3
Porting a high-order finite-element earthquake modeling application to NVIDIA graphics cards using CUDA
We port a high-order finite-element application that performs the numerical simulation of seismic wave propagation resulting from earthquakes in the Earth on NVIDIA GeForce 8800 GTX and GTX 280 graphics cards using CUDA. This application runs in single precision and is therefore a good candidate for implementation on current GPU hardware, which either does not […]
Nov, 3
Harvesting graphics power for MD simulations
We discuss an implementation of molecular dynamics (MD) simulations on a graphics processing unit (GPU) in the NVIDIA CUDA language. We tested our code on a modern GPU, the NVIDIA GeForce 8800 GTX. Results for two MD algorithms suitable for short-ranged and long-ranged interactions, and a congruential shift random number generator are presented. The performance […]
Nov, 3
Real-time Visual Tracker by Stream Processing
In this work, we implement a real-time visual tracker that targets the position and 3D pose of objects in video sequences, specifically faces. The use of stream processors for the computations and efficient Sparse-Template-based particle filtering allows us to achieve real-time processing even when tracking multiple objects simultaneously in high-resolution video frames. Stream processing is […]