Posts

Nov, 3

High performance discrete Fourier transforms on graphics processors

We present novel algorithms for computing discrete Fourier transforms with high performance on GPUs. We present hierarchical, mixed radix FFT algorithms for both power-of-two and non-power-of-two sizes. Our hierarchical FFT algorithms efficiently exploit shared memory on GPUs using a Stockham formulation. We reduce the memory transpose overheads in hierarchical algorithms by combining the transposes into […]
Nov, 3

Streaming Algorithms for Biological Sequence Alignment on GPUs

Sequence alignment is a common and often repeated task in molecular biology. Typical alignment operations consist of finding similarities between a pair of sequences (pairwise sequence alignment) or a family of sequences (multiple sequence alignment). The need for speeding up this treatment comes from the rapid growth rate of biological sequence databases: every year their […]
Nov, 3

Studying Thermal Management for Graphics-Processor Architectures

We have previously presented Qsilver, a flexible simulation system for graphics architectures. In this paper we describe our extensions to this system, which we use – instrumented with a power model and HotSpot – to analyze the application of standard CPU static and runtime thermal management techniques on the GPU. We describe experiments implementing clock […]
Nov, 3

Porting a high-order finite-element earthquake modeling application to NVIDIA graphics cards using CUDA

We port a high-order finite-element application that numerically simulates seismic wave propagation resulting from earthquakes in the Earth to NVIDIA GeForce 8800 GTX and GTX 280 graphics cards using CUDA. This application runs in single precision and is therefore a good candidate for implementation on current GPU hardware, which either does not […]
Nov, 3

Harvesting graphics power for MD simulations

We discuss an implementation of molecular dynamics (MD) simulations on a graphics processing unit (GPU) in the NVIDIA CUDA language. We tested our code on a modern GPU, the NVIDIA GeForce 8800 GTX. Results are presented for two MD algorithms, suitable for short-ranged and long-ranged interactions, and for a congruential shift random number generator. The performance […]
Nov, 3

Real-time Visual Tracker by Stream Processing

In this work, we implement a real-time visual tracker that targets the position and 3D pose of objects in video sequences, specifically faces. The use of stream processors for the computations and efficient Sparse-Template-based particle filtering allows us to achieve real-time processing even when tracking multiple objects simultaneously in high-resolution video frames. Stream processing is […]
Nov, 3

Parallel, stochastic measurement of molecular surface area

Biochemists often wish to compute surface areas of proteins. A variety of algorithms have been developed for this task, but they are designed for traditional single-processor architectures. The current trend in computer hardware is towards increasingly parallel architectures for which these algorithms are not well suited. We describe a parallel, stochastic algorithm for molecular surface […]
Nov, 3

Maximum mipmaps for fast, accurate, and scalable dynamic height field rendering

This paper presents a fast and accurate GPU-based technique for rendering dynamic height fields that scales well to large height fields. Current real-time rendering algorithms for dynamic height fields employ approximate ray-height field intersection methods, whereas accurate algorithms require pre-computation on the order of seconds to minutes and are thus not suitable for dynamic height […]
Nov, 3

Neural Network Implementation Using CUDA and OpenMP

Many algorithms for image processing and pattern recognition have recently been implemented on GPUs (graphics processing units) to reduce computation time. However, GPU implementation encounters two problems. First, the programmer must master the fundamentals of graphics shading languages, which require prior knowledge of computer graphics. Second, in a job which needs […]
Nov, 3

High-Precision Numerical Simulations of Rotating Black Holes Accelerated by CUDA

Hardware accelerators (such as Nvidia’s CUDA GPUs) have tremendous promise for computational science, because they can deliver large gains in performance at relatively low cost. In this work, we focus on the use of Nvidia’s Tesla GPU for high-precision (double, quadruple and octal precision) numerical simulations in the area of black hole physics — more […]
Nov, 3

PAPER – Accelerating parallel evaluations of ROCS

Modern graphics processing units (GPUs) are flexibly programmable and have peak computational throughput significantly higher than that of conventional CPUs. Herein, we describe the design and implementation of PAPER, an open-source implementation of Gaussian molecular shape overlay for NVIDIA GPUs. We demonstrate speedups of one to two orders of magnitude on high-end commodity GPU hardware relative to a reference CPU […]
Nov, 3

Real-time KD-tree construction on graphics hardware

We present an algorithm for constructing kd-trees on GPUs. This algorithm achieves real-time performance by exploiting the GPU’s streaming architecture at all stages of kd-tree construction. Unlike previous parallel kd-tree algorithms, our method builds tree nodes completely in BFS (breadth-first search) order. We also develop a special strategy for large nodes at upper tree levels […]

* * *

HGPU group © 2010-2019 hgpu.org

All rights belong to the respective authors
