
Posts

Mar, 22

Bridging the GPGPU-FPGA efficiency gap

This paper compares an implementation of a Bayesian inference algorithm across several FPGAs and GPGPUs, while embracing both the execution model and high-level architecture of a GPGPU. Our study is motivated by recent work in template-based programming and architectural models for FPGA computing. The comparison we present is meant to demonstrate the FPGA’s potential, while […]
Mar, 22

Improving accuracy for matrix multiplications on GPUs

Reproducibility of an experiment is a commonly used metric to determine its validity. Within scientific computing, this can become difficult due to the accumulation of floating-point rounding errors in the numerical computation, greatly reducing the accuracy of the computation. Matrix multiplication is particularly susceptible to these rounding errors, which is why there exist so […]
Mar, 22

Evaluating force field accuracy with long-time simulations of a beta-hairpin tryptophan zipper peptide

We have combined graphics processing unit-accelerated all-atom molecular dynamics with parallel tempering to explore the folding properties of small peptides in implicit solvent on the time scale of microseconds. We applied this methodology to the synthetic beta-hairpin, trpzip2, and one of its sequence variants, W2W9. Each simulation consisted of over 8 μs of aggregated virtual […]
Mar, 22

Constructing Two-Dimensional Voronoi Diagrams via Divide-and-Conquer of Envelopes in Space (thesis)

We present a general framework for computing two-dimensional Voronoi diagrams of different classes of sites under various distance functions. The framework is sufficiently general to support diagrams embedded on a family of two-dimensional parametric surfaces in $R^3$. The computation of the diagrams is carried out through the construction of envelopes of surfaces in 3-space provided […]
Mar, 22

Constructing Two-Dimensional Voronoi Diagrams via Divide-and-Conquer of Envelopes in Space

We present a general framework for computing Voronoi diagrams of different classes of sites under various distance functions in $R^3$. Most diagrams mentioned in the paper are in the plane. However, the framework is sufficiently general to support diagrams embedded on a family of two-dimensional parametric surfaces in three dimensions. The computation of the diagrams is […]
Mar, 22

fastHOG – a real-time GPU implementation of HOG

We introduce a parallel implementation of the histogram of oriented gradients algorithm for object detection. Our implementation uses the GPU and the NVIDIA CUDA framework. We achieve speedups of over 67x from the standard sequential code, using a single video card. Furthermore it supports multiple video cards so speedups of 120x or more can be […]
Mar, 22

Hierarchical belief propagation to reduce search space using CUDA for stereo and motion estimation

This paper describes a hierarchical belief propagation implementation in which a ‘rough’ disparity map calculation or motion estimation in higher levels is used to limit the search space and enable the calculation of the desired disparity map/set of motion vectors using a smaller search space than traditional belief propagation. We implement our algorithm on the […]
Mar, 22

GPU implementation of belief propagation using CUDA for cloud tracking and reconstruction

This paper describes an efficient CUDA-based GPU implementation of the belief propagation algorithm that can be used to speed up stereo image processing and motion tracking calculations without loss of accuracy. Preliminary results in using belief propagation to analyze satellite images of Hurricane Luis for real-time cloud structure and tracking are promising with speed-ups of […]
Mar, 21

GPU implemention of fast Gabor filters

With their parallel multi-core architecture, Programmable Graphics Processing Units (GPUs) are well suited for implementing biologically-inspired visual processing algorithms, such as Gabor filtering. We compare several GPU implementations of Gabor filtering. On the same graphics card (an NVIDIA GeForce 9800 GTX+) and for convolution kernel radii from 8 to 48 pixels, an algorithm that decomposes […]
Mar, 21

GPU-based password cracking

In this research the following question is answered: what should KPMG advise its clients regarding password length and complexity, now that GPU-based password cracking has become a reality? To be able to answer this question, tests with different tools and hashes were performed on a system with four high-end GPUs. The test system showed […]
Mar, 21

GPU Accelerated VLSI Design Verification

Today’s Very Large Scale Integrated-Circuit (VLSI) designs require intensive verification effort. However, traditional sequential verification solutions can no longer provide the scalability needed for future large designs. The so-called verification gap hinders the development of future VLSI products. In this paper, we review our recent work on accelerating typical VLSI verification tasks with modern GPUs. Our […]
Mar, 21

An evaluation of GPU acceleration for sparse reconstruction

Image processing applications typically parallelize well. This gives a developer interested in data throughput several different implementation options, including multiprocessor machines, general purpose computation on the graphics processor, and custom gate-array designs. Herein, we will investigate these first two options for dictionary learning and sparse reconstruction, specifically focusing on the K-SVD algorithm for dictionary learning […]


HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us:

contact@hpgu.org