Posts
Dec, 4
Fingerprint grid enhancement on GPU
This paper presents an optimized GPU (Graphics Processing Unit) implementation of fingerprint image enhancement using a Gabor filter-bank based algorithm. Given a batch of fingerprint images, we apply the Gabor filter bank and compute image variances of the convolution responses. We then select parts of these responses and compose the final enhanced batches. The algorithm […]
Dec, 3
Multithreaded Transposition of Square Matrices with Common Code for Intel Xeon Processors and Intel Xeon Phi Coprocessors
In-place matrix transposition, a standard operation in linear algebra, is a memory bandwidth-bound operation. The theoretical maximum performance of transposition is the memory copy bandwidth. However, due to non-contiguous memory access in the transposition operation, practical performance is usually lower. The ratio of the transposition rate to the memory copy bandwidth is a measure of […]
Dec, 3
GPU and CPU Cooperative Accelerated Road Detection
In this paper, we propose a fast and robust unstructured road detection method that integrates GPU (Graphics Processing Unit) and CPU implementations. In order to ensure the robustness of the algorithm, a BP (Back Propagation) neural network is employed to learn the color features from a set of samples of both road and off-road regions, […]
Dec, 3
SESH framework: A Space Exploration Framework for GPU Application and Hardware Codesign
Graphics processing units (GPUs) have become increasingly popular accelerators in supercomputers, and this trend is likely to continue. Given their disruptive architecture and variety of optimization options, it is often desirable to understand the dynamics between potential application transformations and potential hardware features when designing future GPUs for scientific workloads. However, current codesign efforts […]
Dec, 3
Real-time High Resolution Fusion of Depth Maps on GPU
A system for live, high-quality surface reconstruction using a single moving depth camera on commodity hardware is presented. High accuracy and a real-time frame rate are achieved by utilizing graphics hardware computing capabilities via OpenCL and by using a sparse data structure for volumetric surface representation. Depth sensor pose is estimated by combining serial texture […]
Dec, 3
Accelerated Event-by-Event Neutrino Oscillation Reweighting with Matter Effects on a GPU
Oscillation probability calculations are becoming increasingly CPU intensive in modern neutrino oscillation analyses. Because individual events in a Monte Carlo sample are reweighted independently, the task lends itself to parallel implementation on a Graphics Processing Unit. The library "Prob3++" was ported to the GPU using the CUDA C API, allowing for large scale parallelized calculations of neutrino […]
Nov, 30
GenBase: A Complex Analytics Genomics Benchmark
This paper introduces a new benchmark designed to test database management system (DBMS) performance on a mix of data management tasks (joins, filters, etc.) and complex analytics (regression, singular value decomposition, etc.). Such mixed workloads are prevalent in a number of application areas, including most science workloads and web analytics. As a specific use case, […]
Nov, 30
Fractal Based Method on Hardware Acceleration for Natural Environments
Natural scenes from the real world are highly complex, so the modeling and rendering of natural shapes, like mountains, trees and clouds, are very difficult and time-consuming and require a huge amount of memory. Intuitively, the critical characteristics of natural scenes are their self-similarity properties. Motivated by the self-similarity feature of the […]
Nov, 30
Digitize Your Body and Action in 3-D at Over 10 FPS: Real Time Dense Voxel Reconstruction and Marker-less Motion Tracking via GPU Acceleration
In this paper, we present an approach to reconstruct 3-D human motion from multiple cameras and track the human skeleton using the reconstructed human 3-D point (voxel) cloud. We use an improved and more robust algorithm, probabilistic shape from silhouette, to reconstruct the human voxels. In addition, the annealed particle filter is applied for tracking, where the measurement […]
Nov, 30
Design and Storage Optimization of GPU-based Parallel Program of Image Registration for Remote Sensing
Image registration is a crucial step in many remote sensing applications. As the scale of data and the complexity of algorithms keep growing, image registration faces great challenges in processing speed. In recent years, the computing capacity of GPUs has improved greatly. Taking advantage of GPUs for general-purpose computation, we research […]
Nov, 30
GPU Accelerated Parallel Occupancy Voxel Based ICP for Position Tracking
Tracking the position of a robot in an unknown environment is an important problem in robotics. Iterative closest point algorithms using range data are commonly used for position tracking, but can be computationally intensive. We describe a highly parallel occupancy grid iterative closest point position tracking algorithm, designed for use on a GPU, that uses […]
Nov, 29
Highly Optimized Full GPU-Acceleration of Non-hydrostatic Weather Model SCALE-LES
SCALE-LES is a non-hydrostatic weather model developed at RIKEN, Japan. It is intended to be a global high-resolution model that would be scaled to exascale systems. This paper introduces the full GPU acceleration of all SCALE-LES modules. Moreover, the paper demonstrates strategies for handling the unique challenges of accelerating SCALE-LES using GPUs. The […]