Posts

Feb, 23

Real Time Face Detection on GPU Using OpenCL

This paper presents a novel approach to real-time face detection using heterogeneous computing. The algorithm uses local binary patterns (LBP) as the feature vector for face detection. OpenCL is used to accelerate the code on the GPU [1]. Illumination invariance is achieved using gamma correction and Difference of Gaussians (DoG) to make the algorithm robust against varying lighting […]
Feb, 23

Accelerating Content-Based Image Retrieval via GPU-adaptive Index Structure

A tremendous amount of work in content-based image retrieval (CBIR) has gone into designing efficient index structures to accelerate the retrieval process. Most approaches improve retrieval efficiency via complex index structures, and few take into account the parallel implementation of the algorithm on the underlying hardware. This makes the existing index structures suffer from […]
Feb, 23

Multi-Elimination ILU Preconditioners on GPUs

Iterative solvers for sparse linear systems often benefit from using preconditioners. While there are implementations for many iterative methods that leverage the computing power of accelerators, porting the latest developments in preconditioners to accelerators has been challenging. In this paper we develop a self-adaptive multi-elimination preconditioner for graphics processing units (GPUs). The preconditioner is based […]
Feb, 23

WPA/WPA2 Password Security Testing using Graphics Processing Units

This thesis focuses on testing WPA/WPA2 password strength. Recently, due to progress in computing power and technology, new factors must be taken into account when choosing a secure WPA/WPA2 password. A study of the security of currently deployed passwords is reported here. Harnessing the computational power of a single and old […]
Feb, 23

Parallel Spectral Graph Partitioning on CUDA

Parallelization of scientific problems is a challenging task with a wide application area in distributed programming, cloud computing and, more recently, GPGPU. Spectral graph partitioning is a widely used technique in many fields such as image processing, scientific computing and machine learning. In this study we analyze spectral graph partitioning subroutines on a […]
Feb, 21

Fast Boolean Calculations Using the GPU

The growing number of Boolean variables requires very efficient approaches to solve the given tasks. We explore the utilization of the GPU for fast parallel Boolean calculations in this paper. Hundreds of processor cores of the GPU offer a significant potential for improvements. Constraints in their application may restrict the reachable speedup. This paper summarizes […]
Feb, 21

Performance Impact of Data Layout on the GPU-accelerated IDW Interpolation

This paper focuses on evaluating the performance impact of different data layouts on GPU-accelerated IDW interpolation. First, we redesign and improve our previous GPU implementation, which exploited CUDA Dynamic Parallelism (CDP). We then implement three GPU versions, i.e., the naive version, the tiled version, and the […]
Feb, 21

Computing least squares condition numbers on hybrid multicore/GPU systems

This paper presents an efficient computation of least squares conditioning or estimates of it. We report performance results using new routines on top of the multicore-GPU library MAGMA. This set of routines is based on an efficient computation of the variance-covariance matrix for which, to our knowledge, there is no implementation in current public domain […]
Feb, 21

MICA: A fast short-read aligner that takes full advantage of Intel Many Integrated Core Architecture (MIC)

BACKGROUND: Short-read aligners have recently gained a lot of speed by exploiting the massive parallelism of GPUs. An emerging alternative to the GPU is Intel MIC; the Tianhe-2 supercomputer, currently top of the TOP500 list, is built with 48,000 MIC boards to offer ~55 PFLOPS. The CPU-like architecture of MIC allows CPU-based software to be parallelized easily; however, […]
Feb, 21

Effects of Easy Hybrid Parallelization with CUDA for Numerical-Atomic-Orbital Density Functional Theory Calculation

We modified an MPI-friendly density functional theory (DFT) source code to use hybrid parallelization including CUDA. Our objective is to find out how simple conversions within the hybrid parallelization with mid-range GPUs affect DFT code not originally suited to CUDA. We established several rules of hybrid parallelization for numerical-atomic-orbital (NAO) DFT codes. The test was performed […]
Feb, 19

X-ray CT on the GPU

Nondestructive testing (NDT) is a collection of analysis techniques used by scientists and technologists as a way of analyzing the interior of an object without damaging the object. Since the analysis is done without damaging the object, NDT is an extremely valuable technique used in various industries for troubleshooting and research. CNDE has a long […]
Feb, 19

Large-Scale Geospatial Processing on Multi-Core and Many-Core Processors: Evaluations on CPUs, GPUs and MICs

Geospatial processing, such as queries based on point-to-polyline shortest distance and point-in-polygon tests, is fundamental to many scientific and engineering applications, such as post-processing large-scale environmental and climate model outputs and analyzing traffic and travel patterns from massive GPS collections in transportation engineering and urban studies. Commodity parallel hardware, such as multi-core CPUs, many-core GPUs […]

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us:

contact@hgpu.org