high performance computing on graphics processing units: hgpu.org

Posts

Feb, 24

2014 3rd International Conference on System Engineering and Modeling, ICSEM 2014

Submission Deadline: 2014-03-20. Publication: All accepted papers of ICSEM 2014 will be published in the International Journal of Modeling and Optimization (ISSN: 2010-3697), included in the Engineering & Technology Digital Library, and indexed by Electronic Journals Library, ProQuest, Google Scholar, Crossref, DOAJ and EI (INSPEC, IET). Call for Papers: Information Systems Engineering IS development […]

Feb, 24

2014 4th International Conference on Computer Communication and Management, ICCCM 2014

Publication: All accepted papers of ICCCM 2014 will be published in the following journals with ISSN: * International Journal of Computer and Communication Engineering (ISSN: 2010-3743), which will be indexed by Engineering & Technology Digital Library, Google Scholar, ProQuest, and Crossref. * Journal of Advanced Management Science (ISSN: 2168-0787), which will be indexed by Ulrich’s Periodicals Directory, […]

Feb, 23

Real Time Face Detection on GPU Using OpenCL

This paper presents a novel approach for real-time face detection using heterogeneous computing. The algorithm uses local binary patterns (LBP) as the feature vector for face detection. OpenCL is used to accelerate the code on the GPU [1]. Illumination invariance is achieved using gamma correction and Difference of Gaussians (DoG) to make the algorithm robust against varying lighting […]

OpenCL
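As a rough illustration of the LBP feature mentioned in the excerpt above, the following is a minimal CPU sketch in plain Python of the basic 3x3 local binary pattern and its histogram. It is not the paper's OpenCL implementation; the function names, the clockwise neighbour order, and the `>=` comparison are assumptions chosen for illustration.

```python
def lbp_code(img, r, c):
    """Return the 8-bit basic LBP code of the interior pixel at (r, c).

    img is a list of rows of grey values. Each of the 8 neighbours
    contributes one bit: 1 if neighbour >= centre, else 0.
    The clockwise order starting at the top-left is an assumed convention.
    """
    center = img[r][c]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """Histogram of LBP codes over all interior pixels (256 bins)."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist
```

Each pixel's code depends only on its own 3x3 neighbourhood, which is why this kind of feature maps naturally onto one GPU work-item per pixel.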

Feb, 23

Accelerating Content-Based Image Retrieval via GPU-adaptive Index Structure

A tremendous amount of work has been conducted in content-based image retrieval (CBIR) on designing efficient index structures to accelerate the retrieval process. Most of this work improves retrieval efficiency via complex index structures, and little of it takes into account the parallel implementation of the algorithm on the underlying hardware. This makes the existing index structures suffer from […]

CUDA

Feb, 23

Multi-Elimination ILU Preconditioners on GPUs

Iterative solvers for sparse linear systems often benefit from using preconditioners. While there are implementations for many iterative methods that leverage the computing power of accelerators, porting the latest developments in preconditioners to accelerators has been challenging. In this paper we develop a self-adaptive multi-elimination preconditioner for graphics processing units (GPUs). The preconditioner is based […]

CUDA

Feb, 23

WPA/WPA2 Password Security Testing using Graphics Processing Units

This thesis focuses on testing WPA/WPA2 password strength. Recently, due to progress in computing power and technology, new factors must be taken into account when choosing a secure WPA/WPA2 password. A study of the security of currently deployed passwords is reported here. Harnessing the computational power of a single and old […]

CUDA

Feb, 23

Parallel Spectral Graph Partitioning on CUDA

Parallelization of scientific problems is a challenging task with a wide application area in distributed programming, cloud computing and, more recently, GPGPU. Spectral graph partitioning is a widely used technique in many fields such as image processing, scientific computing and machine learning. In this study we analyze spectral graph partitioning subroutines on a […]

CUDA

Feb, 21

Fast Boolean Calculations Using the GPU

The growing number of Boolean variables requires very efficient approaches to solve the given tasks. We explore the utilization of the GPU for fast parallel Boolean calculations in this paper. Hundreds of processor cores of the GPU offer a significant potential for improvements. Constraints in their application may restrict the reachable speedup. This paper summarizes […]

CUDA

Feb, 21

Performance Impact of Data Layout on the GPU-accelerated IDW Interpolation

This paper evaluates the performance impact of different data layouts on GPU-accelerated IDW interpolation. First, we redesign and improve our previous GPU implementation, which exploited the CUDA Dynamic Parallelism (CDP) feature. We then implement three GPU versions, i.e., the naive version, the tiled version, and the […]

CUDA
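For readers unfamiliar with the terms in the excerpt above, the following is a minimal CPU sketch of inverse distance weighting (IDW) over two data layouts. The excerpt does not name the layouts it compares; array-of-structures (AoS) and structure-of-arrays (SoA) are assumed here as two common choices, and the function names and the `power` parameter are hypothetical. The interpolated value is the weighted mean of the known values, with weight 1/d^power for a point at distance d.

```python
def idw_aos(points, qx, qy, power=2.0):
    """IDW at (qx, qy); points is a list of (x, y, z) tuples (AoS layout)."""
    num = den = 0.0
    for x, y, z in points:
        d2 = (x - qx) ** 2 + (y - qy) ** 2
        if d2 == 0.0:
            return z  # query coincides with a data point
        w = 1.0 / d2 ** (power / 2.0)  # equals 1 / d**power
        num += w * z
        den += w
    return num / den


def idw_soa(xs, ys, zs, qx, qy, power=2.0):
    """Same computation; coordinates and values in separate arrays (SoA layout)."""
    num = den = 0.0
    for i in range(len(xs)):
        d2 = (xs[i] - qx) ** 2 + (ys[i] - qy) ** 2
        if d2 == 0.0:
            return zs[i]
        w = 1.0 / d2 ** (power / 2.0)
        num += w * zs[i]
        den += w
    return num / den
```

Both functions return the same result; only the memory layout differs. On a GPU, the layout determines whether neighbouring threads read neighbouring addresses, which is the kind of effect the paper sets out to measure.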

Feb, 21

Computing least squares condition numbers on hybrid multicore/GPU systems

This paper presents an efficient computation of least squares conditioning or estimates of it. We present performance results using new routines built on top of the multicore-GPU library MAGMA. This set of routines is based on an efficient computation of the variance-covariance matrix for which, to our knowledge, there is no implementation in current public domain […]

CUDA

Feb, 21

MICA: A fast short-read aligner that takes full advantage of Intel Many Integrated Core Architecture (MIC)

BACKGROUND: Short-read aligners have recently gained a lot of speed by exploiting the massive parallelism of GPUs. An emerging alternative to the GPU is Intel MIC; the supercomputer Tianhe-2, currently top of the TOP500, is built with 48,000 MIC boards to offer ~55 PFLOPS. The CPU-like architecture of MIC allows CPU-based software to be parallelized easily; however, […]

Feb, 21

Effects of Easy Hybrid Parallelization with CUDA for Numerical-Atomic-Orbital Density Functional Theory Calculation

We modified an MPI-friendly density functional theory (DFT) source code to use hybrid parallelization including CUDA. Our objective is to find out how simple conversions within the hybrid parallelization with mid-range GPUs affect DFT code not originally suited to CUDA. We established several rules of hybrid parallelization for numerical-atomic-orbital (NAO) DFT codes. The test was performed […]

CUDA