high performance computing on graphics processing units: hgpu.org

Posts

Oct, 9

Accelerating Mean Shift Segmentation Algorithm on Hybrid CPU/GPU Platforms

Image segmentation is a very important step in many GIS applications. Mean shift is an advanced and versatile technique for clustering-based segmentation, and is favored in many cases because it is non-parametric. However, mean shift is very computationally intensive compared with simpler methods such as k-means. In this work, we present a hybrid design […]

CUDA

Oct, 9

Applying Genetic Algorithms to Tune Heterogeneous Platform Configurations

The present need to move towards heterogeneous architectures has been well established. This has increased the importance of parallelizing software to achieve good performance. The use of mixed architectures exponentially increases the programmer's need to understand the intricacies of the underlying hardware to achieve optimal speedup. Obtaining optimal performance on one such architecture is […]

OpenCL

Oct, 9

A PCG Implementation of an Elliptic Kernel in an Ocean Global Circulation Model Based on GPU Libraries

In this paper, an inverse preconditioner for the numerical solution of an elliptic Laplace problem of a global circulation ocean model is presented. The inverse preconditioning technique is adopted in order to efficiently compute the numerical solution of the elliptic kernel by using the Conjugate Gradient (CG) method. We show how the performance and […]

CUDA

Oct, 9

Streaming Parallel GPU Acceleration of Large-Scale filter-based Spiking Neural Networks

The arrival of graphics processing unit (GPU) cards suitable for massively parallel computing promises affordable large-scale neural network simulation previously only available at supercomputing facilities. While the raw numbers suggest that GPUs may outperform CPUs by at least an order of magnitude, the challenge is to develop fine-grained parallel algorithms to fully exploit the particulars of […]

OpenCL

Oct, 9

Efficient deconvolution methods for astronomical imaging: algorithms and IDL-GPU codes

The Richardson-Lucy method is the most popular deconvolution method in astronomy because it preserves the number of counts and the non-negativity of the original object. Regularization is, in general, obtained by an early stopping of Richardson-Lucy iterations. In the case of point-wise objects such as binaries or open star clusters, iterations can be pushed to […]

CUDA

Oct, 8

Learning hash codes for efficient content reuse detection

Content reuse is extremely common in user-generated media. Reuse detection serves as the basis for many applications. However, with the explosion of the Internet and the continuously growing use of user-generated media, the task becomes more critical and difficult. In this paper, we present a novel, efficient, and scalable approach to detect content […]

CUDA

Oct, 8

Realtime Two-Way Coupling of Meshless Fluids and Nonlinear FEM

In this paper, we present a novel method to couple Smoothed Particle Hydrodynamics (SPH) and nonlinear FEM to animate the interaction of fluids and deformable solids in real time. To accurately model the coupling, we generate proxy particles over the boundary of deformable solids to facilitate the interaction with fluid particles, and develop an efficient […]

CUDA

Oct, 8

Measuring the Performance of Realtime DSP Using Pure Data and GPU

In order to achieve greater amounts of computation while lowering the cost of artistic and scientific projects that rely on realtime digital signal processing techniques, it is interesting to study the performance of commodity parallel processing GPU cards coupled with commonly used software for realtime DSP. In this article, we describe the measurement of data […]

CUDA

Oct, 8

GPU Accelerated NIDS Search

A Network Intrusion Detection System (NIDS) analyzes network traffic for malicious activities and reports findings from events that intend to compromise the security of computers and other equipment. A NIDS looks into both the headers and payloads of network packets to identify possible intrusions. NIDS models that only use Central Processing Units (CPU) such as the […]

CUDA

Oct, 8

CUTE solutions for two-point correlation functions from large cosmological datasets

With the advent of new large galaxy surveys, which will produce enormous datasets with hundreds of millions of objects, new computational techniques are necessary in order to extract from them any two-point statistic, the computational time of which grows with the square of the number of objects to be correlated. Fortunately, technology now provides multiple […]

CUDA

Oct, 6

GPGPU accelerated optimization method of Interconnection Network Topology

The optimization of the irregular connection network of multiprocessor systems with distributed memory is an NP-complete problem, and solving it is generally a compute-intensive process. Graphics processing units provide large computational power at a very low price, allowing fine-grained parallelism. This work investigates the use of the GPU in the parallelisation of the […]

CUDA

Oct, 6

Techniques for Mapping Synthetic Aperture Radar Processing Algorithms to Multi-GPU Clusters

This paper presents a design for parallel processing of synthetic aperture radar (SAR) data using multiple Graphics Processing Units (GPUs). Our approach supports real-time reconstruction of a two-dimensional image from a matrix of echo pulses and their response values. Key to runtime efficiency is a partitioning scheme that divides the output image into tiles and […]

CUDA