Posts

Jul, 25

PRAND: GPU accelerated parallel random number generation library: Using most reliable algorithms and applying parallelism of modern GPUs and CPUs

The library PRAND for pseudorandom number generation for modern CPUs and GPUs is presented. It contains both single-threaded and multi-threaded realizations of a number of modern and most reliable generators recently proposed and studied in [1,2,3,4,5] and the efficient SIMD realizations proposed in [6]. One of the useful features for using PRAND in parallel simulations […]
Jul, 24

Effects of Dynamic Voltage and Frequency Scaling on a K20 GPU

Improving energy efficiency is an ongoing challenge in HPC because of the ever-increasing need for performance coupled with power and economic constraints. Though GPU-accelerated heterogeneous computing systems are capable of delivering impressive performance, it is necessary to explore all available power-aware technologies to meet the inevitable energy efficiency challenge. In this paper, we experimentally study […]
Jul, 24

Scheduling by Work-Stealing in Hybrid Parallel Architectures

Parallel computing systems are now based on multicore CPUs and specialized coprocessors, such as GPUs, because traditional architectures have reached their limits. To obtain the expected performance in these systems, the workload must be distributed and redistributed efficiently through a scheduling technique such as work-stealing. This work aims […]
Jul, 24

Parallel Implementation of Texture Based Image Retrieval on The GPU

Most image processing algorithms are inherently parallel, so multithreaded processors are well suited to such applications. In huge image databases, image processing takes a very long time to run on a single-core processor because the algorithms execute in a single thread. Graphics Processing Units (GPUs) are more common in most image processing applications due to multithread execution […]
Jul, 24

Implementation of Filtering Beamforming Algorithms for Sonar Devices Using GPU

Beamforming is a signal processing technique used in sensor arrays to direct signal transmission or reception. A beamformer combines input signals in the array to achieve constructive interference at particular angles (beams) and destructive interference at other angles. According to the following facts: 1) Beamforming can be computationally intensive, so real-time sonar beamforming algorithms in sonar […]
Jul, 24

CDFC: Collision Detection Based on Fuzzy Clustering for Deformable Objects on GPU’s

We present a novel Collision Detection Based on Fuzzy Clustering for Deformable Objects on GPU’s (CDFC) technique to perform collision queries between rigid and/or deformable models. Our method can handle arbitrary deformations and even discontinuous ones. With our approach, we subdivide the scene into connected but totally independent parts by fuzzy clustering, and therefore, the […]
Jul, 22

Multi-core CUDA Architecture for Parallelization of Hierarchical Text Clustering

Text clustering is the problem of dividing text documents into groups, such that documents in the same group are similar to one another and different from documents in other groups. Because texts generally tend to form hierarchies, text clustering is best performed using a hierarchical clustering method. An important aspect while clustering large […]
Jul, 22

OpenCL simulations of two-fluid compressible flows with a random choice method

In this paper, we propose a new very simple numerical method for solving liquid-gas compressible flows. Such flows are difficult to simulate because classic conservative finite volume schemes generate pressure oscillations at the liquid-gas interface. We extend to several dimensions the random choice scheme that we have constructed in [2]. The extension is performed through […]
Jul, 22

Performance Evaluation of the Ocean-Land-Atmosphere Model Using Graphics Processing Units

The Ocean-Land-Atmosphere Model (OLAM) is an atmospheric model designed to simulate and cover the entire surface of the Earth. OLAM demands a great amount of processing in a simulation because of the large number of data structures used to represent the atmosphere. We therefore investigate in this paper how to increase performance using GPUs to compute the […]
Jul, 22

An overview of techniques for predicting the performance of GPU accelerated applications

The ability to predict the performance of applications in large-scale parallel systems is essential. One of the main incentives for this is the high cost of executing non-production tasks on these systems. An entity may also want to predict the performance in a system that does not yet exist. One popular alternative for increasing a […]
Jul, 22

Automatic Generation of FFT Libraries for GPU Platforms

Compilers introduce a set of optimizations to speed up source code. However, due to the variety of computation platforms, algorithm complexity and problem sizes, general-purpose compilers can fail to improve performance. The burden on library developers to write optimized libraries increases significantly, since user code relies on them for performance. This argument strengthens the […]
Jul, 21

Experimental Evaluation of Thread Distribution Effects on Multiple Output Errors in GPUs

Graphics Processing Units are highly prone to corruption by neutrons. Experimental results show that in the majority of cases a typical application like matrix multiplication is affected by multiple output errors. In this paper we evaluate how different thread distributions affect the occurrence of multiple output errors. The reported results and the performed architecture […]

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us:

contact@hpgu.org