Posts
Jul, 19
Parallel Image Segmentation Using Reduction-Sweeps On Multicore Processors and GPUs
In this paper we introduce the Reduction Sweep algorithm, a novel graph-based image segmentation algorithm that is designed for easy parallelization. It is based on a clustering approach focusing on local image characteristics. Each pixel is compared with its neighbors in an implicitly independent manner, and those deemed sufficiently similar according to a color criterion […]
Jul, 19
On Benchmarking the Matrix Multiplication Algorithm using OpenMP, MPI and CUDA Programming Languages
Parallel programming languages represent a common theme in the evolution of high performance computing (HPC) systems. There are several parallel programming languages that are directly associated with different HPC systems. In this paper, we compare the performance of three commonly used parallel programming languages, namely OpenMP, MPI and CUDA. Our performance evaluation of these languages […]
Jul, 17
A Software-Based Self Test of CUDA Fermi GPUs
Nowadays, Graphics Processing Units (GPUs) have become increasingly popular due to their high computational power and low prices. This makes them particularly suitable for high-performance computing applications, like data elaboration and financial computation. In these fields, highly efficient test methodologies are mandatory. One of the most effective ways to detect and localize hardware faults in […]
Jul, 17
Parallelization the Job-shop Problem on Distributed and Shared Memory Architectures
The paper presents a parallel algorithm for solving the job-shop scheduling problem. The algorithm is implemented on distributed-memory multicomputers, with each machine using a CPU-GPU shared-memory architecture, so that the work completes as quickly as possible. The algorithm is based on a branching approach to searching. The […]
Jul, 17
Starchart: Hardware and Software Optimization Using Recursive Partitioning Regression Trees
Graphics processing units (GPUs) are in increasingly wide use, but significant hurdles lie in selecting the appropriate algorithms, runtime parameter settings, and hardware configurations to achieve power and performance goals with them. Exploring hardware and software choices requires time-consuming simulations or extensive real-system measurements. While some auto-tuning support has been proposed, it is often narrow […]
Jul, 17
Early Experiences With The OpenMP Accelerator Model
A recent trend in mainstream computer nodes is the combined use of general-purpose multicore processors and specialized accelerators such as GPUs and DSPs in order to achieve better performance and to reduce power consumption. To support this trend, the OpenMP Language Committee has approved a set of extensions to OpenMP (referred to as the OpenMP […]
Jul, 17
Parallel heterogeneous Branch and Bound algorithms for multi-core and multi-GPU environments
Branch and Bound (B&B) algorithms are attractive for solving combinatorial optimization problems (COPs) to optimality by exploring a tree-based search space. Nevertheless, they are highly time-intensive when dealing with large problem instances (e.g. Taillard’s FSP benchmarks), even when using grid computing [Mezmaz et al., IEEE IPDPS’2007]. Massively parallel computing supplied through today’s heterogeneous (GPU-enhanced multicore) platforms […]
Jul, 16
Data Structures and Algorithms for Counting Problems on Graphs using GPU
The availability and utility of large numbers of Graphics Processing Units (GPUs) have enabled parallel computations using extensive multi-threading. Sequential access to global memory and contention at the size-limited shared memory have been the main impediments to fully exploiting potential performance in architectures having a massive number of GPUs. After performing an extensive study of data structures […]
Jul, 16
A New GPU-based Approach to the Shortest Path Problem
The Single-Source Shortest Path (SSSP) problem arises in many different fields. In this paper we present a GPU-based version of the Crauser et al. SSSP algorithm. Our work significantly speeds up the computation of the SSSP, not only with respect to the CPU-based version, but also to other state-of-the-art GPU implementations based on Dijkstra, due […]
Jul, 16
Coupling Lattice Boltzmann Gas and Level Set Method for Simulating Free Surface Flow in GPU/CUDA Environment
We present here a proof-of-concept of a novel, efficient method for modelling liquid/gas interface dynamics. Our approach consists in coupling the lattice Boltzmann gas (LBG) and the level set (LS) methods. The inherent parallel character of LBG accelerated by level sets is the principal advantage of our approach over similar particle-based solvers. Consequently, […]
Jul, 16
Object support for OpenMP-style programming of GPU clusters in Java
For scientists, it is advantageous to use a high level of abstraction for programming their simulations, so that they can focus on the problem at hand instead of struggling with low-level details. However, current HPC clusters with multiple GPUs per node only offer explicit communication to and from the GPUs, require manual work to keep […]
Jul, 16
Efficient algorithms for the realistic simulation of fluids
Nowadays there is great demand for realistic simulations in the computer graphics field. Physically-based animations are commonly used, and one of the more complex problems in this field is fluid simulation, more so if real-time applications are the goal. Videogames, in particular, resort to different techniques that, in order to represent fluids, just simulate the […]