Posts
Jun, 27
Explicit Shallow Water Simulations on GPUs: Guidelines and Best Practices
Graphics processing units have now been used for scientific calculations for over a decade, going from early proofs of concept to industrial use today. The underlying reason is that graphics processors are far more powerful than CPUs when it comes to both floating point operations and memory bandwidth, illustrated by the fact that three of the top […]
Jun, 27
Software Performance Analysis with Parallel Programming Approaches
Software performance engineering (SPE) is a systematic and quantitative approach to constructing software systems that meet performance objectives such as response time, throughput, scalability, and resource utilization. Optimization is a major concern in achieving these objectives. It is performed at run time or in the design phase. This paper proposes the coding practices in […]
Jun, 27
Solving Molecular Distance Geometry Problems in OpenCL
We focus on the following computational chemistry problem: Given a subset of the exact distances between atoms, reconstruct the three-dimensional position of each atom in the given molecule. The distance matrix is generally sparse. This problem is both important and challenging. Our contribution is a novel combination of two known techniques (parallel breadth-first search and […]
Jun, 27
OpenCL Floating Point Software on Heterogeneous Architectures – Portable or Not?
OpenCL is an emerging platform for parallel computing that promises portability of applications across different architectures. This promise is seriously undermined, however, by the frequent use of floating-point arithmetic in scientific applications. Floating-point computations can yield vastly different results on different architectures (even IEEE 754-compliant ones), potentially causing changes in control flow and […]
Jun, 27
A Parallel Monte Carlo Code for Simulating Collisional N-body Systems
We present a new parallel code for computing the dynamical evolution of collisional N-body systems with up to N~10^7 particles. Our code is based on the Hénon Monte Carlo method for solving the Fokker-Planck equation, and makes assumptions of spherical symmetry and dynamical equilibrium. The principal algorithmic developments involve optimizing data structures, and the […]
Jun, 26
GreenGPU: A Holistic Approach to Energy Efficiency in GPU-CPU Heterogeneous Architectures
In recent years, GPU-CPU heterogeneous architectures have been increasingly adopted in high performance computing because of their high computational throughput. However, energy consumption is a major concern due to the large scale of such systems. There are a few existing efforts that try to lower the energy consumption of […]
Jun, 26
Multi-GPU Island-Based Genetic Algorithm for Solving the Knapsack Problem
This paper introduces a novel implementation of the genetic algorithm exploiting a multi-GPU cluster. The proposed implementation employs an island-based genetic algorithm where every GPU evolves a single island. The individuals are processed by CUDA warps, which enables the solution of large knapsack instances and eliminates undesirable thread divergence. The MPI interface is used to […]
Jun, 26
Compiling a high-level language for GPUs: (via language support for architectures and compilers)
Languages such as OpenCL and CUDA offer a standard interface for general-purpose programming of GPUs. However, with these languages, programmers must explicitly manage numerous low-level details involving communication and synchronization. This burden makes programming GPUs difficult and error-prone, rendering these powerful devices inaccessible to most programmers. We desire a higher-level programming model that makes GPUs […]
Jun, 26
GPU-based Cloud Computing for Comparing the Structure of Protein Binding Sites
In this paper, we present a novel approach for using a GPU-based Cloud computing infrastructure to efficiently perform a structural comparison of protein binding sites. The original CPU-based Java version of a recent graph-based algorithm called SEGA has been rewritten in OpenCL to run on NVIDIA GPUs in parallel on a set of Amazon EC2 […]
Jun, 26
Evaluation of likelihood functions on CPU and GPU devices
We describe parallel implementations of an algorithm for evaluating the likelihood function used in data analysis. The implementations run on the CPU, on the GPU, and on both devices cooperatively (hybrid). The CPU and GPU implementations are based on OpenMP and OpenCL, respectively. The hybrid implementation allows the application to run also on multi-GPU systems (not necessarily […]
Jun, 25
A Parallel Algorithm Development Model for the GPU Architecture
Parallel computing has been in use for decades, and throughout that time many researchers have sought to define a model for algorithm design on such platforms. Valiant developed a model for parallel computing, which was later extended to include multi-core processors, but it still may not be best suited for the unique GPU architecture. With […]
Jun, 25
Improving GPU Simulations of Spiking Neural P Systems
In this work we present further extensions and improvements of a simulator for Spiking Neural P systems (SNP systems, for short) on graphics processing units (GPUs, for short). Using previous results on representing SNP system computations using linear algebra, we analyze and implement a computation simulation algorithm on the GPU. A two-level parallelism is introduced for […]