Posts
May, 29
Job Parallelism using Graphical Processing Unit individual Multi-Processors and Highly Localised Memory
Graphical Processing Units (GPUs) are usually programmed to provide data-parallel acceleration to a host processor. Modern GPUs typically have an internal multi-processor (MP) structure that can be exploited in an unusual way to offer semi-independent task parallelism, provided the MPs can operate within their own localised memory and apply data-parallelism to their own problem subset. We […]
May, 29
On Migration and Consolidation of VMs in Hybrid CPU-GPU Environments
In this research, we investigate a dynamic energy-aware management framework for executing independent workloads (e.g., bag-of-tasks) on hybrid CPU-GPU PARA-computing platforms. The aim is to optimize the concurrent execution of workloads on appropriate computing resources while balancing the use of solely virtual resources, solely physical resources, or a hybrid selection of both, to achieve […]
May, 29
Finite-difference time-domain simulations of metamaterials
Metamaterials are periodic structures made of many identical scattering objects that are stationary and small compared to the wavelength of the applied electromagnetic wave, so that, when combined with different elements, these materials have the potential to couple to the applied electromagnetic wave without modifying the structure. Due to their unusual properties that […]
May, 29
Efficient GPU-based Graph Cuts for Stereo Matching
Although graph cuts (GC) are widely used in many computer vision problems, their slow execution time, caused by high complexity, hinders wider use. A manycore solution using a Graphics Processing Unit (GPU) may solve this problem. However, conventional GC implementations do not fully exploit the GPU’s computing power. To address this issue, a new GC algorithm which is […]
May, 29
Parallelization of Mesh Contraction and Fairing using OpenCL
We propose a parallel method for computing local Laplacian curvature flows for triangular meshes. The Laplace operator is widely used in mesh processing for mesh fairing, noise removal, or curvature estimation. If the Laplacian flow is used in a global sense, constraining a whole mesh with an iterative weighted linear system, it can be used even for […]
May, 28
Effects of Concurrency Techniques and Algorithm Performance: A Comparative Analysis of Single-Threaded, Multi-Threaded, and GPGPU Programming Techniques
Deployment of parallel architectures in computing systems is increasing. In this paper we study the performance effects of a variety of programming techniques and technologies that utilize these parallel architectures, as applied to example algorithms. We demonstrate that algorithms that are highly parallel in nature gain significant performance increases through proper application of both parallel […]
May, 28
MATLAB Medical Images Classification on Graphics Processors
Due to their massively parallel hardware design, graphics processors can easily beat ordinary CPUs in applications that involve large amounts of data. Considering their great potential, the objective of this paper is to continue previous work and optimize the speed and efficiency of texture and fractal analysis, as used in medical image classification processes for […]
May, 28
Power Modeling and Optimization for GPGPUs
State-of-the-art General-Purpose computing on Graphics Processing Units (GPGPU) faces a severe power challenge due to the increasing number of cores placed on a chip with decreasing feature size. To hide long-latency operations, GPGPUs employ fine-grained multi-threading among numerous active threads, leading to sizeable register files with massive power consumption. […]
May, 28
On Leveraging GPUs for Security: discussing k-anonymity and pattern matching
In recent years, the need to solve complex problems that require large computing resources in less time has become especially pressing. Some of these in the scientific field are weather forecasting, seismic simulations, simulation of chemical reactions, and studies of the human genome [1]. All of them belong to the set of "Grand Challenge Problems". As can be […]
May, 28
Analysis of Parallel Montgomery Multiplication in CUDA
For a given level of security, elliptic curve cryptography (ECC) offers improved efficiency over classic public key implementations. Point multiplication is the most common operation in ECC and, consequently, any significant improvement in performance will likely require accelerating point multiplication. In ECC, the Montgomery algorithm is widely used for point multiplication. The primary purpose […]
May, 27
Performance Portability in Accelerated Parallel Kernels
Heterogeneous architectures, by definition, include multiple processing components with very different microarchitectures and execution models. In particular, computing platforms from supercomputers to smartphones can now incorporate both CPU and GPU processors. Disparities between CPU and GPU processor architectures have naturally led to distinct programming models and development patterns for each component. Developers for a specific system decompose […]
May, 27
A Performance Modeling and Optimization Analysis Tool for Sparse Matrix-Vector Multiplication on GPUs
This paper presents a performance modeling and optimization analysis tool to predict and optimize the performance of sparse matrix-vector multiplication (SpMV) on GPUs. We make the following contributions: (1) We present an integrated analytical and profile-based performance model to accurately predict the kernel execution times of CSR, ELL, COO, and HYB SpMV kernels. Our proposed […]