Posts
May, 29
Efficient GPU-based Graph Cuts for Stereo Matching
Although graph cuts (GC) is widely used in many computer vision problems, slow execution time due to its high complexity hinders wider usage. A manycore solution using a Graphics Processing Unit (GPU) may solve this problem. However, conventional GC implementations do not fully exploit the GPU’s computing power. To address this issue, a new GC algorithm which is […]
May, 29
Parallelization of Mesh Contraction and Fairing using OpenCL
We propose a parallel method for computing local Laplacian curvature flows for triangular meshes. The Laplace operator is widely used in mesh processing for mesh fairing, noise removal, or curvature estimation. If the Laplacian flow is used in a global sense, constraining a whole mesh with an iterative weighted linear system, it can be used even for […]
May, 28
Effects of Concurrency Techniques and Algorithm Performance: A Comparative Analysis of Single-Threaded, Multi-Threaded, and GPGPU Programming Techniques
Deployment of parallel architectures in computing systems is increasing. In this paper we study the performance effects of a variety of programming techniques and technologies that utilize these parallel architectures as applied to example algorithms. We demonstrate that algorithms, which are highly parallel in nature, gain significant performance increases through proper application of both parallel […]
May, 28
MATLAB Medical Images Classification on Graphics Processors
Due to their massively parallel hardware design, graphics processors can easily beat ordinary CPUs in applications which involve large amounts of data. Considering their great potential, the objective of this paper is to continue previous work and optimize the speed and efficiency of texture and fractal analysis, as used for medical images classification processes for […]
May, 28
Power Modeling and Optimization for GPGPUs
State-of-the-art General-Purpose computing on Graphics Processing Units (GPGPU) is facing a severe power challenge due to the increasing number of cores placed on a chip with decreasing feature size. In order to hide long-latency operations, GPGPU employs fine-grained multi-threading among numerous active threads, leading to sizeable register files with massive power consumption. […]
May, 28
On Leveraging GPUs for Security: discussing k-anonymity and pattern matching
In recent years the need to solve complex problems that require large computing resources in shorter time has especially arisen. Some of these in the scientific field are: weather forecast, seismic simulations, chemical reactions simulation and studies on the human genome [1]. All of them belong to the "Grand Challenge Problems" set. As can be […]
May, 28
Analysis of Parallel Montgomery Multiplication in CUDA
For a given level of security, elliptic curve cryptography (ECC) offers improved efficiency over classic public key implementations. Point multiplication is the most common operation in ECC and, consequently, any significant improvement in performance will likely require accelerating point multiplication. In ECC, the Montgomery algorithm is widely used for point multiplication. The primary purpose […]
May, 27
Performance Portability in Accelerated Parallel Kernels
Heterogeneous architectures, by definition, include multiple processing components with very different microarchitectures and execution models. In particular, computing platforms from supercomputers to smartphones can now incorporate both CPU and GPU processors. Disparities between CPU and GPU processor architectures have naturally led to distinct programming models and development patterns for each component. Developers for a specific system decompose […]
May, 27
A Performance Modeling and Optimization Analysis Tool for Sparse Matrix-Vector Multiplication on GPUs
This paper presents a performance modeling and optimization analysis tool to predict and optimize the performance of sparse matrix-vector multiplication (SpMV) on GPUs. We make the following contributions: (1) We present an integrated analytical and profile-based performance modeling to accurately predict the kernel execution times of CSR, ELL, COO, and HYB SpMV kernels. Our proposed […]
May, 27
Rapid Computation of Sodium Bioscales Using GPU-Accelerated Image Reconstruction
Quantitative sodium magnetic resonance imaging permits noninvasive measurement of the tissue sodium concentration (TSC) bioscale in the brain. Computing the TSC bioscale requires reconstructing and combining multiple datasets acquired with a non-Cartesian acquisition that highly oversamples the center of k-space. Even with an optimized implementation of the algorithm to compute TSC, the overall processing time […]
May, 27
Trapping of giant-planet cores – I. vortex aided trapping at the outer dead zone edge
In this paper the migration of a 10 Earth mass planetary core is investigated at the outer boundary of the dead zone of a protoplanetary disc by means of 2D hydrodynamic simulations done with the GPU version of the FARGO code. In the dead zone the effective viscosity is greatly reduced due to the disc […]
May, 27
Scaling Radio Astronomy Signal Correlation on Heterogeneous Supercomputers Using Various Data Distribution Methodologies
Next generation radio telescopes will require orders of magnitude more computing power to provide a view of the universe with greater sensitivity. In the initial stages of the signal processing flow of a radio telescope, signal correlation is one of the largest challenges in terms of handling huge data throughput and intensive computations. We implemented […]

