Posts
Jul, 7
Comparison of Rectangular Matrix Multiplication with and without Border Conditions
Matrix multiplication algorithms are widely used for computation in almost every field, and many implementations exist for different platforms and programming models. In recent years, GPU devices have become powerful computational units that have entered the high-performance computing segment. In this paper we analyse two […]
Jul, 7
Solving 3D Anisotropic Elastic Wave Equations on Parallel GPU Devices
Efficiently modelling seismic datasets in complex 3D anisotropic media by solving the 3D elastic wave equation is an important challenge in computational geophysics. Using a stress-stiffness formulation on a regular grid, we present a 3D finite-difference time-domain (FDTD) solver using a 2nd-order temporal and 8th-order spatial accuracy stencil that leverages the massively parallel architecture of […]
Jul, 7
A Comparative Study of Neighborhood Filters for Artifact Reduction in Iterative Low-Dose CT
Iterative CT algorithms have become increasingly popular in recent years. They have been found useful when the projections are limited in number, irregularly spaced, or noisy, conditions that are often encountered in low-dose CT imaging. One way to cope with the associated streak and noise artifacts is to interleave a regularization objective into the iterative reconstruction […]
Jul, 7
CrowdCL: Web-Based Volunteer Computing with WebCL
We present CrowdCL, an open-source framework for the rapid development of volunteer computing and OpenCL applications on the web. Drawing inspiration from existing GPU libraries like PyCUDA, CrowdCL provides an abstraction layer for WebCL aimed at reducing boilerplate and improving code readability. CrowdCL also provides developers with a framework to easily run computations in the […]
Jul, 7
Comparative study of parallel programming models for multicore computing
Shared-memory multi-core processor technology has developed drastically, with faster processors and an increasing number of processors per chip. This architecture challenges programmers to write code that scales across these many cores to exploit the full computational power of these machines. Shared-memory parallel programming paradigms such as OpenMP and Intel Threading Building Blocks (TBB) […]
Jul, 5
Optimize Overall System Performance Through Workload Sequencing for GPUs Data Offloading
With the proliferation of general-purpose computation, GPUs are becoming extremely important for improving the performance of many computing systems, including embedded systems. Running massively parallel kernels on GPUs is challenging for a system’s overall performance, especially when a large number of workloads (kernels) run together. In this paper, we establish a mechanism to […]
Jul, 5
Hybrid Acceleration of a Molecular Dynamics Simulation Using Short-Ranged Potentials
Molecular dynamics simulations are a very useful tool for studying the behavior and interaction of atoms and molecules in chemical and bio-molecular systems. With the fast-rising complexity of such simulations, hybrid systems with both multi-core processors (CPUs) and multiple graphics processing units (GPUs) are becoming more and more popular. To obtain optimal performance this […]
Jul, 5
GPU-enabled Efficient Executions of Radiation Calculations in Climate Modelling
In this paper, we discuss the acceleration of a climate model known as the Community Earth System Model (CESM). The use of Graphics Processing Units (GPUs) to accelerate computationally intensive scientific applications is well known. This project attempts to extract the performance of GPUs to enable fast execution of CESM to obtain better model […]
Jul, 5
Triangular mesh simplification on the GPU
We present a simplification algorithm for triangular meshes, implemented on the GPU. The algorithm performs edge collapses driven by a quadric error metric. It uses data parallelism as provided by OpenCL and has no sequential segments in its main iterative structure in order to fully exploit the processing power of the GPU. Our implementation produces […]
Jul, 5
OpenCL for FPGAs: Prototyping a Compiler
Hardware acceleration using FPGAs has shown orders-of-magnitude reductions in the runtime of computationally intensive applications in comparison to traditional stand-alone computers [1]. This is possible because on an FPGA many computations can be performed at the same time in a truly parallel fashion. However, parallel computation at a hardware level requires a great deal of expertise, […]
Jul, 3
Physical modeling and high-performance GPU computing for characterization, interception, and disruption of hazardous near-Earth objects
Over the past few decades, both the scientific community and the general public have become increasingly aware that the Earth lives in a shooting gallery of small objects. We classify all of these asteroids and comets, known or unknown, that cross Earth’s orbit as near-Earth objects (NEOs). A look at our geologic history tells […]
Jul, 3
A GPU Implementation of Local Search Operators for Symmetric Travelling Salesman Problem
The Travelling Salesman Problem (TSP) is one of the most studied combinatorial optimization problems and is significant in many practical transportation applications. The TSP is an NP-hard problem and requires large computation power to be solved by exact algorithms. In the past few years, the fast development of general-purpose Graphics Processing Units (GPUs) […]