Posts
Feb, 9
Extending the SkelCL Skeleton Library for Stencil Computations on Multi-GPU Systems
The implementation of stencil computations on modern, massively parallel systems with GPUs and other accelerators currently relies on manually tuned coding using low-level approaches like OpenCL and CUDA, which makes it a complex, time-consuming, and error-prone task. We describe how stencil computations can be programmed in our SkelCL approach that combines a high level of programming abstraction […]
Feb, 9
A Scalable Multi-Path Microarchitecture for Efficient GPU Control Flow
Graphics processing units (GPUs) are increasingly used for non-graphics computing. However, applications with divergent control flow incur performance degradation on current GPUs. These GPUs implement the SIMT execution model by serializing the execution of different control flow paths encountered by a warp. This serialization can mask thread level parallelism among the scalar threads comprising a […]
Feb, 9
GPU-based high-performance computing for radiation therapy
Recent developments in radiotherapy demand high computational power to solve challenging problems in a timely fashion in a clinical environment. The graphics processing unit (GPU), as an emerging high-performance computing platform, has been introduced to radiotherapy. It is particularly attractive due to its high computational power, small size, and low cost for facility deployment […]
Feb, 8
Fast 2-D Ultrasound Strain Imaging: The Benefits of Using a GPU
Deformation of tissue can be accurately estimated from radio-frequency ultrasound data using a 2-dimensional normalized cross correlation (NCC)-based algorithm. This procedure, however, is computationally very time-consuming. A major time reduction can be achieved by parallelizing the numerous computations of NCC. In this paper, two approaches for parallelization have been investigated: the OpenMP interface on a […]
Feb, 8
State-Based Gauss-Seidel Framework for Real-time 2D Ultrasound Image Sequence Denoising on GPUs
Ultrasound image sequences are contaminated mainly by multiplicative noise, but usually also by additive noise. Over the past few decades, several works have focused on removing noise from ultrasound images, such as the JY model [1] and the variational model, which were […]
Feb, 8
Fast 3D Graphics Rendering Technique with CUDA Parallel Processing
3D graphics rendering is used to express realistic, three-dimensional, and emphasized effects in graphics. As 3D graphics rendering developed and became more prevalent, the need for faster data processing grew as well, leading to the development of the GPU (Graphics Processing Unit) and of GPU shading languages such as GLSL (OpenGL Shading […]
Feb, 8
Developmental Directions in Parallel Accelerators
Parallel accelerators such as massively-cored graphical processing units or many-cored co-processors such as the Xeon Phi are becoming widespread and affordable on many systems including blade servers and even desktops. The use of a single such accelerator is now quite common for many applications, but the use of multiple devices and hybrid combinations is still […]
Feb, 8
Simulating and Benchmarking the Shallow-Water Fluid Dynamical Equations on Multiple Graphical Processing Units
The shallow-water model equations provide a simple yet realistic benchmark problem in computational fluid dynamics (CFD) that can be implemented on a variety of computational platforms. Graphical Processing Units can be used to accelerate such problems either singly using a data parallel decompositional scheme or with multiple devices using a domain decompositional approach. We implement […]
Feb, 5
Auto-Generation and Auto-Tuning of 3D Stencil Codes on Homogeneous and Heterogeneous GPU Clusters
This paper develops and evaluates search and optimization techniques for auto-tuning 3D stencil (nearest-neighbor) computations on GPUs. Observations indicate that parameter tuning is necessary for heterogeneous GPUs to achieve optimal performance with respect to a search space. Our proposed framework takes a most concise specification of stencil behavior from the user as a single formula, […]
Feb, 5
OpenACC-based GPU Acceleration of a 3-D Unstructured Discontinuous Galerkin Method
A GPU-accelerated discontinuous Galerkin (DG) method is presented for the solution of compressible flows on 3-D unstructured grids. The present work has employed two of the most attractive features in a new programming standard of parallel computing – OpenACC: 1) multi-platform/compiler support and 2) descriptive directive interface to upgrade a legacy CFD solver with the […]
Feb, 5
Depth Map Based Superresolution Method in 3D Reconstruction
In this thesis, a superresolution method for geometry reconstruction is presented. Superresolution methods have already been used to improve texture resolution and RGB images. To a lesser degree, they have also been used to improve the quality of depth maps. The objective of this thesis is to validate the hypothesis that superresolution methods […]
Feb, 5
Improved Implementation of Simulation for Membrane Computing on the Graphic Processing Unit
Membrane computing is a theoretical model of computation inspired by the structure and functioning of cells. Membrane computing models naturally have a parallel structure. Most simulations of membrane computing have been done serially on a machine with a central processing unit (CPU). This has neglected the advantage of parallelism in […]