Posts
Nov, 17
Characterization and Transformation of Unstructured Control Flow in GPU Applications
Hardware and compiler techniques for mapping data-parallel programs with divergent control flow to SIMD architectures have recently enabled the emergence of new GPGPU programming models such as CUDA and OpenCL. Although this technology is widely used, commodity GPUs use different schemes to implement it, and the performance limitations of these different schemes under real workloads […]
Nov, 17
Massive Image Editing on the Cloud
Processing massive imagery in a distributed environment currently requires the effort of a skilled team to efficiently handle communication, synchronization, faults, and data/process distribution. Moreover, these implementations are highly optimized for a specific system or cluster, therefore portability or improved performance due to system improvements is rarely considered. Much like early GPU computing, cluster computing […]
Nov, 17
Adaboost GPU-based Classifier for Direct Volume Rendering
In volume visualization, the voxel visibility and materials are carried out through an interactive editing of Transfer Function. In this paper, we present a two-level GPU-based labeling method that computes in times of rendering a set of labeled structures using the Adaboost machine learning classifier. In a pre-processing step, Adaboost trains a binary classifier from […]
Nov, 16
Simulations of Large Particle Systems in Real Time
Simulation of interacting particle systems has been a well-established method for many years now. Such systems can span different scales, including microscopic (where particles represent atoms, as in Molecular Dynamics simulations) as well as macroscopic. In the latter case, there is growing interest in the Smoothed Particle Hydrodynamics approach. Traditionally, over many years, simulation of […]
Nov, 16
Object Space Based Collision Detection for Cloth Simulation on the GPU
This paper presents an approach for cloth-body collision detection in computer graphics simulations of clothing. It is an object-space based algorithm implemented in OpenCL on the GPU. The underlying idea behind this work is to speed up the solution of the collision detection problem by utilizing the excessive computational capacity of contemporary GPUs. Results of […]
Nov, 16
Parallel Approach for Longest Common Subsequence problem on GPU
Recent developments in genomic and molecular technologies have produced a tremendous amount of information related to molecular biology. The management and analysis of these biological data require intensive computing power. Sequence alignment is one of the algorithmic tools in bioinformatics to look for resemblance among sequences of amino acids. The longest common subsequence (LCS) of biological […]
Nov, 12
Creating HW/SW co-designed MPSoPC’s from high level programming models
FPGA densities have continued to follow Moore’s law and can now support a complete multiprocessor system on programmable chip. The benefits of the FPGA include the ability to build a customized MPSoC system consisting of heterogeneous processing resources, interconnects and memory hierarchies that best match the requirements of each application. In this paper we outline […]
Nov, 12
Safe Asynchronous Multicore Memory Operations
Asynchronous memory operations provide a means for coping with the memory wall problem in multicore processors, and are available in many platforms and languages, e.g., the Cell Broadband Engine, CUDA and OpenCL. Reasoning about the correct usage of such operations involves complex analysis of memory accesses to check for races. We present a method and […]
Nov, 11
Synthetic Aperture Beamformation using the GPU
A synthetic aperture ultrasound beamformer is implemented for a GPU using the OpenCL framework. The implementation supports beamformation of either RF signals or complex baseband signals. Transmit and receive apodization can be either parametric or dynamic using a fixed F-number, a reference, and a direction. Images can be formed using an arbitrary number of emissions […]
Nov, 10
GPU Acceleration of Matrix-based Methods in Computational Electromagnetics
This work considers the acceleration of matrix-based computational electromagnetic (CEM) techniques using graphics processing units (GPUs). These massively parallel processors have gained much support since late 2006, with software tools such as CUDA and OpenCL greatly simplifying the process of harnessing the computational power of these devices. As with any advances in computation, the use […]
Nov, 10
A CPU-GPU Hybrid Runtime for the Aeminium Language
Given that CPU clock speeds are stagnating, programmers are resorting to parallelism to improve the performance of their applications. Although such parallelism has usually been attained using either multicore architectures, multiple CPUs and/or clusters of machines, the GPU has since been used as an alternative. GPUs are an interesting resource because they can provide much […]
Nov, 10
Bit-Parallel Multiple Pattern Matching
Text matching with errors is a regular task in computational biology. We present an extension of the bit-parallel Wu-Manber algorithm to combine several searches for a pattern into a collection of fixed-length words. We further present an OpenCL parallelization of a redundant index on massively parallel multicore processors, within a framework of searching for similarities […]