Posts
Nov, 5
On Dynamic Load Balancing on Graphics Processors
To get maximum performance on many-core graphics processors, it is important to balance the workload evenly so that all processing units contribute equally to the task at hand. This can be hard to achieve when the cost of a task is not known beforehand and when new sub-tasks are created dynamically […]
Nov, 5
Coarse grain parallelization of evolutionary algorithms on GPGPU cards with EASEA
This paper presents a straightforward implementation of a standard evolutionary algorithm that evaluates its population in parallel on a GPGPU card. Tests done on a benchmark and a real-world problem using an old NVIDIA 8800GTX card and a newer but not top-of-the-range GTX260 card show a roughly 30x (resp. 100x) speedup […]
Nov, 5
MPI within a GPU
GPUs offer high-performance floating-point computation at commodity prices, but their usage is hindered by programming models which expose the user to irregularities in the current shared-memory environments and require learning new interfaces and semantics. This thesis will demonstrate that the message-passing paradigm can be conceptually cleaner than the current data-parallel models for programming GPUs because […]
Nov, 5
Interactive machinability analysis of free-form surfaces using multiple-view image space techniques on the GPU
In this paper we present a set of graphics hardware accelerated algorithms to interactively evaluate the machinability of complex free-form surfaces. These algorithms work in image space and easily interface with all common formats available on CAD systems. The running time of these algorithms is independent of the complexity of the surface to be analyzed […]
Nov, 5
An intelligent semi-automatic application porting system for application accelerators
Work involving the use of application acceleration devices is showing great promise; however, there are still major obstacles preventing their widespread adoption. Currently, the process of porting applications to an accelerator requires expertise in both the computer science and application domains, due to the lack of abstraction available. We present our work associated with the […]
Nov, 5
Streamlining GPU applications on the fly: thread divergence elimination through runtime thread-data remapping
Because of their tremendous computing power and remarkable cost efficiency, GPUs (graphics processing units) have quickly emerged as an influential platform for high-performance computing. However, because GPUs are designed for massive data-parallel computing, their performance is sensitive to the presence of conditional statements in a GPU application. On a conditional branch where […]
Nov, 5
An experimental approach to performance measurement of heterogeneous parallel applications using CUDA
Heterogeneous parallel systems using GPU devices for application acceleration have garnered significant attention in the supercomputing community. However, to realize the full potential of GPU computing, application developers will require tools to measure and analyze accelerator performance with respect to the parallel execution as a whole. A performance measurement technology for the NVIDIA CUDA platform […]
Nov, 5
GPU-Accelerated Nearest Neighbor Search for 3D Registration
Nearest Neighbor Search (NNS) is employed by many computer vision algorithms. Its computational complexity is high and poses a challenge for real-time operation. The basic problem is rapidly processing a huge amount of data, which is often addressed by means of highly sophisticated search methods and parallelism. We show that NNS-based vision algorithms […]
Nov, 5
Debugging GPU stream programs through automatic dataflow recording and visualization
We present a novel framework for debugging GPU stream programs through automatic dataflow recording and visualization. Our debugging system can help programmers locate errors that are common in general purpose stream programs but very difficult to debug with existing tools. A stream program is first compiled into an instrumented program using a compiler. This instrumenting […]
Nov, 5
GPU for Parallel On-Board Hyperspectral Image Processing
Hyperspectral analysis algorithms exhibit inherent parallelism at multiple levels, and map well onto high-performance systems such as massively parallel clusters and networks of computers. Unfortunately, these systems are generally expensive and difficult to adapt to onboard data processing scenarios, in which low-weight and low-power integrated components are desirable to reduce mission payload. An exciting […]
Nov, 5
RenderAnts: Interactive REYES Rendering on GPUs
We present RenderAnts, the first system that enables interactive REYES rendering on GPUs. Taking RenderMan scenes and shaders as input, our system first compiles RenderMan shaders to GPU shaders. Then all stages of the basic REYES pipeline, including bounding/splitting, dicing, shading, sampling, compositing and filtering, are executed on GPUs using carefully designed data-parallel algorithms. Advanced […]
Nov, 5
Accelerating MATLAB Image Processing Toolbox functions on GPUs
In this paper, we present our effort in developing an open-source GPU (graphics processing unit) code library for the MATLAB Image Processing Toolbox (IPT). We ported a dozen representative functions from IPT and, based on their inherent characteristics, grouped them into four categories: data independent, data sharing, algorithm dependent and data dependent. […]