high performance computing on graphics processing units: hgpu.org

Posts

Aug, 19

Real-time rendering and dynamic updating of 3-d volumetric data

A dense 3-d terrain model, obtained from aerial images using reconstruction methods, is represented in a probabilistic volumetric framework. The probabilistic representation is chosen to capture the ambiguity inherent in reconstructing surfaces from images. Such a representation handles the ambiguity very well but leads to expensive dense volumetric storage. The area coverage required for […]

OpenCL

Aug, 19

Caracal: dynamic translation of runtime environments for GPUs

Graphics Processing Units (GPUs) have become the platform of choice for accelerating a large range of data-parallel and task-parallel applications. Both AMD and NVIDIA have developed GPU implementations targeted at the high performance computing market. The rapid adoption of GPU computing has been greatly aided by the introduction of high-level programming environments such […]

CUDA • OpenCL

Aug, 19

Auto-tuning SkePU: a multi-backend skeleton programming framework for multi-GPU systems

SkePU is a C++ template library that provides a simple and unified interface for specifying data-parallel computations with the help of skeletons on GPUs using CUDA and OpenCL. The interface is also general enough to support other architectures, and SkePU implements both a sequential CPU and a parallel OpenMP backend. It also supports multi-GPU systems. […]

CUDA • OpenCL

Aug, 19

Frameworks for multi-core architectures: a comprehensive evaluation using 2D/3D image registration

In recent years, the development of standard processors has shifted from bigger, more complex, and faster cores to placing several simpler cores on one chip. This has also changed the way programs are written in order to leverage the processing power of multiple cores of the same processor. In the beginning, programmers had to […]

OpenCL

Aug, 18

SkePU: a multi-backend skeleton programming library for multi-GPU systems

We present SkePU, a C++ template library which provides a simple and unified interface for specifying data-parallel computations with the help of skeletons on GPUs using CUDA and OpenCL. The interface is also general enough to support other architectures, and SkePU implements both a sequential CPU and a parallel OpenMP backend. It also supports multi-GPU […]

CUDA • OpenCL
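The skeleton idea in the abstract above (one interface, several interchangeable backends) can be illustrated with a minimal sketch. This is not SkePU's API: the `MapReduce` class and both backend functions are hypothetical names chosen for illustration, and a Python thread pool merely stands in for a parallel backend such as OpenMP, CUDA, or OpenCL.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
import operator


def sequential_backend(func, *columns):
    """Apply func element-wise on the calling thread."""
    return [func(*args) for args in zip(*columns)]


def threaded_backend(func, *columns, workers=4):
    """Apply func element-wise with a thread pool (stand-in for a parallel backend)."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so results line up with the inputs.
        return list(pool.map(func, *columns))


class MapReduce:
    """Hypothetical skeleton: an element-wise map followed by a reduction."""

    def __init__(self, map_func, reduce_func, backend=sequential_backend):
        self.map_func = map_func
        self.reduce_func = reduce_func
        self.backend = backend

    def __call__(self, *columns):
        mapped = self.backend(self.map_func, *columns)
        return reduce(self.reduce_func, mapped)


# The same user code (a dot product) runs on either backend.
dot_seq = MapReduce(operator.mul, operator.add)
dot_par = MapReduce(operator.mul, operator.add, backend=threaded_backend)

a = [1.0, 2.0, 3.0]
b = [4.0, 5.0, 6.0]
result_seq = dot_seq(a, b)
result_par = dot_par(a, b)
```

The point of the design is that the user-supplied functions (`operator.mul`, `operator.add`) never mention the backend, so switching execution targets is a one-argument change.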

Aug, 18

Energy-aware metrics for benchmarking heterogeneous systems

With the advent of heterogeneous computing systems consisting of multi-core CPUs and many-core GPUs, robust methods are needed to facilitate fair benchmark comparisons between different systems. In this paper we present a benchmarking methodology for measuring a number of performance metrics for heterogeneous systems. Methods for comparing performance and energy efficiency are included. Consideration is […]

OpenCL

Aug, 18

ATI Stream Profiler: a tool to optimize an OpenCL kernel on ATI Radeon GPUs

Modern GPUs have been shown to be highly efficient machines for data-parallel applications such as graphics, image, video processing, or physical simulation applications. For example, a single ATI Radeon HD 5870 GPU has a theoretical peak of 2.72 teraflops (10^12 floating-point operations per second) with a video memory bandwidth of 153.6 GB/s. While it is […]

OpenCL
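The two peak figures quoted in the abstract above can be reproduced with simple arithmetic. The sketch below assumes the commonly published Radeon HD 5870 specifications, which do not appear in the abstract itself: 1600 stream processors, an 850 MHz core clock, one multiply-add (2 single-precision flops) per processor per cycle, and a 256-bit GDDR5 memory bus at an effective 4.8 billion transfers per second.

```python
# Assumed hardware figures (from public spec sheets, not from the abstract).
STREAM_PROCESSORS = 1600
CORE_CLOCK_HZ = 850_000_000
FLOPS_PER_CYCLE = 2              # one multiply-add counts as two flops
MEMORY_BUS_BITS = 256
EFFECTIVE_TRANSFERS_PER_S = 4_800_000_000


def peak_teraflops(units, clock_hz, flops_per_cycle):
    """Theoretical peak: every unit issues flops_per_cycle on every cycle."""
    return units * clock_hz * flops_per_cycle / 1e12


def peak_bandwidth_gb_s(bus_bits, transfers_per_s):
    """Theoretical peak: bus width in bytes times transfers per second."""
    return (bus_bits // 8) * transfers_per_s / 1e9


tflops = peak_teraflops(STREAM_PROCESSORS, CORE_CLOCK_HZ, FLOPS_PER_CYCLE)
bandwidth = peak_bandwidth_gb_s(MEMORY_BUS_BITS, EFFECTIVE_TRANSFERS_PER_S)
print(f"{tflops:.2f} TFLOPS, {bandwidth:.1f} GB/s")
```

Both numbers are upper bounds; a real kernel reaches them only if every unit is busy on every cycle and every memory transfer is fully used.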

Aug, 18

Physical and graphical effects in OpenCL by example

There are strong indications that the future of interactive graphics involves a more flexible programming model than today’s OpenGL/Direct3D pipelines. That means that graphics developers will need a basic understanding of how to combine emerging parallel-programming techniques with the traditional interactive rendering pipeline. This course provides an introduction to parallel-programming architectures and environments for interactive […]

OpenCL

Aug, 18

Parallelization of the x264 encoder using OpenCL

With the introduction of H.264, the complexity of video encoders has increased dramatically. While hardware-based encoding solutions profit from the strictly sequential design and already feature real-time capabilities for high-definition material, software solutions lack most of the encoding performance. More precisely, the performance of software encoders is limited due to the computation […]

OpenCL

Aug, 18

Simulating Biological-Inspired Spiking Neural Networks with OpenCL

The algorithms used for simulating biologically-inspired spiking neural networks (BIANN) often utilize functions which are computationally complex and have to model a large number of neurons – or even a much larger number of synapses in parallel. To use all available computing resources provided by a standard desktop PC is an opportunity to shorten the […]

OpenCL

Aug, 18

Parallel Batch Training of the Self-Organizing Map Using OpenCL

Self-Organizing Maps (SOMs) are popular artificial neural networks that are often used for data analysis through clustering and visualisation. The SOM’s mathematical model is inherently parallel. However, many implementations have not successfully exploited its parallelism because previous attempts often required cluster-like infrastructures. This article presents the parallel implementation of SOMs, particularly the batch map variant […]

OpenCL
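To make the "inherently parallel" remark in the abstract above concrete, here is a minimal pure-Python sketch of one epoch of the textbook batch map update on a 1-D map. It is not the article's OpenCL implementation; the function names, the Gaussian neighborhood, and the toy data are assumptions made for illustration. Each sample's contribution to the sums is independent of the others, which is what a parallel version would exploit.

```python
import math


def best_matching_unit(weights, x):
    """Index of the unit whose weight vector is closest to x (squared distance)."""
    return min(
        range(len(weights)),
        key=lambda j: sum((w - v) ** 2 for w, v in zip(weights[j], x)),
    )


def neighborhood(j, c, sigma):
    """Gaussian neighborhood between map positions j and c on a 1-D map."""
    return math.exp(-((j - c) ** 2) / (2.0 * sigma ** 2))


def batch_epoch(weights, data, sigma):
    """One batch update: each new weight is a neighborhood-weighted mean of the data."""
    dim = len(weights[0])
    numer = [[0.0] * dim for _ in weights]
    denom = [0.0] * len(weights)
    for x in data:  # per-sample work is independent, hence parallelizable
        c = best_matching_unit(weights, x)
        for j in range(len(weights)):
            h = neighborhood(j, c, sigma)
            denom[j] += h
            for d in range(dim):
                numer[j][d] += h * x[d]
    return [[n / denom[j] for n in numer[j]] for j in range(len(weights))]


# Toy data: two 1-D clusters, and a two-unit map.
data = [[0.0], [1.0], [9.0], [10.0]]
weights = [[2.0], [8.0]]
new_weights = batch_epoch(weights, data, sigma=0.5)
```

Unlike the online SOM rule, the weights are only replaced after all samples have been accumulated, so the accumulation loop can be split across workers and the partial sums added together afterwards.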

Aug, 18

Maestro: Data Orchestration and Tuning for OpenCL Devices

As heterogeneous computing platforms become more prevalent, the programmer must account for complex memory hierarchies in addition to the difficulties of parallel programming. OpenCL is an open standard for parallel computing that helps alleviate this difficulty by providing a portable set of abstractions for device memory hierarchies. However, OpenCL requires that the programmer explicitly controls […]

OpenCL