high performance computing on graphics processing units: hgpu.org

Posts

Jan, 23

Exploratory research on embedding CUDA code into heterogeneous MP-SoC architectures programmed with the Daedalus framework

The objective of this Bachelor Thesis is to explore the possibilities of using NVIDIA CUDA enabled GPU Processors within the HDPC framework. The HDPC framework is one of the heterogeneous MP-SoC architectures programmed with the Daedalus framework. This paper will focus on the transfer overhead introduced by using the GPU and how to best cope […]

CUDA

Jan, 23

CUDA Based Polyphase Filter

This paper evaluates the use of a graphics processor for real-time radio astronomy DSP (Digital Signal Processing) within VLBI (Very Long Baseline Interferometry). A polyphase filter bank (PFB) was implemented in a prototype application to convert external ADC input into channelized frequency streams. This system was tested with a 32-channel PFB, […]

CUDA

Jan, 23

Convolution of large 3D images on GPU and its decomposition

In this paper we propose a method for computing the convolution of large 3D images. The convolution is performed in the frequency domain using the convolution theorem. The algorithm is accelerated on a graphics card by means of the CUDA parallel computing model. Convolution is decomposed in the frequency domain using the DIF (decimation in frequency) […]
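As a rough illustration of the convolution theorem mentioned in this abstract, the sketch below checks on a tiny 1D signal that circular convolution equals the inverse transform of the pointwise product of the transforms. It is not the paper's method: it uses a naive O(n^2) DFT on the CPU instead of a 3D FFT on the GPU, and the function names `dft`, `circular_convolve_direct`, and `circular_convolve_fft` are hypothetical helpers chosen for this example.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (illustration only)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        # The inverse transform carries the 1/n normalisation.
        out.append(acc / n if inverse else acc)
    return out


def circular_convolve_direct(a, b):
    """Circular convolution computed straight from the definition."""
    n = len(a)
    return [sum(a[j] * b[(i - j) % n] for j in range(n)) for i in range(n)]


def circular_convolve_fft(a, b):
    """Same result via the convolution theorem: IDFT(DFT(a) * DFT(b))."""
    fa, fb = dft(a), dft(b)
    product = [x * y for x, y in zip(fa, fb)]
    return [v.real for v in dft(product, inverse=True)]


if __name__ == "__main__":
    signal = [1.0, 2.0, 3.0, 4.0]
    kernel = [1.0, 0.0, 0.0, 1.0]
    print(circular_convolve_direct(signal, kernel))
    print(circular_convolve_fft(signal, kernel))
```

The two printed lists should agree up to floating-point rounding. For a 3D image the same identity holds with a 3D transform, which is where an FFT library on the GPU would replace the naive `dft` used here.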

CUDA

Jan, 23

Object Oriented Framework for CUDA based Pyramidal Image Blending

In this paper, we propose and implement an object-oriented framework for CUDA-based pyramidal image blending. This algorithm is an essential part of an image stitching process for a seamless panoramic mosaic. The CUDA framework is a novel GPU programming framework from NVIDIA. It offers a complex integration framework and requires more than […]

CUDA

Jan, 23

Strassen’s Matrix Multiplication on GPUs

We provide efficient single-precision and integer GPU implementations of Strassen’s algorithm as well as of Winograd’s variant. On an NVIDIA C1060 GPU, a speedup of 32% (35%) is obtained for Strassen’s 4-level implementation and 33% (36%) for Winograd’s variant relative to the sgemm (integer version of sgemm) code in CUBLAS 3.0 when multiplying 16384×16384 matrices. […]
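For readers unfamiliar with Strassen's algorithm, the following is a minimal CPU sketch of the classic recursion (seven sub-products instead of eight) on plain Python lists. It is not the paper's GPU implementation and does not cover Winograd's variant; the helper names (`add`, `sub`, `naive_mul`, `split`, `strassen`) and the `cutoff` parameter are assumptions made for this example, and the matrices are assumed to be square.

```python
def add(A, B):
    """Element-wise sum of two equally sized matrices."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def sub(A, B):
    """Element-wise difference of two equally sized matrices."""
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def naive_mul(A, B):
    """Textbook O(n^3) product of two square matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def split(M):
    """Split an even-sized square matrix into four quadrants."""
    h = len(M) // 2
    return ([r[:h] for r in M[:h]], [r[h:] for r in M[:h]],
            [r[:h] for r in M[h:]], [r[h:] for r in M[h:]])


def strassen(A, B, cutoff=2):
    """Strassen recursion; falls back to naive_mul for small or odd sizes."""
    n = len(A)
    if n <= cutoff or n % 2:
        return naive_mul(A, B)
    A11, A12, A21, A22 = split(A)
    B11, B12, B21, B22 = split(B)
    # Seven recursive products replace the eight of the block-wise product.
    M1 = strassen(add(A11, A22), add(B11, B22), cutoff)
    M2 = strassen(add(A21, A22), B11, cutoff)
    M3 = strassen(A11, sub(B12, B22), cutoff)
    M4 = strassen(A22, sub(B21, B11), cutoff)
    M5 = strassen(add(A11, A12), B22, cutoff)
    M6 = strassen(sub(A21, A11), add(B11, B12), cutoff)
    M7 = strassen(sub(A12, A22), add(B21, B22), cutoff)
    C11 = add(sub(add(M1, M4), M5), M7)
    C12 = add(M3, M5)
    C21 = add(M2, M4)
    C22 = add(add(sub(M1, M2), M3), M6)
    # Stitch the quadrants back together row by row.
    top = [left + right for left, right in zip(C11, C12)]
    bottom = [left + right for left, right in zip(C21, C22)]
    return top + bottom
```

The `cutoff` loosely plays the role of the recursion depth limit: below it the code switches to the ordinary product, much as a GPU implementation would hand small blocks to a tuned routine such as sgemm.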

CUDA

Jan, 23

Faster Upper Body Pose Estimation Using CUDA

Determining upper body poses using computer vision can have long execution times when using traditional linear methods on the CPU. This paper shows how parallel processing methods, and in particular the usage of a GPU API called CUDA, can increase system performance.

CUDA

Jan, 23

Scientific Computation on Graphics Processing Unit using CUDA

Partial differential equations (PDEs) play a major role in the mathematical modeling of problems in engineering and science. Engineering disciplines such as electromagnetics and fluid dynamics use PDEs heavily, and the development of products in these fields employs computationally intensive numerical methods. These computationally intensive methods take a reasonable amount of time on state of the […]

CUDA

Jan, 23

Exact Symbolic-Numeric Computation of Planar Algebraic Curves

We present a novel certified and complete algorithm to compute arrangements of real planar algebraic curves. It provides a geometric-topological analysis of the decomposition of the plane induced by a finite number of algebraic curves in terms of a cylindrical algebraic decomposition. From a high-level perspective, the overall method splits into two main subroutines, namely […]

CUDA

Jan, 23

Analysis Acceleration in TMVA for the ATLAS Experiment at CERN using GPU Computing

ATLAS is one of two general-purpose collision detectors within the Large Hadron Collider, detecting millions of events per second. One tool for the eventual analysis of this data is TMVA, the Toolkit for Multi-Variate Analysis. Comprising a number of machine learning techniques, it supports physicists in classifying events. This project forms a feasibility […]

CUDA

Jan, 23

Real-time Compressive Sensing MRI Reconstruction using GPU Computing and Split Bregman Methods

Compressive sensing (CS) has been shown to enable dramatic acceleration of MRI acquisition in some applications. Being an iterative reconstruction technique, CS MRI reconstructions can be more time consuming than traditional inverse Fourier reconstruction. We have accelerated our CS MRI reconstruction by factors of up to 27 by using a split Bregman solver combined with […]

CUDA

Jan, 23

A New Approach to rCUDA

In this paper we propose a first step towards a general and open source approach for using GPGPU (General-Purpose Computation on GPUs) features within virtual machines (VMs). In particular, we describe the use of rCUDA, a GPGPU virtualization framework, to permit the execution of GPU-accelerated applications within VMs, thus enabling GPGPU capabilities on any virtualized […]

CUDA

Jan, 23

Computing optical flow using fast total variation

During my internship, I was in charge of implementing a GPU version of the optical flow algorithm. The optical flow algorithm is based on the total variation features described in my bibliography. The internship took place at VITRONIC (Wiesbaden, Germany), a pioneer and one of the leading organizations worldwide in the field of machine vision. […]

OpenCL