Posts
Jun, 6
ElastiFace: Matching and Blending Textured Faces
In this paper we present ELASTIFACE, a simple and versatile method for establishing correspondence between textured face models, either for the construction of a blend-shape facial rig or for the exploration of new characters by morphing between a set of input models. While there exists a wide variety of approaches for inter-surface mapping and mesh […]
Jun, 6
Accelerating Fast Fourier Transform for Wideband Channelization
Wideband channelization is a compute-intensive task with performance requirements that are arguably greater than what current multi-core CPUs can provide. To date, researchers have used dedicated hardware such as field programmable gate arrays (FPGAs) to address the performance-critical aspects of the channelizer. In this work, we assess the viability of the graphics processing unit (GPU) […]
Jun, 4
Efficient Execution of AMR Computations on GPU Systems
Adaptive Mesh Refinement (AMR) is a method that dynamically varies the spatio-temporal resolution of localized mesh regions in numerical simulations, based on the strength of the solution features. Due to high resolution discretization of localized regions of interest into rectangular mesh units called patches, AMR provides low cost of computations and high degree […]
Jun, 4
Towards shared memory consistency models for GPUs
With the widespread use of GPUs, it is important to ensure that programmers have a clear understanding of their shared memory consistency model, i.e., what values a read can return when it is issued concurrently with writes. While memory consistency has been studied for CPUs, GPUs present very different memory and concurrency systems and have not been well […]
Jun, 4
Using RenderScript and RCUDA for Compute Intensive tasks on Mobile Devices: a Case Study
The processing power of mobile devices is continuously increasing. In this paper we perform a case study in which we assess three different programming models that can be used to leverage this processing power for compute-intensive tasks. We use an imaging algorithm and compare a reference implementation of this algorithm based on OpenCV with […]
Jun, 4
Coating Process Monitoring Using Computer Vision
The aim of this Bachelor’s Thesis was to make a prototype system for Metso Paper Inc. for monitoring a paper roll coating process. If the coating is done badly and there are faults, one has to redo the process, which lowers the profits of the company since the process is costly. The work was proposed […]
Jun, 4
Parallel Acceleration on Manycore Systems and Its Performance Analysis: OpenCL Case Study
OpenCL (Open Computing Language) is a heterogeneous programming framework for developing applications that execute across a range of device types made by different vendors [11]. It maps efficiently to both heterogeneous and homogeneous, single- or multiple-device systems consisting of CPUs, GPUs, and other types of devices. OpenCL provides many benefits in the field of high-performance […]
Jun, 4
Plasma Visualization in Parallel using Particle Systems on Graphical Processing Units
Visualising and simulating charged plasma systems present additional challenges to conventional particle methods. Plasmas exhibit multi-scale phenomena that often prevent the use of standard localisation approximations. Plasmas as particle systems that emit light are important in many interesting components of games and computer-animated movies, such as weapons fire, explosions, and astronomical effects. They also have […]
Jun, 4
Using sparse optical flow for multiple Kinect applications
The use of multiple Microsoft Kinects has become prominent in the last two years and enjoyed widespread acceptance. While several works have been published to mitigate quality degradations in the precomputed depth image, this work focuses on employing an optical flow suitable for dot patterns, as employed in the Kinect, to retrieve subtle scene data […]
Jun, 4
Data-driven versus Topology-driven Irregular Computations on GPUs
Irregular algorithms are algorithms with complex main data structures such as directed and undirected graphs, trees, etc. A useful abstraction for many irregular algorithms is their operator formulation, in which the algorithm is viewed as the iterated application of an operator to certain nodes, called active nodes, in the graph. Each operator application, called an […]
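To make the operator formulation concrete, here is a minimal CPU-side sketch in Python. It is not taken from the paper. It assumes single-source shortest paths as the example problem, a graph stored as a dict mapping each node to a list of (neighbour, weight) pairs, and non-negative weights. The names `relax`, `sssp_topology_driven`, and `sssp_data_driven` are hypothetical. The topology-driven variant sweeps over every node in each round, while the data-driven variant keeps a worklist and visits only active nodes.

```python
from collections import deque

INF = float("inf")

def relax(graph, dist, u):
    """Operator: relax the outgoing edges of u and return the nodes that improved."""
    changed = []
    for v, w in graph[u]:
        if dist[u] + w < dist[v]:
            dist[v] = dist[u] + w
            changed.append(v)
    return changed

def sssp_topology_driven(graph, source):
    """Apply the operator to every reachable node per round until nothing changes."""
    dist = {u: INF for u in graph}
    dist[source] = 0
    progress = True
    while progress:
        progress = False
        for u in graph:  # every node is visited, whether or not it has work
            if dist[u] < INF and relax(graph, dist, u):
                progress = True
    return dist

def sssp_data_driven(graph, source):
    """Apply the operator only to active nodes taken from a worklist."""
    dist = {u: INF for u in graph}
    dist[source] = 0
    worklist = deque([source])
    while worklist:
        u = worklist.popleft()
        # Nodes whose distance improved become active and are queued.
        worklist.extend(relax(graph, dist, u))
    return dist
```

Both functions compute the same distances; they differ only in how much work is scheduled per round. Calling either one on `{"a": [("b", 1), ("c", 4)], "b": [("c", 2)], "c": []}` with source `"a"` gives a distance of 3 for `"c"`.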
Jun, 4
GPU Acceleration of a Genetic Algorithm for the Synthesis of FSM-based Bimodal Predictors
This paper presents a fast GPU implementation of a genetic algorithm for synthesizing bimodal predictor FSMs of a given size. Bimodal predictors, i.e., predictors that make binary yes/no predictions, are ubiquitous in microprocessors. Many of these predictors are based on finite-state machines (FSMs). However, there are countless possible FSMs and even heuristic searches for finding […]
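As an illustration of what an FSM-based bimodal predictor looks like, the sketch below encodes the well-known two-bit saturating counter as a transition table and measures how often it predicts a binary trace correctly. This is not the paper's implementation. The table layout, the name `accuracy`, and the use of prediction accuracy as a quality measure are assumptions made for the example.

```python
# Each state is (prediction, next_state_if_outcome_is_0, next_state_if_outcome_is_1).
# Hypothetical encoding of a two-bit saturating counter with four states.
TWO_BIT_COUNTER = [
    (0, 0, 1),  # state 0: strongly "no"
    (0, 0, 2),  # state 1: weakly "no"
    (1, 1, 3),  # state 2: weakly "yes"
    (1, 2, 3),  # state 3: strongly "yes"
]

def accuracy(fsm, trace, start=0):
    """Return the fraction of outcomes in trace (0s and 1s) that fsm predicts correctly."""
    if not trace:
        return 0.0
    state = start
    correct = 0
    for outcome in trace:
        prediction, on_zero, on_one = fsm[state]
        if prediction == outcome:
            correct += 1
        # Move to the next state based on the actual outcome.
        state = on_one if outcome else on_zero
    return correct / len(trace)
```

A search over FSMs of a given size could score each candidate table with a function of this kind, although the scoring actually used in the paper is not shown in the excerpt above.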
Jun, 4
CUDA Leaks: Information Leakage in GPU Architectures
Graphics Processing Units (GPUs) are deployed on most present-day server, desktop, and even mobile platforms. Nowadays, a growing number of applications leverage the high parallelism offered by this architecture to speed up general-purpose computation. This phenomenon is called GPGPU computing (General Purpose GPU computing). The aim of this work is to discover and highlight security […]