Posts
Dec, 7
GPU-based solution of Continuous Time Markov Chains using CUSP
This technical report describes the parallelisation of the response-time analyser HYDRA using CUSP and the results of executing it on HECToR’s GPGPU testbed. We achieved good speed-ups in execution time, but these were outweighed by increased setup time.
Dec, 7
Effective Mapping of Grammatical Evolution to CUDA Hardware Model
Several papers have shown that symbolic regression is suitable for data analysis and prediction in finance markets. Grammatical Evolution (GE) has been successfully applied to various tasks, including symbolic regression. However, the performance of this method can limit the range of possible applications. This paper deals with utilizing a mainstream graphics processing unit (GPU) for […]
Dec, 7
Efficient Two-Level Preconditioned Conjugate Gradient Method on the GPU
We present an implementation of the Two-Level Preconditioned Conjugate Gradient Method for the GPU. We investigate a Truncated Neumann Series based preconditioner in combination with deflation and compare it with Block Incomplete Cholesky schemes. This combination exhibits fine-grain parallelism and hence we gain considerably in execution time. Its numerical performance is also comparable to the Block […]
Dec, 7
Towards a complete FEM-based simulation toolkit on GPUs: Geometric Multigrid solvers
We describe a GPU- and multicore-oriented implementation technique for a key component of finite element based simulation toolkits for partial differential equations on unstructured grids: Geometric Multigrid solvers. We use efficient sparse matrix-vector multiplications throughout the solver pipeline: within the coarse-grid solver, smoothers and even grid transfers. Our implementation can handle several low- and high-order […]
Dec, 6
Automatic Fusions of CUDA-GPU Kernels for Parallel Map
When implementing a function mapping on a contemporary GPU, several contradictory performance factors affecting the distribution of computation into GPU kernels have to be balanced. A decomposition-fusion scheme suggests decomposing the computational problem to be solved into several simple functions implemented as standalone kernels, and later fusing some of these functions into more complex kernels to […]
Dec, 6
Multiprocessing Acceleration of H.264/AVC Motion Estimation Full Search Algorithm under CUDA Architecture
This work presents a parallel GPU-based solution for the Motion Estimation (ME) process in a video encoding system. We propose a way to partition the steps of the Full Search block matching algorithm in the CUDA architecture, and compare its performance with a theoretical model and two implementations (sequential and parallel using the OpenMP library). We obtained […]
Dec, 6
DTAM: Dense tracking and mapping in real-time
DTAM is a system for real-time camera tracking and reconstruction which relies not on feature extraction but on dense, every-pixel methods. As a single hand-held RGB camera flies over a static scene, we estimate detailed textured depth maps at selected keyframes to produce a surface patchwork with millions of vertices. We use the hundreds of […]
Dec, 6
Massively Parallel Identification of Intersection Points for GPGPU Ray Tracing
The latest advancements in computer graphics architectures, such as the replacement of some fixed stages of the pipeline with programmable stages (shaders), have enabled the development of parallel general-purpose applications on massively parallel graphics architectures (Streaming Processors). For years the graphics processing unit (GPU) has been optimized for increasingly high throughput of massively parallel […]
Dec, 6
Simulation of pollutant transport in shallow water on a CUDA architecture
Shallow water simulation enables the study of problems such as dam break, river, canal and coastal hydrodynamics, as well as the transport of inert substances, such as pollutants, in a fluid. This article describes an efficient and cost-effective CUDA implementation on the GPU of a finite volume numerical scheme for solving pollutant transport problems in two-dimensional domains. […]
Dec, 6
GPU-Based Liquid Crystal Display Processing Platform
In the past decade, liquid crystal displays (LCDs) have taken over the television (TV) and monitor market from cathode ray tube (CRT) displays. Compared to CRT displays, LCDs offer larger screen sizes and higher resolution, and are thinner, lighter, and more energy efficient. However, with respect to image quality, LCDs have not caught up to CRT displays in […]
Dec, 6
Performance Analysis of GPU compared to Single-core and Multi-core CPU for Natural Language Applications
In Natural Language Processing (NLP) applications, the main time-consuming process is string matching, due to the large size of the lexicon. In string matching processes, data dependence is minimal, which makes them ideal for parallelization. A dedicated system with memory interleaving and parallel processing techniques for string matching can reduce this burden on the host CPU, […]
Dec, 6
Real-time Terrain Modeling using CPU-GPU Coupled Computation
Motivated by the importance of having real-time feedback in sketch-based modeling tools, we present a framework for terrain editing capable of generating and displaying complex, high-resolution terrains. Our system is efficient and fast enough to allow the user to see the terrain morphing while the drawing is being edited. We have two […]