Posts
Mar, 10
GPU Accelerated Discontinuous Galerkin Methods for Shallow Water Equations
We discuss the development, verification, and performance of a GPU accelerated discontinuous Galerkin method for the solution of the two-dimensional nonlinear shallow water equations. The shallow water equations are hyperbolic partial differential equations and are widely used in the simulation of tsunami wave propagation. Our algorithms are tailored to take advantage of the single instruction […]
Mar, 10
A GPU Accelerated Aggregation Algebraic Multigrid Method
We present an efficient, robust and fully GPU-accelerated aggregation-based algebraic multigrid preconditioning technique for the solution of large sparse linear systems. These linear systems arise from the discretization of elliptic PDEs. The method involves two stages, setup and solve. In the setup stage, hierarchical coarse grids are constructed through aggregation of the fine grid nodes. […]
Mar, 9
XeonPhi Meets Astrophysical Fluid Dynamics
This white paper reports on our efforts to optimize a 2D/3D astrophysical (magneto-)hydrodynamics Fortran code for XeonPhi. The code is parallelized with OpenMP and is suitable for execution on a shared memory system. Due to the complexity of the code combined with the immaturity of the compiler, we were unable to stay within the boundaries of Intel Compiler […]
Mar, 9
Real-time video denoising for 2D ultrasound streaming video on GPUs
Ultrasound videos are contaminated mainly by multiplicative noise, but also by additive noise. Over the past few decades, several studies have addressed noise removal from ultrasound images, such as the JY model [1] and the variational model, which removes both types of noise. However, removing these noises from the ultrasound video […]
Mar, 9
RASR/NN: The RWTH Neural Network Toolkit for Speech Recognition
This paper describes the new release of RASR – the open source version of the well-proven speech recognition toolkit developed and used at RWTH Aachen University. The focus is put on the implementation of the NN module for training neural network acoustic models. We describe code design, configuration, and features of the NN module. The […]
Mar, 9
A Framework for Developing Real-Time OLAP algorithm using Multi-core processing and GPU: Heterogeneous Computing
The overwhelming growth in the amount of stored data has spurred researchers to seek different methods for taking full advantage of it, most of which have faced a response-time problem because of the enormous size of the data. Most solutions have suggested materialization as the preferred approach. However, such a solution cannot attain Real-Time […]
Mar, 9
Designing Efficient MPI and UPC Runtime for Multicore Clusters with InfiniBand, Accelerators and Co-Processors
High End Computing (HEC) has been growing dramatically over the past decades. The emerging multi-core systems, heterogeneous architectures and interconnects introduce various challenges and opportunities to improve the performance of communication middleware and applications. The increasing number of processor cores and Co-Processors results in not only heavy contention on communication resources, but also much more […]
Mar, 7
Converting Data to Task-Parallelism by Rewrites
High-level domain-specific languages for array processing on the GPU are increasingly common, but to date they run only on a single GPU. We argue that languages will need to target multiple devices, even simultaneous combinations of GPU/GPU and CPU/GPU. Increased flexibility may be key to making these languages more easily deployable and thus widespread. To this […]
Mar, 7
Exploring High Performance SQL Databases with Graphics Processing Units
This thesis introduces the development of a new GPU-based database to accelerate queries of Digital Humanities data to extract document texts that are then data-mined to produce visualizations of aspects of the humanities data. The goal is to advance the state-of-the-art in massively parallel database work by investigating methods for utilizing graphical processing units in […]
Mar, 7
Interactive Program Debugging and Optimization for Directive-Based, Efficient GPU Computing
Directive-based GPU programming models are gaining momentum, since they transparently relieve programmers from dealing with the complexity of low-level GPU programming, which often reflects the underlying architecture. However, too much abstraction in directive models puts a significant burden on programmers for debugging applications and tuning performance. In this paper, we propose a directive-based, interactive program debugging […]
Mar, 7
Parallelization of DNA alignment algorithms using GPUs
Since the discovery of deoxyribonucleic acid (DNA), significant technological advances have been made, leading to very large amounts of data being gathered for analysis. The tools for this analysis, however, have advanced at a slower pace and have become one of the limiting factors for new discoveries in this field of research. Recently, from the 3D game […]
Mar, 7
Dynamic Workload Division in GPU-CPU Heterogeneous Systems
GPUs provide powerful computational capabilities and great potential for efficiency optimization. As a result, the CPU-GPU heterogeneous architecture remains a hot topic in high-performance computing. However, energy consumption is still the bottleneck of the entire system when the system and its corresponding framework require massive-scale calculation. Most […]