Posts
Feb, 8
Simulating and Benchmarking the Shallow-Water Fluid Dynamical Equations on Multiple Graphical Processing Units
The shallow-water model equations provide a simple yet realistic benchmark problem in computational fluid dynamics (CFD) that can be implemented on a variety of computational platforms. Graphical Processing Units can accelerate such problems either singly, using a data-parallel decomposition scheme, or as multiple devices, using a domain decomposition approach. We implement […]
Feb, 5
Auto-Generation and Auto-Tuning of 3D Stencil Codes on Homogeneous and Heterogeneous GPU Clusters
This paper develops and evaluates search and optimization techniques for auto-tuning 3D stencil (nearest-neighbor) computations on GPUs. Observations indicate that parameter tuning is necessary for heterogeneous GPUs to achieve optimal performance with respect to a search space. Our proposed framework takes a most concise specification of stencil behavior from the user as a single formula, […]
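For readers unfamiliar with the term, a 3D nearest-neighbor stencil updates each grid point from its six axis-aligned neighbors. The following is a minimal CPU-side sketch in plain Python, not the paper's framework or its generated GPU code; the function name `stencil_7pt` and the coefficient parameters are hypothetical and chosen only for illustration.

```python
def stencil_7pt(grid, c_center=-6.0, c_neighbor=1.0):
    """Apply a 7-point nearest-neighbor stencil to the interior of a 3D grid.

    `grid` is a nested list indexed as grid[i][j][k]. Boundary points are
    left at 0.0 in the output (an assumption made for this sketch).
    """
    nx, ny, nz = len(grid), len(grid[0]), len(grid[0][0])
    out = [[[0.0] * nz for _ in range(ny)] for _ in range(nx)]
    for i in range(1, nx - 1):
        for j in range(1, ny - 1):
            for k in range(1, nz - 1):
                neighbors = (
                    grid[i - 1][j][k] + grid[i + 1][j][k]
                    + grid[i][j - 1][k] + grid[i][j + 1][k]
                    + grid[i][j][k - 1] + grid[i][j][k + 1]
                )
                out[i][j][k] = c_center * grid[i][j][k] + c_neighbor * neighbors
    return out
```

With the default coefficients this is the discrete Laplacian on a unit-spaced grid. A GPU version would typically assign one thread per interior point, and the block and tile sizes of that mapping are the kind of parameters an auto-tuner searches over.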
Feb, 5
OpenACC-based GPU Acceleration of a 3-D Unstructured Discontinuous Galerkin Method
A GPU-accelerated discontinuous Galerkin (DG) method is presented for the solution of compressible flows on 3-D unstructured grids. The present work has employed two of the most attractive features in a new programming standard of parallel computing – OpenACC: 1) multi-platform/compiler support and 2) descriptive directive interface to upgrade a legacy CFD solver with the […]
Feb, 5
Depth Map Based Superresolution Method in 3D Reconstruction
In this thesis, a superresolution method for geometry reconstruction is presented. Superresolution methods have already been used to improve texture resolution and RGB images. To a lesser degree, they have also been used to improve the quality of depth maps. The objective of this thesis is to validate the hypothesis that superresolution methods […]
Feb, 5
Improved Implementation of Simulation for Membrane Computing on the Graphic Processing Unit
Membrane computing is a theoretical model of computation inspired by the structure and functioning of cells. Membrane computing models naturally have a parallel structure. Most simulations of membrane computing have been done serially on a machine with a central processing unit (CPU). This has neglected the advantage of parallelism in […]
Feb, 5
Bone structure analysis on multiple GPGPUs
Osteoporosis is a disease that affects a growing number of people by increasing the fragility of their bones. To improve the understanding of bone quality, large-scale computer simulations are applied. A fast, scalable, and memory-efficient solver for such problems is ParOSol. It uses the preconditioned conjugate gradient algorithm with a multigrid preconditioner. […]
Feb, 4
Strategies for Maximizing Utilization in multi-CPU & multi-GPU Heterogeneous Architectures
This paper explores the possibility of efficiently executing a single application using multicores simultaneously with multiple GPU accelerators under a parallel task programming paradigm. In particular, we address the challenge of extending a parallel for template to allow its exploitation on heterogeneous architectures. Previous task frameworks that offer support for heterogeneous systems implement a variety […]
Feb, 4
Parallel implementation of 3D protein structure similarity searches using a GPU and the CUDA
Searching for similar 3D protein structures is one of the primary processes employed in the field of structural bioinformatics. However, the computational complexity of this process means that it is constantly necessary to search for new methods that can perform such a process faster and more efficiently. Finding molecular substructures that complex protein structures have […]
Feb, 4
Fast Acceleration of 2D Wave Propagation Simulations Using Modern Computational Accelerators
Recent developments in modern computational accelerators like Graphics Processing Units (GPUs) and coprocessors provide great opportunities for making scientific applications run faster than ever before. However, efficient parallelization of scientific code using new programming tools like CUDA requires a high level of expertise that is not available to many scientists. This, plus the fact that […]
Feb, 4
A Real-Time, GPU-Based, Non-Imaging Back-End for Radio Telescopes
Since the discovery of RRATs, interest in single pulse radio searches has increased dramatically. Due to the large data volumes generated by these searches, especially in planned surveys for future radio telescopes, such searches have to be conducted in real time. This has led to the development of a multitude of search techniques and real-time pipeline […]
Feb, 4
A Scalable Hybrid FPGA/GPU FX Correlator
Radio astronomical imaging arrays comprising large numbers of antennas, O(10^2-10^3), have posed a signal processing challenge because of the required O(N^2) cross correlation of signals from each antenna and the requisite signal routing. This motivated the implementation of a Packetized Correlator architecture that applies Field Programmable Gate Arrays (FPGAs) to the O(N) "F-stage" transforming time domain […]
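To give a sense of the O(N^2) scaling mentioned above, the short sketch below counts the pairwise products a correlator has to form for N antennas. It is an illustration only: the function name `correlation_products` is hypothetical, and the count assumes a single polarization and a single frequency channel, which the excerpt does not specify.

```python
def correlation_products(n_antennas, include_autos=True):
    """Count antenna-pair products for an N-antenna array.

    Cross-correlations: one per unordered pair, N * (N - 1) / 2.
    Autocorrelations (optional): one per antenna, N.
    Assumes one polarization and one frequency channel.
    """
    cross = n_antennas * (n_antennas - 1) // 2
    return cross + n_antennas if include_autos else cross


for n in (100, 1000):
    print(n, correlation_products(n, include_autos=False))
```

Going from 100 to 1000 antennas raises the number of cross-correlations from 4950 to 499500, roughly a hundredfold increase for a tenfold increase in antennas, while the per-antenna "F-stage" work grows only linearly.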
Feb, 2
Parallelization of the Algorithm WHAM with NVIDIA CUDA
The aim of my thesis is to parallelize the Weighted Histogram Analysis Method (WHAM), a popular algorithm used to calculate the free energy of a molecular system in Molecular Dynamics simulations. WHAM works as a post-processing step in cooperation with another algorithm called Umbrella Sampling. The purpose of Umbrella Sampling is to add a biasing […]