Posts
Jun, 26
Using of New Possibilities of Fermi Architecture by Development of GPGPU Programs
The additional hardware and software features introduced in NVIDIA's new Fermi GPU architecture are described. Recommendations are given for using them when implementing scientific and technical computing algorithms on graphics processors. Application of the new possibilities of […]
Jun, 26
GPU-Accelerated Real-Time Visualization and Interaction for Coupled Fluid Dynamics
For real-time applications (dynamic data-driven applications systems like computer-assisted surgery, command and control, etc.), it is necessary to design fast or strongly-accelerated computational approaches. Reduced-order modeling (ROM) is a candidate methodology that summarizes all the parameter-dependent PDE solutions into an easy-to-compute condensed form. ROM usually requires an offline learning process that returns the essential components […]
Jun, 25
Room acoustics modelling using GPU-accelerated finite difference and finite volume methods on a face-centered cubic grid
In this paper, a room acoustics simulation using a finite difference approximation on a face-centered cubic (FCC) grid with finite volume impedance boundary conditions is presented. The finite difference scheme is accelerated on an Nvidia Tesla K20 graphics processing unit (GPU) using the CUDA programming language. A performance comparison is made between 27-point finite difference […]
Jun, 25
Differential Evolution with parallelised objective functions using CUDA
Differential Evolution (DE) algorithms can be used in various fields for problem solving where we need to find an optimal (or close to optimal) solution but don’t have a clear, straightforward method to compute it. Unfortunately, it can take a very long time to produce such a solution when implemented serially or even parallel […]
Jun, 25
String Algorithm on GPGPU
Over the last decade, the concept of general-purpose computing on graphics processors was introduced and has since gained significant adoption in the engineering industry. The use of a Graphics Processing Unit (GPU) as a many-core processing architecture for general-purpose computation yields performance improvements of several orders of magnitude. One example in leveraging […]
Jun, 25
Parallelization of specialized fluid flow simulator based on lattice Boltzmann method on a multi GPU system
The computational demands of fluid flow simulations are high, requiring large computational resources, and these applications have recently been accelerated with the help of GPUs (Graphics Processing Units). Fluid flow simulation using a discrete method called lattice Boltzmann (LB) has also been parallelized on GPUs. In this paper a single-node multi-GPU […]
Jun, 25
P-HGRMS: A Parallel Hypergraph Based Root Mean Square Algorithm for Image Denoising
This paper presents a parallel Salt and Pepper (SP) noise removal algorithm for grey-level digital images based on the Hypergraph Based Root Mean Square (HGRMS) approach. HGRMS is a generic algorithm for identifying noisy pixels in any digital image using a two-level hierarchical serial approach. However, for SP noise removal, we reduce this […]
Jun, 24
GPU Implementation of the Particle Filter
This thesis analyses the obstacles faced when adapting the particle filtering algorithm to run on massively parallel compute architectures. Graphics processing units are one example of such architectures, allowing the developer to distribute the computational load over hundreds or thousands of processor cores. This thesis studies an implementation written for NVIDIA […]
Jun, 24
Integrating Two-Way Interaction Between Fluids and Rigid Bodies in the Real-Time Particle Systems Library
In the last 15 years, video games have become a dominant form of entertainment. The popularity of video games means children are spending more of their free time playing them. Usually, the time spent on homework or studying is decreased to allow for the extended time spent on video games. In an effort to […]
Jun, 24
A Visual Approach to Investigating Shared and Global Memory Behavior of CUDA Kernels
We present an approach to investigate the memory behavior of a parallel kernel executing on thousands of threads simultaneously within the CUDA architecture. Our top-down approach allows for quickly identifying any significant differences between the execution of the many blocks and warps. As interesting warps are identified, we allow further investigation of memory behavior by […]
Jun, 24
An Energy Efficient GPGPU Memory Hierarchy with Tiny Incoherent Caches
With progressive generations and the ever-increasing promise of computing power, GPGPUs have been quickly growing in size, and at the same time, energy consumption has become a major bottleneck for them. The first level data cache and the scratchpad memory are critical to the performance of a GPGPU, but they are extremely energy inefficient due […]
Jun, 24
Provably Efficient GPU Algorithms
In this paper we present an abstract model for algorithm design on GPUs by extending the parallel external memory (PEM) model with computations in internal memory (commonly known as shared memory in GPU literature) defined in the presence of memory banks and bank conflicts. We also present a framework for designing bank-conflict-free algorithms […]

