Posts
Nov, 19
A study of the speed and the accuracy of the Boundary Element Method as applied to the computational simulation of biological organs
In this work, a Fortran code is first developed for three-dimensional linear elastostatics using constant boundary elements; the code is based on a MATLAB code developed earlier by the author. Next, the code is parallelized using BLACS, MPI, and ScaLAPACK. Later, the parallelized code is used to demonstrate the usefulness of the Boundary Element […]
Nov, 19
Implementation of the twisted mass fermion operator in the QUDA library
We discuss an extension of the QUDA library for the Wilson twisted mass operator. A performance analysis is presented for both degenerate and non-degenerate flavor doublets. The degenerate twisted mass fermion operator runs at up to 190, 487 and 856 Gflops, for double, single and half precisions respectively on recent NVIDIA Kepler GPUs, while our […]
Nov, 19
An implicit multigrid solver for high-order compressible flow simulations on GPUs
The multigrid method has proved to be effective for a large class of numerical methods. In this study, a strategy based on the Full Approximation Storage (FAS) scheme is implemented together with the Full Multigrid (FMG) algorithm to accelerate convergence of steady-state solutions of the two-dimensional compressible Euler equations on a Graphics Processing Unit (GPU). The Beam […]
Nov, 18
Neurokernel: An Open Scalable Software Framework for Emulation and Validation of Drosophila Brain Models on Multiple GPUs
The brain of the fruit fly Drosophila melanogaster is an extremely attractive model system for reverse engineering the emergent properties of neural circuits because it implements complex sensory-driven behaviors with a nervous system comprising a number of components that is five orders of magnitude smaller than those of mammals. A powerful toolkit of well-developed genetic […]
Nov, 18
Integrating Multi-GPU Execution in an OpenACC Compiler
GPUs have become promising computing devices in current and future computer systems due to their high performance, high energy efficiency, and low price. However, the lack of high-level GPU programming models hinders the widespread adoption of GPU applications. To resolve this issue, OpenACC is developed as the first industry standard of a directive-based GPU programming […]
Nov, 18
Specification and verification of GPGPU programs
Graphics Processing Units (GPUs) are increasingly used for general-purpose applications because of their low price, energy efficiency and enormous computing power. Considering the importance of GPU applications, it is vital that the behaviour of GPU programs can be specified and proven correct formally. This paper presents a logic to verify GPU kernels written in OpenCL, […]
Nov, 18
Probing the Statistical Validity of the Ductile-to-Brittle Transition in Metallic Nanowires Using GPU Computing
We perform a large-scale statistical analysis (> 2000 independent simulations) of the elongation and rupture of gold nanowires, probing the validity and scope of the recently proposed ductile-to-brittle transition that occurs with increasing nanowire length [Wu et al., Nano Lett., 12, 910-914 (2012)]. To facilitate a high-throughput simulation approach, we implement the second-moment approximation to […]
Nov, 18
Performance and Power Comparisons Between Fermi and Cypress GPUs
In recent years, modern graphics processing units have been widely adopted in high-performance computing to solve large-scale computation problems. The leading GPU manufacturers Nvidia and AMD have introduced a series of products to the market. While sharing many similar design concepts, GPUs from these two manufacturers differ in several aspects on processor cores […]
Nov, 17
GPGPU-accelerated Interesting Interval Discovery and other Computations on GeoSpatial Datasets – A Summary of Results
For scalable solutions of GIS computations, it is imperative that the modern hybrid architecture comprising a CPU-GPU pair is exploited fully. The existing parallel algorithms and data structures port reasonably well to multicore CPUs, but poorly to GPGPUs because of the latter's atypical fine-grained, single-instruction multiple-thread (SIMT) architecture, extreme memory hierarchy and coalesced access requirements, and […]
Nov, 17
Implementation of Diamond Search Algorithm Using Parallel Processing Architecture
In video communication, the whole content of a video cannot be stored without processing, so the video must be compressed before transmission and storage; this process is called video compression. Video compression plays an important role in real-time scouting/video conferencing applications. In the entire motion-based video compression process, movement estimation […]
Nov, 17
Fast Diameter Computation of Large Sparse Graphs using GPUs
In this paper we propose a highly parallel GPU-based bounding algorithm for computing the exact diameter of large real-world sparse graphs. The diameter is defined as the length of the longest shortest path between vertices in the graph, and serves as a relevant property of all types of graphs that are nowadays frequently studied. Examples […]
Nov, 17
Efficient GPU-Implementation of Adaptive Mesh Refinement for the Shallow-Water Equations
The shallow-water equations model hydrostatic flow below a free surface for cases in which the ratio between the vertical and horizontal length scales is small, and they are used to describe waves in lakes, rivers, oceans, and the atmosphere. The equations admit discontinuous solutions, and numerical solutions are typically computed using high-resolution schemes. For many practical […]