Posts
Nov, 30
GPU Accelerated Dissipative Particle Dynamics with Parallel Cell-list Updating
This paper presents a general-purpose DPD simulation implemented entirely on the GPU, including cell-list updating, force calculation, and forward integration. The algorithms and optimizations needed to obtain the best GPU performance are discussed. Performance benchmarks show that our implementation running on a single GPU can be more than 20x faster than a conventional implementation […]
Nov, 30
Acceleration of computational quantum chemistry by heterogeneous computer architectures
Computational quantum chemistry methods such as Hartree-Fock (HF), density functional theory (DFT), or the fragment molecular orbital (FMO) method require heavy computational resources. In this study they are accelerated by using graphics processing units (GPUs) and the vector instruction set (AVX) of the latest CPUs. The PRISM algorithm for evaluating the electron repulsion integrals was vectorized […]
Nov, 30
Optimization of the Particle-based Volume Rendering for GPUs with Hiding Data Transfer Latency
In this paper, we present an optimization of particle-based volume rendering for GPU platforms. In general, data transfer between the CPU and GPU incurs long latency. Using the page-locked memory of the CUDA runtime API, the data area is selected so that data transfer between the CPU and GPU becomes faster, reducing the execution time. […]
Nov, 30
Performance and numerical accuracy evaluation of heterogeneous multicore systems for Krylov orthogonal basis computation
We study the numerical behavior of heterogeneous systems, such as CPUs with GPUs or IBM Cell processors, for some orthogonalization processes. We focus on the influence of the different floating-point arithmetic handling of these accelerators on Gram-Schmidt orthogonalization using single and double precision. We observe for dense matrices a loss of at worst 1 digit […]
Nov, 30
GPGPU Accelerated Cardiac Arrhythmia Simulations
Computational modeling of cardiac electrophysiology is a powerful tool for studying arrhythmia mechanisms. In particular, cardiac models are useful for gaining insights into experimental studies, and in the foreseeable future they will be used by clinicians to improve therapy for the patients suffering from complex arrhythmias. Such models are highly intricate, both in their geometric […]
Nov, 30
Unleashing the Power of Distributed CPU/GPU Architectures: Massive Astronomical Data Analysis and Visualization case study
Upcoming and future astronomy research facilities will systematically generate terabyte-sized data sets, moving astronomy into the petascale data era. While such facilities will provide astronomers with unprecedented levels of accuracy and coverage, the increases in dataset size and dimensionality will pose serious computational challenges for many current astronomy data analysis and visualization tools. With such […]
Nov, 30
GPU-Accelerated SPH Model for Water Waves and Other Free Surface Flows
This paper discusses the meshless numerical method Smoothed Particle Hydrodynamics and its application to water waves and nearshore circulation. In particular, we focus on an implementation of the model on the graphics processing unit (GPU) of computers, which permits low-cost supercomputing capabilities for certain types of computational problems. The implementation here runs on Nvidia graphics […]
Nov, 29
Parallel preconditioning for spherical harmonics expansions of the Boltzmann transport equation
While the Monte Carlo method for the Boltzmann transport equation for semiconductors has already been parallelized, this is much more difficult to accomplish for the deterministic spherical harmonics expansion method which requires the solution of a linear system of equations. For the typically employed iterative solvers, preconditioners are required to obtain good convergence rates. These […]
Nov, 29
Multiresolution Flow Simulations on Multi/many-core Architectures
One of the key challenges in Computational Science is closing the gap between the available computer power and its effective utilization for the simulation of complex physical systems and engineering applications. In order to achieve this goal we must minimize the time-to-solution and the related energy requirements of simulations by developing scalable software and methods […]
Nov, 29
Applications Performance on GPGPUs with the Fermi Architecture
The latest GPU architecture released by Nvidia, code-named "Fermi", is the most advanced computing GPU architecture ever built. Radical changes took place in the GPU computing architecture compared to Fermi’s predecessors such as the GT200 series and the G80s. In this dissertation the Fermi architecture is analysed, addressing the most prominent upgrades, by running extensive […]
Nov, 29
Dynamic Task Parallelism with a GPU Work-Stealing Runtime System
NVIDIA’s Compute Unified Device Architecture (CUDA) and its attached C/C++ based API went a long way towards making GPUs more accessible to mainstream programming. So far, the use of GPUs for high performance computing has been primarily restricted to data parallel applications, and with good reason. The high number of computational cores and high memory […]
Nov, 29
Directives Based Programming of GPU Accelerated Systems
Graphics Processing Units (GPUs) are commodity chips primarily used as coprocessors for processing high-definition graphics on a computer system. They possess faster processing power and efficiency in handling accurate single- and double-precision floating-point numbers, with less power consumption compared to CPUs. Realising their potential in general-purpose computing, manufacturers of these chips have […]