Posts
Mar, 1
A Simulation Suite for Lattice-Boltzmann based Real-Time CFD Applications Exploiting Multi-Level Parallelism on modern Multi- and Many-Core Architectures
We present a software approach to hardware-oriented numerics which builds upon an augmented, previously published open-source set of libraries facilitating portable code development and optimisation on a wide range of modern computer architectures. In order to maximise efficiency, we exploit all levels of parallelism, including vectorisation within CPU cores, the Cell BE and GPUs, shared […]
Feb, 28
Accelerating Molecular Dynamics Simulations with GPUs
Molecular dynamics simulations are known to run for many days or weeks before completion. In this paper we explore the use of GPUs to accelerate a Lennard-Jones-based molecular dynamics simulation of up to 27000 atoms. We demonstrate speedups that exceed 100x on commodity Nvidia GPUs and discuss the strategies that allow for such exceptional speedups. […]
Feb, 28
Fast and Accurate Generalized Harmonic Analysis and Its Parallel Computation by GPU
A fast and accurate method for Generalized Harmonic Analysis is proposed. The proposed method estimates the parameters of a sinusoid and subtracts it from a target signal one by one. The frequency of the sinusoid is estimated around a peak of Fourier spectrum using binary search. The binary search can control the trade-off between the […]
Feb, 28
A Case Study for Petascale Applications in Astrophysics: Simulating Gamma-Ray Bursts
Petascale computing will allow astrophysicists to investigate astrophysical objects, systems, and events that cannot be studied by current observational means and that were previously excluded from computational study by sheer lack of CPU power and appropriate codes. Here we present a pragmatic case study, focussing on the simulation of gamma-ray bursts as a science driver […]
Feb, 28
A general relativistic evolution code on CUDA architectures
I describe the implementation of a finite-differencing code for solving Einstein’s field equations on a GPU, and measure speed-ups compared to a serial code on a CPU for different parallelization and caching schemes. Using the most efficient scheme, the (single precision) GPU code on an NVIDIA Quadro FX 5600 is shown to be up to […]
Feb, 28
Scientific Visualization in Astronomy: Towards the Petascale Astronomy Era
Astronomy is entering a new era of discovery, coincident with the establishment of new facilities for observation and simulation that will routinely generate petabytes of data. While an increasing reliance on automated data analysis is anticipated, a critical role will remain for visualization-based knowledge discovery. We have investigated scientific visualization applications in astronomy through an […]
Feb, 28
Bothnia: a dual-personality extension to the Intel integrated graphics driver
In this paper, we introduce Bothnia, an extension to the Intel production graphics driver to support a shared virtual memory heterogeneous multithreading programming model. With Bothnia, the Intel graphics device driver can support both the traditional 3D graphics rendering software stack and a new class of heterogeneous multithreaded applications, which can use both IA (Intel […]
Feb, 28
HORIZON: Accelerated General Relativistic Magnetohydrodynamics
We present Horizon, a new graphics processing unit (GPU)-accelerated code to solve the equations of general relativistic magnetohydrodynamics in a given spacetime. We evaluate the code in several test cases, including magnetized Riemann problems and rapidly rotating neutron stars, and measure the performance benefits of the GPU acceleration in comparison to our CPU-based code Thor. […]
Feb, 28
Calculation by artificial compressibility method and virtual flux method on GPU
In this study, the artificial compressibility method and the virtual flux method were implemented on GPUs. Because GPUs are recognized as massively parallel computers, DP-LUR was employed as the time integration method. In spite of the slow convergence characteristics of DP-LUR, calculation by the coupling of DP-LUR and GPU is about 15 times faster than that of […]
Feb, 28
STOCHSIMGPU: Parallel stochastic simulation for the Systems Biology Toolbox 2 for MATLAB
Motivation: The importance of stochasticity in biological systems is becoming increasingly recognised and the computational cost of biologically realistic stochastic simulations urgently requires development of efficient software. We present a new software tool STOCHSIMGPU which exploits graphics processing units (GPUs) for parallel stochastic simulations of biological/chemical reaction systems and show that significant gains in efficiency […]
Feb, 28
Automatic data movement and computation mapping for multi-level parallel architectures with explicitly managed memories
Several parallel architectures such as GPUs and the Cell processor have fast explicitly managed on-chip memories, in addition to slow off-chip memory. They also have very high computational power with multiple levels of parallelism. A significant challenge in programming these architectures is to effectively exploit the parallelism available in the architecture and manage the fast […]
Feb, 27
Fast and Memory Efficient GPU-Based Rendering of Tensor Data
Graphics hardware is advancing very fast and offers new possibilities to programmers. The new features can be used in scientific visualization to move calculations from the CPU to the graphics processing unit (GPU). This is useful especially when mixing CPU-intensive calculations with on-the-fly visualization of intermediate results. We present a method to […]