This paper presents a numerical study on a fast marching method based back projection reconstruction algorithm for photoacoustic tomography in heterogeneous media. Transcranial imaging is used here as a case study. To correct for the phase aberration from the heterogeneity (i.e., skull), the fast marching method is adopted to compute the phase delay based on […]

January 19, 2015 by hgpu

Obtaining a thermodynamically accurate phase diagram through numerical calculations is a computationally expensive problem that is crucially important to understanding the complex phenomena of solid state physics, such as superconductivity. In this work we show how this type of analysis can be significantly accelerated through the use of modern GPUs. We illustrate this with a […]

January 2, 2015 by hgpu

This paper describes some applications of GPU acceleration in ab initio nuclear structure calculations. Specifically, we discuss GPU acceleration of the software package MFDn, a parallel nuclear structure eigensolver. We modify the matrix construction stage to run partly on the GPU. On the Titan supercomputer at the Oak Ridge Leadership Computing Facility, this produces a […]

December 22, 2014 by hgpu

We adopt CUDA-capable Graphics Processing Units (GPUs) for Landau, Coulomb and maximally Abelian gauge fixing in 3+1 dimensional SU(3) and SU(2) lattice gauge field theories. A combination of simulated annealing and overrelaxation is used to aim for the global maximum of the gauge functional. We use a fine-grained degree of parallelism to achieve the […]

December 12, 2014 by hgpu

We describe a highly optimized implementation of MPI domain decomposition in a GPU-enabled, general-purpose molecular dynamics code, HOOMD-blue (Anderson and Glotzer, arXiv:1308.5587). Our approach is inspired by a traditional CPU-based code, LAMMPS (Plimpton, J. Comp. Phys. 117, 1995), but is implemented within a code that was designed for execution on GPUs from the start (Anderson […]

December 12, 2014 by hgpu

The gap between the cost of moving data and the cost of computing continues to grow, making it ever harder to design iterative solvers on extreme-scale architectures. This problem can be alleviated by alternative algorithms that reduce the amount of data movement. We investigate this in the context of Lattice Quantum Chromodynamics and implement such […]

December 9, 2014 by hgpu

Radio propagation simulation tools are important for prediction and verification of the radio signal coverage by individual transmitters or transmitter networks such as mobile phone cellular networks. In the case of a large geographic area with a relatively high resolution, the simulation can become computationally demanding, taking a considerable amount of time to accomplish. Parallel […]

December 7, 2014 by hgpu

In this work, we present the GPU implementation of the overrelaxation method and the steepest descent method with Fourier acceleration for Landau and Coulomb gauge fixing using CUDA for SU(N) with N>2. A multi-GPU implementation of the overrelaxation method is also presented using MPI and CUDA. The GPU performance was measured on BlueWaters and compared against […]

December 5, 2014 by hgpu

Electrical power requirements will be a constraint on the future growth of Distributed High Throughput Computing (DHTC) as used by High Energy Physics. Performance-per-watt is a critical metric for the evaluation of computer architectures for cost-efficient computing. Additionally, future performance growth will come from heterogeneous, many-core, and high computing density platforms with specialized processors. […]

December 5, 2014 by hgpu

As a generic example for crystals where the crystal-fluid interface tension depends on the orientation of the interface relative to the crystal lattice axes, the nearest neighbor Ising model on the simple cubic lattice is studied over a wide temperature range, both above and below the roughening transition temperature. Using a thin film geometry $L_x […]

November 25, 2014 by hgpu

We study the non-equilibrium ageing behaviour of the +/-J Edwards-Anderson model in three dimensions for samples of size up to N=128^3 and for up to 10^8 Monte Carlo sweeps. In particular we are interested in the change of the ageing when crossing from the spin-glass phase to the ferromagnetic phase. The necessary long simulation […]

November 25, 2014 by hgpu

We present the Lattice QCD application CL2QCD, which is based on OpenCL and can run on Graphics Processing Units as well as on common CPUs. We focus on implementation details as well as performance results of selected features. CL2QCD has been successfully applied in LQCD studies at finite temperature and density and […]

November 20, 2014 by hgpu