This paper concerns the development of a high-performance implementation of the Particle-in-Cell method for plasma simulation on Intel Xeon Phi coprocessors. We discuss the suitability of the method for the Xeon Phi architecture and present our experience porting and optimizing the existing parallel Particle-in-Cell code PICADOR. Direct porting with no code modification gives performance on Xeon […]

May 28, 2015 by hgpu

Power density constraints are limiting the performance improvements of modern CPUs. To address this we have seen the introduction of lower-power, multi-core processors, but the future will be even more exciting. In order to stay within the power density limits but still obtain Moore’s Law performance/price gains, it will be necessary to parallelize algorithms to […]

May 20, 2015 by hgpu

Highly-parallel graphics processing units (GPUs) can improve the speed of micromagnetic simulations significantly as compared to conventional computing using central processing units (CPUs). We present a strategy for performing GPU-accelerated micromagnetic simulations by utilizing cost-effective GPU access offered by cloud computing services with an open-source Python-based program for running the MuMax3 micromagnetics code remotely. We […]

May 10, 2015 by hgpu

Iterative stencil computations are an important pattern of computation in fields such as physics and chemistry simulation. A stencil computation repeatedly updates each point of a d-dimensional grid as a function of itself and its near neighbors. As the demand for more and more compute power is growing rapidly in different fields of research, […]

April 8, 2015 by hgpu
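
To make the stencil definition in the excerpt above concrete, here is a minimal sketch of an iterative 3-point stencil on a 1-D grid. The averaging rule, the fixed boundary points, and the names `jacobi_step` and `run_stencil` are assumptions chosen for illustration; they are not taken from the paper.

```python
def jacobi_step(grid):
    """Apply one sweep of a 3-point averaging stencil to a 1-D grid.

    Each interior point becomes the mean of itself and its two neighbors.
    The two boundary points are left unchanged (an assumed boundary rule).
    """
    new = list(grid)  # write into a copy so every update reads old values
    for i in range(1, len(grid) - 1):
        new[i] = (grid[i - 1] + grid[i] + grid[i + 1]) / 3.0
    return new


def run_stencil(grid, steps):
    """Repeat the stencil sweep `steps` times and return the final grid."""
    for _ in range(steps):
        grid = jacobi_step(grid)
    return grid


if __name__ == "__main__":
    print(run_stencil([0.0, 3.0, 0.0], 1))
```

Each point depends only on values from the previous sweep, so all interior updates within one sweep are independent of each other, which is what makes this pattern a natural fit for parallel hardware.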

This work describes the challenges presented by porting parts of the Gysela code to the Intel Xeon Phi coprocessor, as well as techniques used for optimization, vectorization and tuning that can be applied to other applications. We evaluate the performance of some generic micro-benchmarks on Phi versus Intel Sandy Bridge. Several interpolation kernels useful for the Gysela […]

March 22, 2015 by hgpu

We discuss several strategies to implement Dykstra’s projection algorithm on NVIDIA’s compute unified device architecture (CUDA). Dykstra’s algorithm is the central step in and the computationally most expensive part of statistical multi-resolution methods. It projects a given vector onto the intersection of convex sets. Compared with a CPU implementation our CUDA implementation is one order […]

March 14, 2015 by hgpu
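
As a small illustration of the projection described in the excerpt above, the following sketch applies Dykstra's algorithm to two convex sets in plain Python. The choice of sets (a unit box and a half-space), the fixed iteration count, and all function names are assumptions made for this example; it is a serial sketch and does not reflect the paper's CUDA strategies.

```python
def project_box(x, lo=0.0, hi=1.0):
    """Project x onto the box [lo, hi]^n by clamping each coordinate."""
    return [min(max(v, lo), hi) for v in x]


def project_halfspace(x, a, b):
    """Project x onto the half-space {z : a . z <= b}."""
    violation = sum(ai * xi for ai, xi in zip(a, x)) - b
    if violation <= 0:
        return list(x)  # already inside the half-space
    norm2 = sum(ai * ai for ai in a)
    return [xi - violation / norm2 * ai for ai, xi in zip(a, x)]


def dykstra(x, proj_a, proj_b, iters=100):
    """Approximate the projection of x onto the intersection of two convex sets.

    p and q are the correction terms that distinguish Dykstra's algorithm
    from plain alternating projections.
    """
    p = [0.0] * len(x)
    q = [0.0] * len(x)
    for _ in range(iters):
        y = proj_a([xi + pi for xi, pi in zip(x, p)])
        p = [xi + pi - yi for xi, pi, yi in zip(x, p, y)]
        x = proj_b([yi + qi for yi, qi in zip(y, q)])
        q = [yi + qi - xi for yi, qi, xi in zip(y, q, x)]
    return x


if __name__ == "__main__":
    halfspace = lambda z: project_halfspace(z, [1.0, 1.0], 1.0)
    print(dykstra([2.0, 2.0], project_box, halfspace))
```

For the point (2, 2), the intersection of the unit box with the half-space x0 + x1 <= 1 is a triangle, and its nearest point to (2, 2) is (0.5, 0.5), which the iteration approaches.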

VMAT optimization is a computationally challenging problem due to its large data size, high degrees of freedom, and many hardware constraints. High-performance graphics processing units have been used to speed up the computations. However, their small memory size cannot handle cases with a large dose-deposition coefficient (DDC) matrix. This paper reports an implementation […]

March 6, 2015 by hgpu

The Monte Carlo (MC) method has been recognized as the most accurate dose calculation method for radiotherapy. However, its extremely long computation time impedes clinical applications. Recently, considerable effort has been made to realize fast MC dose calculation on GPUs. Nonetheless, most of the GPU-based MC dose engines were developed in the NVIDIA CUDA environment. This […]

March 6, 2015 by hgpu

Quarks and gluons are the building blocks of all hadronic matter, like protons and neutrons. Their interaction is described by Quantum Chromodynamics (QCD), a theory under test by large scale experiments like the Large Hadron Collider (LHC) at CERN and in the future at the Facility for Antiproton and Ion Research (FAIR) at GSI. However, […]

March 3, 2015 by hgpu

Stochastic electrodynamics is a classical theory which assumes that the physical vacuum consists of classical stochastic fields with average energy $\frac{1}{2}\hbar\omega$ in each mode, i.e., the zero-point Planck spectrum. While this classical theory explains many quantum phenomena related to harmonic oscillator problems, hard results on nonlinear systems are still lacking. In this work the […]

February 27, 2015 by hgpu

We present a fast GPU implementation of the image reconstruction routine for a novel two-strip PET detector that relies solely on time-of-flight measurements.

February 27, 2015 by hgpu

High-performance computing of the Meshless Time Domain Method (MTDM) on multiple GPUs using the supercomputer HA-PACS (Highly Accelerated Parallel Advanced system for Computational Sciences) at the University of Tsukuba is investigated. Generally, the finite difference time domain (FDTD) method is adopted for the numerical simulation of electromagnetic wave propagation phenomena. However, the numerical domain must be […]

February 23, 2015 by hgpu