Iterative stencil computations are important pattern of computations in different computational fields such as physics or chemistry simulations. A stencil computation repeatedly updates each point of a d-dimensional grid as a function of itself and its near neighbors. As the demand for more and more compute power is growing rapidly in different fields of research, […]

April 8, 2015 by hgpu

This work describes the challenges presented by porting parts ofthe Gysela code to the Intel Xeon Phi coprocessor, as well as techniques used for optimization, vectorization and tuning that can be applied to other applications. We evaluate the performance of somegeneric micro-benchmark on Phi versus Intel Sandy Bridge. Several interpolation kernels useful for the Gysela […]

March 22, 2015 by hgpu

We discuss several strategies to implement Dykstra’s projection algorithm on NVIDIA’s compute unified device architecture (CUDA). Dykstra’s algorithm is the central step in and the computationally most expensive part of statistical multi-resolution methods. It projects a given vector onto the intersection of convex sets. Compared with a CPU implementation our CUDA implementation is one order […]

March 14, 2015 by hgpu

VMAT optimization is a computationally challenging problem due to its large data size, high degrees of freedom, and many hardware constraints. High-performance graphics processing units have been used to speed up the computations. However, its small memory size cannot handle cases with a large dose-deposition coefficient (DDC) matrix. This paper is to report an implementation […]

March 6, 2015 by hgpu

Monte Carlo (MC) method has been recognized the most accurate dose calculation method for radiotherapy. However, its extremely long computation time impedes clinical applications. Recently, a lot of efforts have been made to realize fast MC dose calculation on GPUs. Nonetheless, most of the GPU-based MC dose engines were developed in NVidia CUDA environment. This […]

March 6, 2015 by hgpu

Quarks and gluons are the building blocks of all hadronic matter, like protons and neutrons. Their interaction is described by Quantum Chromodynamics (QCD), a theory under test by large scale experiments like the Large Hadron Collider (LHC) at CERN and in the future at the Facility for Antiproton and Ion Research (FAIR) at GSI. However, […]

March 3, 2015 by hgpu

Stochastic electrodynamics is a classical theory which assumes that the physical vacuum consists of classical stochastic fields with average energy $frac{1}{2}hbar omega$ in each mode, i.e., the zero-point Planck spectrum. While this classical theory explains many quantum phenomena related to harmonic oscillator problems, hard results on nonlinear systems are still lacking. In this work the […]

February 27, 2015 by hgpu

We present a fast GPU implementation of the image reconstruction routine, for a novel two strip PET detector that relies solely on the time of flight measurements.

February 27, 2015 by hgpu

High performance computing of Meshless Time Domain Method (MTDM) on multi-GPU using the supercomputer HA-PACS (Highly Accelerated Parallel Advanced system for Computational Sciences) at University of Tsukuba is investigated. Generally, the finite difference time domain (FDTD) method is adopted for the numerical simulation of the electromagnetic wave propagation phenomena. However, the numerical domain must be […]

February 23, 2015 by hgpu

Hyperbolic conservation laws are important mathematical models for describing many phenomena in physics or engineering. The Finite Volume (FV) method and the Discontinuous Galerkin (DG) methods are two popular methods for solving conservation laws on computers. Those two methods are good candidates for parallel computing: a) they require a large amount of uniform and simple […]

February 23, 2015 by hgpu

Here we present the revised and newly rewritten version of our earlier published CAVE package [J. Busa et al., Comput. Phys. Commun. 181 (2010) 2116] which was originally written in FORTRAN. The package has been rewritten in C language, the algorithm has been parallelized and implemented using OpenCL. This makes the program convenient to run […]

February 10, 2015 by hgpu

A system in a metastable state needs to overcome a certain free energy barrier to form a droplet of the stable phase. Standard treatments assume spherical droplets, but this is not appropriate in the presence of an anisotropy, such as for crystals. The anisotropy of the system has a strong effect on their surface free […]

February 6, 2015 by hgpu