We report on the design, verification and performance of mumax3, an open-source GPU-accelerated micromagnetic simulation program. This software solves the time- and space dependent magnetization evolution in nano- to micro scale magnets using a finite-difference discretization. Its high performance and low memory requirements allow for large-scale simulations to be performed in limited time and on […]

July 1, 2014 by hgpu

We present a way to combine Vlasov and two-fluid codes for the simulation of a collisionless plasma in large domains while keeping full information of the velocity distribution in localized areas of interest. This is made possible by solving the full Vlasov equation in one region while the remaining area is treated by a 5-moment […]

June 23, 2014 by hgpu

We implement an efficient energy-minimization algorithm for finite-difference micromagnetics that proofs especially usefull for the computation of hysteresis loops. Compared to results obtained by time integration of the Landau-Lifshitz-Gilbert equation, a speedup of up to two orders of magnitude is gained. The method is implemented in a finite-difference code running on CPUs as well as […]

May 14, 2014 by hgpu

We present sample CUDA programs for the GPU computing of the Swendsen-Wang multi-cluster spin flip algorithm. We deal with the classical spin models; the Ising model, the q-state Potts model, and the classical XY model. As for the lattice, both the 2D (square) lattice and the 3D (simple cubic) lattice are treated. We already reported […]

April 4, 2014 by hgpu

Petaflop architectures are currently being utilized efficiently to perform large scale computations in Atomic, Molecular and Optical Collisions. We solve the Schroedinger or Dirac equation for the appropriate collision problem using the R-matrix or R-matrix with pseudo-states approach. We briefly outline the parallel methodology used and implemented for the current suite of Breit-Pauli and DARC […]

April 4, 2014 by hgpu

We present finite differences numerical algorithm for solving 2D spatially homogeneous Boltzmann transport equation for semiconductor superlattices (SL) subject to time dependant electric field along SL axis and constant perpendicular magnetic field. Algorithm is implemented in C language targeted to CPU and in CUDA C language targeted to commodity NVidia GPUs. We compare performance and […]

January 25, 2014 by hgpu

We test the performances of two different approaches to the computation of forces for molecular dynamics simulations on Graphics Processing Units. A "vertex-based" approach, where a computing thread is started per particle, is compared to a newly proposed "edge-based" approach, where a thread is started per each potentially non-zero interaction. We find that the former […]

January 23, 2014 by hgpu

This paper presents a graphics processing units (GPUs) implementation of the semiclassical initial value representation (SC-IVR) propagator for vibrational molecular spectroscopy calculations. The time-averaging formulation of the SC-IVR for power spectrum calculations is employed. Details about the CUDA implementation of the semiclassical code are provided. 4 molecules with an increasing number of atoms are considered […]

December 18, 2013 by hgpu

When calculating satellite trajectories in low-earth orbit, engineers need to adequately estimate aerodynamic forces. But to this day, obtaining the drag acting on the complicated shapes of modern spacecraft suffers from many sources of error. While part of the problem is the uncertain density in the upper atmosphere, this works focuses on improving the modeling […]

December 16, 2013 by hgpu

High-order numerical methods for unstructured grids combine the superior accuracy of high-order spectral or finite difference methods with the geometric flexibility of low-order finite volume or finite element schemes. The Flux Reconstruction (FR) approach unifies various high-order schemes for unstructured grids within a single framework. Additionally, the FR approach exhibits a significant degree of element […]

December 6, 2013 by hgpu

We present a scheme for the parallelization of quantum Monte Carlo on graphical processing units, focusing on bosonic systems and variational Monte Carlo. We use asynchronous execution schemes with shared memory persistence, and obtain an excellent acceleration. Comparing with single core execution, GPU-accelerated code runs over x100 faster. The CUDA code is provided along with […]

December 6, 2013 by hgpu

Oscillation probability calculations are becoming increasingly CPU intensive in modern neutrino oscillation analyses. The independency of reweighting individual events in a Monte Carlo sample lends itself to parallel implementation on a Graphics Processing Unit. The library "Prob3++" was ported to the GPU using the CUDA C API, allowing for large scale parallelized calculations of neutrino […]

December 3, 2013 by hgpu