Yukihiro Komura, Yutaka Okabe
We present sample CUDA programs for the GPU computing of the Swendsen-Wang multi-cluster spin flip algorithm. We deal with the classical spin models; the Ising model, the q-state Potts model, and the classical XY model. As for the lattice, both the 2D (square) lattice and the 3D (simple cubic) lattice are treated. We already reported […]
View View   Download Download (PDF)   
Brendan M McLaughlin, Connor P Ballance
Petaflop architectures are currently being utilized efficiently to perform large scale computations in Atomic, Molecular and Optical Collisions. We solve the Schroedinger or Dirac equation for the appropriate collision problem using the R-matrix or R-matrix with pseudo-states approach. We briefly outline the parallel methodology used and implemented for the current suite of Breit-Pauli and DARC […]
View View   Download Download (PDF)   
Dmitri Priimak
We present finite differences numerical algorithm for solving 2D spatially homogeneous Boltzmann transport equation for semiconductor superlattices (SL) subject to time dependant electric field along SL axis and constant perpendicular magnetic field. Algorithm is implemented in C language targeted to CPU and in CUDA C language targeted to commodity NVidia GPUs. We compare performance and […]
View View   Download Download (PDF)   
Lorenzo Rovigatti, Petr Sulc, Istvan Z. Reguly, Flavio Romano
We test the performances of two different approaches to the computation of forces for molecular dynamics simulations on Graphics Processing Units. A "vertex-based" approach, where a computing thread is started per particle, is compared to a newly proposed "edge-based" approach, where a thread is started per each potentially non-zero interaction. We find that the former […]
Dario Tamascelli, Francesco S. Dambrosio, Riccardo Conte, Michele Ceotto
This paper presents a graphics processing units (GPUs) implementation of the semiclassical initial value representation (SC-IVR) propagator for vibrational molecular spectroscopy calculations. The time-averaging formulation of the SC-IVR for power spectrum calculations is employed. Details about the CUDA implementation of the semiclassical code are provided. 4 molecules with an increasing number of atoms are considered […]
View View   Download Download (PDF)   
J. Brent Parham, L. A. Barba
When calculating satellite trajectories in low-earth orbit, engineers need to adequately estimate aerodynamic forces. But to this day, obtaining the drag acting on the complicated shapes of modern spacecraft suffers from many sources of error. While part of the problem is the uncertain density in the upper atmosphere, this works focuses on improving the modeling […]
View View   Download Download (PDF)   
Freddie D Witherden, Antony M Farrington, Peter E Vincent
High-order numerical methods for unstructured grids combine the superior accuracy of high-order spectral or finite difference methods with the geometric flexibility of low-order finite volume or finite element schemes. The Flux Reconstruction (FR) approach unifies various high-order schemes for unstructured grids within a single framework. Additionally, the FR approach exhibits a significant degree of element […]
View View   Download Download (PDF)   
Y. Lutsyshyn
We present a scheme for the parallelization of quantum Monte Carlo on graphical processing units, focusing on bosonic systems and variational Monte Carlo. We use asynchronous execution schemes with shared memory persistence, and obtain an excellent acceleration. Comparing with single core execution, GPU-accelerated code runs over x100 faster. The CUDA code is provided along with […]
View View   Download Download (PDF)   
R. G. Calland, A. C. Kaboth, D. Payne
Oscillation probability calculations are becoming increasingly CPU intensive in modern neutrino oscillation analyses. The independency of reweighting individual events in a Monte Carlo sample lends itself to parallel implementation on a Graphics Processing Unit. The library "Prob3++" was ported to the GPU using the CUDA C API, allowing for large scale parallelized calculations of neutrino […]
Kirana Kumara P
In this work, first a Fortran code is developed for three dimensional linear elastostatics using constant boundary elements; the code is based on a MATLAB code developed by the author earlier. Next, the code is parallelized using BLACS, MPI, and ScaLAPACK. Later, the parallelized code is used to demonstrate the usefulness of the Boundary Element […]
Pushan Majumdar
OpenACC compilers allow one to use Graphics Processing Units without having to write explicit CUDA codes. Programs can be modified incrementally using OpenMP like directives which causes the compiler to generate CUDA kernels to be run on the GPUs. In this article we look at the performance gain in lattice simulations with dynamical fermions using […]
View View   Download Download (PDF)   
Michal Januszewski, Marcin Kostur
We present Sailfish, an open source fluid simulation package implementing the lattice Boltzmann method (LBM) on modern Graphics Processing Units (GPUs) using CUDA/OpenCL. We take a novel approach to GPU code implementation and use run-time code generation techniques and a high level programming language (Python) to achieve state of the art performance, while allowing easy […]
Page 1 of 1512345...10...Last »

* * *

* * *

* * *

Free GPU computing nodes at hgpu.org

Registered users can now run their OpenCL application at hgpu.org. We provide 1 minute of computer time per each run on two nodes with two AMD and one nVidia graphics processing units, correspondingly. There are no restrictions on the number of starts.

The platforms are

Node 1
  • GPU device 0: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 @ 2.8GHz 1055T
  • RAM: 12GB
  • OS: OpenSUSE 11.4
  • SDK: AMD APP SDK 2.8
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.2
  • SDK: nVidia CUDA Toolkit 5.0.35, AMD APP SDK 2.8

Completed OpenCL project should be uploaded via User dashboard (see instructions and example there), compilation and execution terminal output logs will be provided to the user.

The information send to hgpu.org will be treated according to our Privacy Policy

HGPU group © 2010-2014 hgpu.org

All rights belong to the respective authors

Contact us: