Alexander V. Smirnov
This paper presents a new major release of the program FIESTA (Feynman Integral Evaluation by a Sector decomposiTion Approach). The new release is mainly aimed at optimal performance at large scales when one is increasing the number of sampling points in order to reduce the uncertainty estimates. The release now supports graphical processor units (GPU) […]
Mario Schrock, Silvano Simula, Alexei Strelchenko
We present the implementation of twisted mass fermion operators for the QPhiX library. We analyze the performance on the Intel Xeon Phi (Knights Corner) coprocessor as well as on Intel Xeon Haswell CPUs. In particular, we demonstrate that on the Xeon Phi 7120P the Dslash kernel is able to reach 80% of the theoretical peak […]
Topi Siro, Ari Harju
We implement the Lanczos algorithm on an Intel Xeon Phi coprocessor and compare its performance to a multi-core Intel Xeon CPU and an NVIDIA graphics processor. The Xeon and the Xeon Phi are parallelized with OpenMP and the graphics processor is programmed with CUDA. The performance is evaluated by measuring the execution time of a […]
View View   Download Download (PDF)   
Huan-Ting Meng
In this dissertation, the hardware and API architectures of GPUs are investigated, and the corresponding acceleration techniques are applied on the traditional frequency domain finite element method (FEM), the element-level time-domain methods, and the nonlinear discontinuous Galerkin method. First, the assembly and the solution phases of the FEM are parallelized and mapped onto the granular […]
View View   Download Download (PDF)   
Nigel Cundy, Weonjong Lee
We report on our efforts to implement overlap fermions on NVIDIA GPUs using CUDA, commenting on the algorithms used, implemetation details, and the performance of our code.
View View   Download Download (PDF)   
Bei Wang, Stephane Ethier, William Tang, Khaled Ibrahim, Kamesh Madduri, Samuel Williams, Leonid Oliker
The Gyrokinetic Toroidal Code at Princeton (GTC-P) is a highly scalable and portable particle-in-cell (PIC) code. It solves the 5D Vlasov-Poisson equation featuring efficient utilization of modern parallel computer architectures at the petascale and beyond. Motivated by the goal of developing a modern code capable of dealing with the physics challenge of increasing problem size […]
View View   Download Download (PDF)   
Andreas Adelmann, Uldis Locans, Andreas Suter
Emerging processor architectures such as GPUs and Intel MICs provide a huge performance potential for high performance computing. However developing software using these hardware accelerators introduces additional challenges for the developer such as exposing additional parallelism, dealing with different hardware designs and using multiple development frameworks in order to use devices from different vendors. The […]
View View   Download Download (PDF)   
Joshua A. Anderson, M. Eric Irrgang, Sharon C. Glotzer
We design and implement HPMC, a scalable hard particle Monte Carlo simulation toolkit, and release it open source as part of HOOMD-blue. HPMC runs in parallel on many CPUs and many GPUs using domain decomposition. We employ BVH trees instead of cell lists on the CPU for fast performance, especially with large particle size disparity, […]
View View   Download Download (PDF)   
Avtech Scientific
Advanced Simulation Library is a free and open source multiphysics simulation software package and a tool for solving Partial Differential Equations. It has significant user base across many areas of engineering and science, from both industrial and academic organizations. ASL utilizes only the methods that allow efficient parallelization: Lattice Boltzmann Methods, Explicit Finite Difference, Matrix […]
V.N. Pomerantsev, V.I. Kukulin, O.A. Rubtsova, S.K. Sakhiev
A principally novel approach towards solving the few-particle (many-dimensional) quantum scattering problems is described. The approach is based on a complete discretization of few-particle continuum and usage of massively parallel computations of integral kernels for scattering equations by means of GPU. The discretization for continuous spectrum of a few-particle Hamiltonian is realized with a projection […]
View View   Download Download (PDF)   
Jens Glaser, Andrew S. Karas, Sharon C. Glotzer
We present an algorithm to simulate the many-body depletion interaction between anisotropic colloids in an implicit way, integrating out the degrees of freedom of the depletants, which we treat as an ideal gas. Because the depletant particles are statistically independent and the depletion interaction is short-ranged, depletants are randomly inserted in parallel into the excluded […]
C. A. Navarro, Huang Wei, Youjin Deng
The study of disordered spin systems through Monte Carlo simulations has proven to be a hard task due to the adverse energy landscape present at the low temperature regime, making it difficult for the simulation to escape from a local minimum. Replica based algorithms such as the Exchange Monte Carlo (also known as parallel tempering) […]
View View   Download Download (PDF)   
Page 1 of 5812345...102030...Last »

* * *

* * *

Follow us on Twitter

HGPU group

1660 peoples are following HGPU @twitter

Like us on Facebook

HGPU group

334 people like HGPU on Facebook

* * *

Free GPU computing nodes at hgpu.org

Registered users can now run their OpenCL application at hgpu.org. We provide 1 minute of computer time per each run on two nodes with two AMD and one nVidia graphics processing units, correspondingly. There are no restrictions on the number of starts.

The platforms are

Node 1
  • GPU device 0: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 @ 2.8GHz 1055T
  • RAM: 12GB
  • OS: OpenSUSE 13.1
  • SDK: nVidia CUDA Toolkit 6.5.14, AMD APP SDK 3.0
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.3
  • SDK: AMD APP SDK 3.0

Completed OpenCL project should be uploaded via User dashboard (see instructions and example there), compilation and execution terminal output logs will be provided to the user.

The information send to hgpu.org will be treated according to our Privacy Policy

HGPU group © 2010-2015 hgpu.org

All rights belong to the respective authors

Contact us: