Mario Hernandez, Jose M. Garcia, Jose M. Cecilia
Iterative stencil computations are an important computational pattern in fields such as physics and chemistry simulation. A stencil computation repeatedly updates each point of a d-dimensional grid as a function of itself and its near neighbors. As the demand for more and more compute power is growing rapidly in different fields of research, […]
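As a rough illustration of the pattern the abstract describes, the sketch below applies a 3-point averaging stencil to a 1-D grid. It is not taken from the paper: the function names `jacobi_step` and `iterate_stencil`, the averaging weights, and the fixed boundary values are all assumptions chosen for this example.

```python
def jacobi_step(grid):
    """Apply one sweep of a 3-point averaging stencil to a 1-D grid.

    Each interior point becomes the mean of itself and its two
    neighbors. Boundary points are held fixed (an assumption made
    for this sketch, not something stated in the abstract).
    """
    new = list(grid)  # write into a copy so every read sees old values
    for i in range(1, len(grid) - 1):
        new[i] = (grid[i - 1] + grid[i] + grid[i + 1]) / 3.0
    return new


def iterate_stencil(grid, steps):
    """Repeat the stencil sweep `steps` times and return the final grid."""
    for _ in range(steps):
        grid = jacobi_step(grid)
    return grid
```

Because every point of a sweep reads only the previous grid, the updates within one sweep are independent of each other, which is what makes this pattern a natural fit for GPUs.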
G. Latu, M. Haefele, J. Bigot, V. Grandgirard, T. Cartier-Michaud, F. Rozar
This work describes the challenges presented by porting parts of the Gysela code to the Intel Xeon Phi coprocessor, as well as techniques used for optimization, vectorization and tuning that can be applied to other applications. We evaluate the performance of some generic micro-benchmarks on Phi versus Intel Sandy Bridge. Several interpolation kernels useful for the Gysela […]
Jan Lebert, Lutz Kunneke, Johannes Hagemann, Stephan C. Kramer
We discuss several strategies to implement Dykstra’s projection algorithm on NVIDIA’s compute unified device architecture (CUDA). Dykstra’s algorithm is the central step in and the computationally most expensive part of statistical multi-resolution methods. It projects a given vector onto the intersection of convex sets. Compared with a CPU implementation our CUDA implementation is one order […]
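For readers unfamiliar with the method, here is a minimal CPU sketch of Dykstra's algorithm for two convex sets. It is not the authors' CUDA implementation: the two sets (the unit box and the half-space x0 + x1 <= 1), the helper names, and the fixed iteration count are assumptions picked only to keep the example small.

```python
def project_box(x, lo=0.0, hi=1.0):
    """Project x onto the box [lo, hi]^n by clamping each coordinate."""
    return [min(max(v, lo), hi) for v in x]


def project_halfspace(x, a=(1.0, 1.0), b=1.0):
    """Project x onto the half-space {y : a . y <= b}."""
    dot = sum(ai * xi for ai, xi in zip(a, x))
    if dot <= b:
        return list(x)
    scale = (dot - b) / sum(ai * ai for ai in a)
    return [xi - scale * ai for ai, xi in zip(a, x)]


def dykstra(point, iterations=100):
    """Project `point` onto the intersection of the box and half-space.

    Unlike plain alternating projections, Dykstra's method carries a
    correction term per set (p and q), so the iterates converge to the
    nearest point of the intersection.
    """
    x = list(point)
    p = [0.0] * len(x)  # correction term for the box
    q = [0.0] * len(x)  # correction term for the half-space
    for _ in range(iterations):
        y = project_box([xi + pi for xi, pi in zip(x, p)])
        p = [xi + pi - yi for xi, pi, yi in zip(x, p, y)]
        x = project_halfspace([yi + qi for yi, qi in zip(y, q)])
        q = [yi + qi - xi for yi, qi, xi in zip(y, q, x)]
    return x
```

Calling `dykstra([2.0, 2.0])` moves the point to approximately `[0.5, 0.5]`, the closest point of the triangle formed by the two sets.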
Zhen Tian, Fei Peng, Michael Folkerts, Jun Tan, Xun Jia, Steve B. Jiang
VMAT optimization is a computationally challenging problem due to its large data size, high degrees of freedom, and many hardware constraints. High-performance graphics processing units have been used to speed up the computations. However, their small memory size cannot handle cases with a large dose-deposition coefficient (DDC) matrix. This paper is to report an implementation […]
Zhen Tian, Feng Shi, Michael Folkerts, Nan Qin, Steve B. Jiang, Xun Jia
The Monte Carlo (MC) method has been recognized as the most accurate dose calculation method for radiotherapy. However, its extremely long computation time impedes clinical applications. Recently, considerable effort has gone into realizing fast MC dose calculation on GPUs. Nonetheless, most of the GPU-based MC dose engines were developed in the NVIDIA CUDA environment. This […]
Matthias Bach
Quarks and gluons are the building blocks of all hadronic matter, like protons and neutrons. Their interaction is described by Quantum Chromodynamics (QCD), a theory under test by large scale experiments like the Large Hadron Collider (LHC) at CERN and in the future at the Facility for Antiproton and Ion Research (FAIR) at GSI. However, […]
Theo M. Nieuwenhuizen, Matthew T.P. Liska
Stochastic electrodynamics is a classical theory which assumes that the physical vacuum consists of classical stochastic fields with average energy $\frac{1}{2}\hbar\omega$ in each mode, i.e., the zero-point Planck spectrum. While this classical theory explains many quantum phenomena related to harmonic oscillator problems, hard results on nonlinear systems are still lacking. In this work the […]
P. Bialas, J. Kowal, A. Strzelecki, T. Bednarski, E. Czerwinski, A. Gajos, D. Kaminska, L. Kaplon, A. Kochanowski, G. Korcyl, P. Kowalski, T. Kozik, W. Krzemien, E. Kubicz, P. Moskal, Sz. Niedzwiecki, M. Palka, L. Raczynski, Z. Rudy, O. Rundel, P. Salabura, N.G. Sharma, M. Silarski, A. Slomski, J. Smyrski, A. Wieczorek, W. Wislicki, M. Zielinski, N. Zon
We present a fast GPU implementation of the image reconstruction routine for a novel two-strip PET detector that relies solely on time-of-flight measurements.
Soichiro Ikuno, Susumu Nakata, Yuta Hirokawa, Taku Itoh
High-performance computing of the Meshless Time Domain Method (MTDM) on multiple GPUs, using the supercomputer HA-PACS (Highly Accelerated Parallel Advanced system for Computational Sciences) at the University of Tsukuba, is investigated. Generally, the finite difference time domain (FDTD) method is adopted for the numerical simulation of electromagnetic wave propagation phenomena. However, the numerical domain must be […]
Philippe Helluy, Thomas Strub, Michel Massaro, Malcolm Roberts
Hyperbolic conservation laws are important mathematical models for describing many phenomena in physics or engineering. The Finite Volume (FV) method and the Discontinuous Galerkin (DG) methods are two popular methods for solving conservation laws on computers. Those two methods are good candidates for parallel computing: a) they require a large amount of uniform and simple […]
Jan Busa Jr., Jan Busa, Shura Hayryan, Chin-Kun Hu, Ming-Chya Wu
Here we present the revised and newly rewritten version of our earlier published CAVE package [J. Busa et al., Comput. Phys. Commun. 181 (2010) 2116], which was originally written in FORTRAN. The package has been rewritten in the C language, and the algorithm has been parallelized and implemented using OpenCL. This makes the program convenient to run […]
Benjamin J. Block
A system in a metastable state needs to overcome a certain free energy barrier to form a droplet of the stable phase. Standard treatments assume spherical droplets, but this is not appropriate in the presence of an anisotropy, such as for crystals. The anisotropy of the system has a strong effect on their surface free […]
Free GPU computing nodes at hgpu.org

Registered users can now run their OpenCL applications at hgpu.org. We provide 1 minute of compute time per run on two nodes equipped with AMD and nVidia graphics processing units. There is no restriction on the number of runs.

The platforms are:

Node 1
  • GPU device 0: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 @ 2.8GHz 1055T
  • RAM: 12GB
  • OS: OpenSUSE 13.1
  • SDK: nVidia CUDA Toolkit 6.5.14, AMD APP SDK 3.0
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.3
  • SDK: AMD APP SDK 3.0

A completed OpenCL project should be uploaded via the User dashboard (see the instructions and example there). Terminal output logs from compilation and execution will be provided to the user.

The information sent to hgpu.org will be treated according to our Privacy Policy.

HGPU group © 2010-2015 hgpu.org

All rights belong to the respective authors