Carolin Wolf
Many computationally intensive applications benefit from parallel execution, whether on multiple CPU cores, through data-parallel GPGPU processing, or across several machines in a cluster. However, changing a program to run in parallel requires considerable effort and is therefore a time-consuming step during development. During the implementation, it is necessary to consider many steps […]
Carolin Wolf, Georg Dotzler, Ronald Veldema, Michael Philippsen
For scientists, it is advantageous to use a high level of abstraction for programming their simulations, so that they can focus on the problem at hand instead of struggling with low-level details. However, current HPC clusters with multiple GPUs per node only offer explicit communication to and from the GPUs, require manual work to keep […]
Carolin Wolf
Simulations such as fluid dynamics are computationally very intensive. Since the Lattice Boltzmann method uses a discrete grid of cells for simulating the flow, there are no dependencies between the individual cells during the computation of one time step. Therefore, the computation can easily be done in parallel. In recent years, multi-CPU computers have […]
Sarod Yatawatta, Sanaz Kazemi, Saleem Zaroubi
We present the GPU based acceleration of two well known nonlinear optimization routines: Levenberg-Marquardt (LM) and Limited Memory Broyden-Fletcher-Goldfarb-Shanno (LBFGS) in radio interferometric calibration. Radio interferometric calibration is a heavily compute intensive operation where the same nonlinear optimization problem has to be solved over many time intervals, with different data. We achieve a speedup of […]
Ben van Werkhoven, Jason Maassen, Frank J. Seinstra
The research area of Multimedia Content Analysis (MMCA) considers all aspects of the automated extraction of knowledge from multimedia archives and data streams. To satisfy the increasing computational demands of MMCA problems, the use of High Performance Computing (HPC) techniques is essential. As most MMCA researchers are not HPC experts, there is an urgent need […]
Dimitar Lukarski
Partial differential equations are typically solved by means of finite difference, finite volume or finite element methods resulting in large, highly coupled, ill-conditioned and sparse (non-)linear systems. In order to minimize the computing time we want to exploit the capabilities of modern parallel architectures. The rapid hardware shifts from single core to multi-core and many-core […]
Kingsley Gale-Sides
The potential for decreasing the solution time of the UK Met Office NAME III [1] Lagrangian atmospheric particle dispersion modelling code was examined. The code was ported to the EPCC Ness and Fermi0 machines and compiled with the PGI compiler. Timing benchmarks and profiling were completed for a particle-only run, and a cloud gamma […]
Ahmed Mohamed Hassan Abdalla
The latest GPU architecture released by Nvidia, code-named "Fermi", is the most advanced computing GPU architecture ever built. Radical changes took place on the GPU computing architecture compared to Fermi’s predecessors such as the GT200 series and the G80s. In this dissertation the Fermi architecture is analysed, addressing the most prominent upgrades, by running extensive […]
Massimo Cafaro, Giovanni Aloisio (Eds.)
Provides a thorough introduction and overview of existing technologies in grids, clouds and virtualization, including a brief history of the field. Examines both business and scientific applications of grids and clouds. Presents contributions from an international selection of experts in the field. Research into grid computing has been driven by the need to solve large-scale, […]
P. Zaspel, M. Griebel
We present a fully multi-GPU-based double-precision solver for the three-dimensional two-phase incompressible Navier-Stokes equations. An in-depth performance analysis shows a realistic speed-up of the order of three by comparing equally priced GPUs and CPUs and more than a doubling in energy efficiency for GPUs. We observe profound strong and weak scaling on a multi-GPU cluster.
Evan Lezar
This work considers the acceleration of matrix-based computational electromagnetic (CEM) techniques using graphics processing units (GPUs). These massively parallel processors have gained much support since late 2006, with software tools such as CUDA and OpenCL greatly simplifying the process of harnessing the computational power of these devices. As with any advances in computation, the use […]
Alejandro C. Crespo, Jose M. Dominguez, Anxo Barreiro, Moncho Gomez-Gesteira, Benedict D. Rogers
Smoothed Particle Hydrodynamics (SPH) is a numerical method commonly used in Computational Fluid Dynamics (CFD) to simulate complex free-surface flows. Simulations with this mesh-free particle method far exceed the capacity of a single processor. In this paper, as part of a dual-functioning code for either central processing units (CPUs) or Graphics Processor Units (GPUs), a […]

Free GPU computing nodes at hgpu.org

Registered users can now run their OpenCL applications at hgpu.org. We provide 1 minute of compute time per run on two nodes: one with an nVidia and an AMD graphics processing unit, the other with two AMD graphics processing units. There is no limit on the number of runs.

The platforms are:

Node 1
  • GPU device 0: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 1055T @ 2.8GHz
  • RAM: 12GB
  • OS: OpenSUSE 13.1
  • SDK: nVidia CUDA Toolkit 6.5.14, AMD APP SDK 3.0
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.3
  • SDK: AMD APP SDK 3.0

Completed OpenCL projects should be uploaded via the User dashboard (see the instructions and example there); the terminal output logs from compilation and execution will be provided to the user.

The information sent to hgpu.org will be treated according to our Privacy Policy.

HGPU group © 2010-2015 hgpu.org

All rights belong to the respective authors
