Cho Hong Ling
This thesis presents the results of a study into the use of graphical processing units (GPUs) in the simulation and modelling of gravitational microlensing. Two simulation approaches were investigated: magnification maps and the use of a dynamic engine for directly simulating gravitational microlensing light curves. It was found that the GPUs are able to speed […]
Laurie Elizabeth Miller
There is an increasing need for computational power to drive software tools used in power systems planning and operations, since the emergence of modern energy markets and recent renewable generation technology fundamentally alter how energy flows through the existing power grid. While special-purpose hardware, including supercomputers, has been explored for this purpose, inexpensive commodity hardware […]
Cedric Nugteren, Gert-Jan van den Braak, Henk Corporaal, Henri Bal
As modern GPUs rely partly on their on-chip memories to counter the imminent off-chip memory wall, the efficient use of their caches has become important for performance and energy. However, optimising cache locality systematically requires insight into and prediction of cache behaviour. On sequential processors, stack distance or reuse distance theory is a well-known means […]
Yannick van Bavel
Ultrasound scanners are often used in medical diagnostics for visualising body parts without entering the body. An image is created by visualising the reflections of an ultrasound pulse transmitted into the body. Current scanners use a scanning method that creates an image line by line, using a separate focused pulse for each line. This method results in high […]
Giovanni Agosta, Alessandro Barenghi, Gerardo Pelosi
Graphics Processing Units (GPUs) are increasingly popular in the field of high-performance computing for their ability to provide computational power for massively parallel problems at a reduced cost. However, the programming model exposed by the GPGPU software development tools is often insufficient to achieve full performance, and a major rethinking of algorithmic choices is needed. […]
Ying-Chih Lin, Chien-Liang Huang, Chin-Sheng Chen, Wen-Chung Chang, Yu-Jen Chen, Chia-Yuan Liu
Image registration is widely used in biomedical imaging, but biomedical images contain too much texture and noise for precise registration. Achieving good registration performance requires more complex image processing, which comes at a high computational cost. To address the real-time requirement, this paper […]
Prasanna Balaprakash, Karl Rupp, Azamat Mametjanov, Robert B. Gramacy, Paul D. Hovland, Stefan M. Wild
We focus on a design-of-experiments methodology for developing empirical performance models of GPU kernels. Recently, we developed an iterative active learning algorithm that adaptively selects parameter configurations in batches for concurrent evaluation on CPU architectures in order to build performance models over the parameter space. In this paper, we illustrate the adoption of the algorithm […]
Jorge Frances Monllor, Sergio Bleda Perez, Andres Marquez Ruiz, Cristian Neipp Lopez, Sergi Gallego Rico, Beatriz Otero Calvino, Augusto Belendez Vazquez
In this work, a unified treatment of solid and fluid vibration problems is developed by means of the Finite-Difference Time-Domain (FDTD) method. The proposed scheme introduces a scaling factor in the velocity fields that improves the performance of the method and the vibration analysis in heterogeneous media. In order to accurately reproduce the interaction of […]
Jacco Bikker, Jeroen van Schijndel
In this paper, we present a simple, yet efficient implementation of the path tracing algorithm for GPUs. A reformulation of Russian Roulette is used to achieve high SIMT utilization, which leads to real-time performance in Kajiya’s classic scene, using a single GPU. We apply our scheme to larger scenes in the Brigade system, an experimental […]
Jacco Bikker, Jeroen van Schijndel
We investigate GPU path tracing performance in the context of real-time rendering for games. We propose a reformulation of Russian roulette, as well as an efficient implementation of the path regeneration algorithm by Novak et al. [Novak et al. 2010]. We show that a combination of these algorithms provides high performance for a variety of […]
Andrea Bartezzaghi
The study of thin structures is now very common and is useful in many fields. An important example is the analysis of sail dynamics. In this context, accurate simulations of the interaction between the sail and the wind are also required. However, fluid-structure interaction problems of this kind are computationally very expensive. The first objective of this […]
Cedric Nugteren, Pieter Custers, Henk Corporaal
This paper presents a technique to fully automatically generate efficient and readable code for parallel processors. We base our approach on skeleton-based compilation and "algorithmic species", an algorithm classification of program code. We use a tool to automatically annotate C code with species information where possible. The annotated program code is subsequently fed into the […]

* * *


Free GPU computing nodes at

Registered users can now run their OpenCL applications at […]. We provide 1 minute of compute time per run on two nodes: one with two AMD graphics processing units, and one with one AMD and one nVidia graphics processing unit. There are no restrictions on the number of runs.

The platforms are

Node 1
  • GPU device 0: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 1055T @ 2.8GHz
  • RAM: 12GB
  • OS: OpenSUSE 11.4
  • SDK: AMD APP SDK 2.8
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.2
  • SDK: nVidia CUDA Toolkit 5.0.35, AMD APP SDK 2.8

Completed OpenCL projects should be uploaded via the User dashboard (see the instructions and example there); the terminal output logs from compilation and execution will be provided to the user.

The information sent to […] will be treated according to our Privacy Policy.

HGPU group © 2010-2014

All rights belong to the respective authors
