Gaurav Budjade
Nowadays, the advancements in internet technology are increasing by leaps and bounds. This has lead to the increase in threats by attackers, consequently compromising system security. Intrusion detection systems (IDS) provide an intelligent way to provide capable system security. Traditionally, IDS’s have been designed using several statistical based methods such as classification algorithms or artificial […]
View View   Download Download (PDF)   
Coleman Kendrick
The purpose of this project is to develop a computer model to investigate the formation and life cycle of classical novae. A nova is an orbiting system consisting of a white dwarf and star. Over time, the white dwarf pulls hydrogen gas from the star which gathers onto the surface of the white dwarf (the […]
View View   Download Download (PDF)   
Philippe Helluy, Jonathan Jung
In this work we propose an efficient finite volume approximation of two-fluid flows. Our scheme is based on three ingredients. We first construct a conservative scheme that removes the pressure oscillations phenomenon at the interface. The construction relies on a random sampling at the interface [6, 5]. Secondly, we replace the exact Riemann solver by […]
View View   Download Download (PDF)   
R. Gandham, D. S. Medina, T. Warburton
We discuss the development, verification, and performance of a GPU accelerated discontinuous Galerkin method for the solutions of two dimensional nonlinear shallow water equations. The shallow water equations are hyperbolic partial differential equations and are widely used in the simulation of tsunami wave propagations. Our algorithms are tailored to take advantage of the single instruction […]
View View   Download Download (PDF)   
David S Medina, Amik St-Cyr, T. Warburton
The inability to predict lasting languages and architectures led us to develop OCCA, a C++ library focused on host-device interaction. Using run-time compilation and macro expansions, the result is a novel single kernel language that expands to multiple threading languages. Currently, OCCA supports device kernel expansions for the OpenMP, OpenCL, and CUDA platforms. Computational results […]
View View   Download Download (PDF)   
Adam Gudys, Sebastian Deorowicz
Multiple sequence alignment is a crucial task in a number of biological analyses like secondary structure prediction, domain searching, phylogeny, etc. MSAProbs is currently the most accurate alignment algorithm, but its effectiveness is obtained at the expense of computational time. In the paper we present QuickProbs, the variant of MSAProbs customised for graphics processors. We […]
Sylvain Collange, David Defour, Stef Graillat, Roman Iakymchuk
On modern multi-core, many-core, and heterogeneous architectures, floating-point computations, especially reductions, may become non-deterministic and thus non-reproducible mainly due to non-associativity of floating-point operations. We introduce a solution to compute deterministic sums of floating-point numbers efficiently and with the best possible accuracy. Our multi-level algorithm consists of two main stages: a filtering stage that uses […]
View View   Download Download (PDF)   
Alessio Sclocco, Henri E. Bal, Jason Hessels, Joeri van Leeuwen, Rob V. van Nieuwpoort
Dedispersion is a basic algorithm to reconstruct impulsive astrophysical signals. It is used in high sampling-rate radio astronomy to counteract temporal smearing by intervening interstellar medium. To counteract this smearing, the received signal train must be dedispersed for thousands of trial distances, after which the transformed signals are further analyzed. This process is expensive on […]
View View   Download Download (PDF)   
Carlo del Mundo, Wu-chun Feng
The fast Fourier transform (FFT), a spectral method that computes the discrete Fourier transform and its inverse, pervades many applications in digital signal processing, such as imaging, tomography, and software-defined radio. Its importance has caused the research community to expend significant resources to accelerate the FFT, of which FFTW is the most prominent example. With […]
View View   Download Download (PDF)   
Jing Zhang, Hao Wang, Heshan Lin, Wu-chun Feng
By scheduling multiple applications with complementary resource requirements on a smaller number of compute nodes, we aim to improve performance, resource utilization, energy consumption, and energy efficiency simultaneously. In addition to our naive consolidation approach, which already achieves the aforementioned goals, we propose a new energy efficiency-aware (EEA) scheduling policy and compare its performance with […]
View View   Download Download (PDF)   
Konstantinos Krommydas, Muhsen Owaida, Christos D. Antonopoulos, Nikolaos Bellas, Wu-chun Feng
The proliferation of heterogeneous computing systems presents the parallel computing community with the challenge of porting legacy and emerging applications to multiple processors with diverse programming abstractions. OpenCL is a vendor-agnostic and industry-supported programming model that offers code portability on heterogeneous platforms, allowing applications to be developed once and deployed "anywhere". In this paper, we […]
View View   Download Download (PDF)   
Trevor Bekolay, James Bergstra, Eric Hunsberger, Travis DeWolf, Terrence C. Stewart, Daniel Rasmussen, Xuan Choo, Aaron Russell Voelker, Chris Eliasmith
Neuroscience currently lacks a comprehensive theory of how cognitive processes can be implemented in a biological substrate. The Neural Engineering Framework (NEF) proposes one such theory, but has not yet gathered significant empirical support, partly due to the technical challenge of building and simulating large-scale models with the NEF. Nengo is a software tool that […]
Page 1 of 612345...Last »

* * *

* * *

* * *

Free GPU computing nodes at

Registered users can now run their OpenCL application at We provide 1 minute of computer time per each run on two nodes with two AMD and one nVidia graphics processing units, correspondingly. There are no restrictions on the number of starts.

The platforms are

Node 1
  • GPU device 0: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 @ 2.8GHz 1055T
  • RAM: 12GB
  • OS: OpenSUSE 11.4
  • SDK: AMD APP SDK 2.8
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.2
  • SDK: nVidia CUDA Toolkit 5.0.35, AMD APP SDK 2.8

Completed OpenCL project should be uploaded via User dashboard (see instructions and example there), compilation and execution terminal output logs will be provided to the user.

The information send to will be treated according to our Privacy Policy

HGPU group © 2010-2014

All rights belong to the respective authors

Contact us: