Apr, 11

Visualizing the Radiation of the Kelvin-Helmholtz Instability

Emerging technologies in plasma simulations allow tracking billions of particles while computing their radiative spectra. We present a visualization of the relativistic Kelvin-Helmholtz Instability from a simulation performed with the fully relativistic particle-in-cell code PIConGPU, powered by 18,000 GPUs on Titan, the USA’s fastest supercomputer [1].
Apr, 11

The GENGA Code: Gravitational Encounters in N-body simulations with GPU Acceleration

We describe a GPU implementation of a hybrid symplectic N-body integrator, GENGA (Gravitational ENcounters with Gpu Acceleration), designed to integrate planet and planetesimal dynamics in the late stage of planet formation and stability analysis of planetary systems. GENGA is based on the integration scheme of the Mercury code (Chambers 1999), which handles close encounters with […]
Apr, 9

Parallel face Detection and Recognition on GPU

Human face detection and recognition find applications in domains such as surveillance, law enforcement, and interactive games. These applications require fast image processing in real time, which earlier work does not provide. In this paper we propose a technique that processes images for face detection and recognition in parallel […]
Apr, 9

Real-time and Realistic Simulation of Large-scale Deep Ocean Wave Foams Based on GPU

Constructing the deep ocean surface from small tiled patches, which greatly reduces the computation, is a very common approach in ocean simulation. But it causes obviously repetitive artifacts in the foam on the surface. We propose a dynamic threshold condition based on global coordinates that controls and reduces the periodic repetition. Using FFT method generate a small patch […]
Apr, 9

Modeling and Simulation of a Dynamic Task-Based Runtime System for Heterogeneous Multi-Core Architectures

Multi-core architectures comprising several GPUs have become mainstream in the field of High-Performance Computing. However, obtaining the maximum performance of such heterogeneous machines is challenging, as it requires carefully offloading computations and managing data movements between the different processing units. The most promising and successful approaches so far rely on task-based runtimes that abstract […]
Apr, 9

3D Hydrodynamic Simulation of Classical Nova Explosions

The purpose of this project is to develop a computer model to investigate the formation and life cycle of classical novae. A nova is an orbiting system consisting of a white dwarf and a star. Over time, the white dwarf pulls hydrogen gas from the star, which gathers onto the surface of the white dwarf (the […]
Apr, 9

Task-based FMM for heterogeneous architectures

High performance FMM is crucial for the numerical simulation of many physical problems. In a previous study, we have shown that task-based FMM provides the flexibility required to process a wide spectrum of particle distributions efficiently on multicore architectures. In this paper, we now show how such an approach can be extended to fully exploit […]
Apr, 9

A two-level task scheduler on Multiple DSP system for OpenCL

This paper addresses the problem that multiple-DSP systems do not support OpenCL programming. With the proposed compiler, runtime and kernel scheduler, an OpenCL application becomes portable not only between multiple CPUs and GPUs, but also between embedded multiple-DSP systems. First, the LLVM compiler was imported for source-to-source translation in which the translated source […]
Apr, 9

Bayesian Sparse Unsupervised Learning for Probit Models of Binary Data

We present a unified approach to unsupervised Bayesian learning of factor models for binary data with binary and spike-and-slab latent factors. We introduce a non-negative constraint in the spike-and-slab prior that eliminates the usual sign ambiguity present in factor models and lowers the generalization error on the datasets tested here. For the generative models we […]
Apr, 9

Reducing the Disk IO Bandwidth Bottleneck through Fast Floating Point Compression using Accelerators

Compute-intensive tasks in high-end high performance computing (HPC) systems often generate large amounts of data, especially floating-point data, that need to be transmitted over the network. Although computation speeds are very high, the overall performance of these applications is affected by the data transfer overhead. Moreover, as data sets are growing in size rapidly, bandwidth […]
Apr, 9

Literature Review: Parallel Computing on linear equations of linear elastic FEM stimulation with CUDA

Scientific computation is the field of study that uses computers to implement mathematical models of physical phenomena, such as FEM for deformation measurement in virtual reality. Scientific and engineering problems that would be almost impossible to solve by hand can be handled properly on a computer. A numerical algorithm calculating for different fields […]
Apr, 9

Exploring the power of GPU’s for training Deep Belief Networks

One of the major research trends currently is the evolution of heterogeneous parallel computing. GP-GPU computing is being widely used and several applications have been designed to exploit the massive parallelism that GP-GPUs have to offer. While GPUs have always been widely used in areas of computer vision for image processing, little has been done […]

* * *


Free GPU computing nodes at

Registered users can now run their OpenCL application at We provide 1 minute of computer time per run on two nodes with two AMD and one nVidia graphics processing units, correspondingly. There are no restrictions on the number of runs.

The platforms are

Node 1
  • GPU device 0: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 @ 2.8GHz 1055T
  • RAM: 12GB
  • OS: OpenSUSE 11.4
  • SDK: AMD APP SDK 2.8
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.2
  • SDK: nVidia CUDA Toolkit 5.0.35, AMD APP SDK 2.8

A completed OpenCL project should be uploaded via the User dashboard (see instructions and example there). Terminal output logs from compilation and execution will be provided to the user.

The information sent to will be treated according to our Privacy Policy

HGPU group © 2010-2014

All rights belong to the respective authors
