Apr, 21

A GPU-Based Enhanced Genetic Algorithm for Power-Aware Task Scheduling Problem in HPC Cloud

In this paper, we consider power-aware task scheduling (PATS) in HPC clouds. Users request virtual machines (VMs) to execute their tasks. Each task is executed on one single VM, and requires a fixed number of cores (i.e., processors), computing power (million instructions per second – MIPS) of each core, a fixed start time and non-preemption […]
Apr, 21

Rapid Rabbit: Highly Optimized GPU Accelerated Cone-Beam CT Reconstruction

Graphical processing units (GPUs) have become widely adopted in the medical imaging community. The parallel SIMD nature of GPUs maps perfectly to many reconstruction algorithms. Because of this, it is relatively straightforward to parallelize common reconstruction algorithms (e.g. FDK backprojection). This means that significant performance improvements must come from careful memory optimizations, exploiting ASICs and […]
Apr, 21

GACO: A GPU-based High Performance Parallel Multi-ant Colony Optimization Algorithm

As a population-based algorithm, Ant Colony Optimization (ACO) is intrinsically massively parallel, and is therefore expected to be well suited to implementation on GPUs (Graphics Processing Units). In this paper, we present a novel ant colony optimization algorithm (called GACO), which is based on Compute Unified Device Architecture (CUDA) enabled GPUs. In the GACO algorithm, we utilize […]
Apr, 21

Computational cost estimates for parallel shared memory isogeometric multi-frontal solvers

In this paper we present computational cost estimates for a parallel shared memory isogeometric multi-frontal solver. The estimates show that the ideal isogeometric shared memory parallel direct solver scales as O(p^2 log(N/p)) for one-dimensional problems, O(Np^2) for two-dimensional problems, and O(N^(4/3)p^2) for three-dimensional problems, where N is the number of degrees of freedom, […]
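To make the scaling claims concrete, the short Python sketch below evaluates the three leading-order expressions quoted in the abstract with all constants dropped. The function name `ideal_cost` and the sample values are illustrative assumptions; `p` is used only as the symbol that appears in the abstract, whose definition is cut off in this excerpt.

```python
import math


def ideal_cost(dim, n, p):
    """Leading-order cost from the abstract, constants dropped.

    dim: spatial dimension (1, 2, or 3)
    n:   number of degrees of freedom (N in the abstract)
    p:   the parameter p from the abstract (definition not in the excerpt)
    """
    if dim == 1:
        return p**2 * math.log(n / p)
    if dim == 2:
        return n * p**2
    if dim == 3:
        return n ** (4 / 3) * p**2
    raise ValueError("dim must be 1, 2, or 3")


if __name__ == "__main__":
    # Hypothetical sample values: grow N by a factor of 8 at fixed p.
    for dim in (1, 2, 3):
        ratio = ideal_cost(dim, 8_000_000, 2) / ideal_cost(dim, 1_000_000, 2)
        print(f"{dim}D: estimate grows by a factor of {ratio:.2f}")
```

Under these expressions, growing N at fixed p barely changes the one-dimensional estimate, scales the two-dimensional estimate linearly, and scales the three-dimensional estimate superlinearly.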
Apr, 19

An Automated Tool for Converting Directive Based C Code Into Parallel CUDA Code

Parallel programming has become simple and reasonable with the advent of GPGPUs. Nowadays, many programmers move their applications to GPGPUs thanks to the availability of APIs such as NVIDIA’s CUDA. But writing a CUDA program is a very tricky task. Most of the industry makes extensive use of large serial C code bases, and they are […]
Apr, 19

Collision Detection Based on Fuzzy Scene Subdivision

We present a novel approach to performing collision detection queries between rigid and/or deformable models. Our method can handle arbitrary deformations, even discontinuous ones. To do this, we subdivide the whole scene with all objects into connected but totally independent parts using a fuzzy clustering algorithm. Then, for every part, our algorithm performs a Principal […]
Apr, 19

Architectural Support for Virtual Memory in GPUs

The proliferation of heterogeneous compute platforms, of which CPU/GPU is a prevalent example, necessitates a manageable programming model to ensure widespread adoption. A key component of this is a shared unified address space between the heterogeneous units to obtain the programmability benefits of virtual memory. Indeed, processor vendors have already begun embracing heterogeneous systems with […]
Apr, 19

Parallel Circuit Simulation on Graphical Processing Unit

The high level of integration in IC design and mixed VLSI design has brought new complexity to IC design. This complexity poses new challenges for IC simulation time. There is interest in speeding up Spice [1] simulation, because simulating a large IC can take several days. On average, 75% of simulation time is spent evaluating transistor […]
Apr, 19

Local Alignment Tool Based on Hadoop Framework and GPU Architecture

With the rapid growth of next generation sequencing technologies, such as Slex, more and more data have been discovered and published. To analyze such huge volumes of data, computational performance is an important issue. Recently, many tools, such as SOAP, have been implemented on Hadoop and GPU parallel computing architectures. BLASTP is an important tool, implemented […]
Apr, 18

The Reconstruction Toolkit (RTK), an open-source cone-beam CT reconstruction toolkit based on the Insight Toolkit (ITK)

We propose the Reconstruction Toolkit (RTK), an open-source toolkit for fast cone-beam CT reconstruction, based on the Insight Toolkit (ITK) and using GPU code extracted from Plastimatch. RTK is developed by an open consortium (see affiliations) under the non-contaminating Apache 2.0 license. The quality of the platform is checked daily with regression tests in […]
Apr, 18

Use of Multiple GPUs to Speedup the Execution of a Three-Dimensional Computational Model of the Innate Immune System

The development of computational systems that mimic the physiological response of organs or even the entire body is a complex task. One of the issues that makes this task extremely complex is the huge amount of computational resources needed to execute the simulations. For this reason, the use of parallel computing is mandatory. In this work, we […]
Apr, 18

DBMS Index for Hierarchical Data Using Nested Intervals and Residue Classes

This work describes an index based on a B+ tree and designed for storing trees that are encoded by the nested intervals method using a system of residue classes.
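As background on the nested intervals idea, the Python sketch below assigns each tree node an integer interval that lies strictly inside its parent's interval, so an ancestor test reduces to two comparisons. This is the plain depth-first integer numbering, used here as an assumption for illustration; it is not the residue-class encoding of the paper, which the excerpt does not describe. The names `encode`, `is_ancestor`, and the sample tree are hypothetical.

```python
def encode(tree, root):
    """Assign each node an interval (left, right) by depth-first traversal.

    tree maps a node to the list of its children; leaves may be omitted.
    """
    intervals = {}
    counter = 0

    def visit(node):
        nonlocal counter
        counter += 1
        left = counter
        for child in tree.get(node, []):
            visit(child)
        counter += 1
        intervals[node] = (left, counter)

    visit(root)
    return intervals


def is_ancestor(intervals, a, b):
    """True if a is a proper ancestor of b: b's interval nests inside a's."""
    left_a, right_a = intervals[a]
    left_b, right_b = intervals[b]
    return left_a < left_b and right_b < right_a


if __name__ == "__main__":
    # Hypothetical sample hierarchy.
    tree = {"root": ["a", "b"], "a": ["a1", "a2"]}
    iv = encode(tree, "root")
    print(iv)
    print(is_ancestor(iv, "a", "a2"), is_ancestor(iv, "b", "a1"))
```

With this numbering, all descendants of a node have left endpoints in one contiguous range, which is the kind of range query an ordered index such as a B+ tree can serve.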

* * *


Free GPU computing nodes

Registered users can now run their OpenCL applications on these nodes. We provide 1 minute of computer time per run on two nodes with two AMD and one nVidia graphics processing units, correspondingly. There are no restrictions on the number of runs.

The platforms are:

Node 1
  • GPU device 0: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 @ 2.8GHz 1055T
  • RAM: 12GB
  • OS: OpenSUSE 11.4
  • SDK: AMD APP SDK 2.8
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.2
  • SDK: nVidia CUDA Toolkit 5.0.35, AMD APP SDK 2.8

A completed OpenCL project should be uploaded via the User dashboard (see the instructions and example there); the terminal output logs from compilation and execution will be provided to the user.

The information you send will be treated according to our Privacy Policy.

HGPU group © 2010-2014

All rights belong to the respective authors

Contact us: