Mar, 24

High-Performance Image Synthesis for Radio Interferometry

A radio interferometer indirectly measures the intensity distribution of the sky over the celestial sphere. Since measurements are made over an irregularly sampled Fourier plane, synthesising an intensity image from interferometric measurements requires substantial processing. Furthermore there are distortions that have to be corrected. In this thesis, a new high-performance image synthesis tool (imaging tool) […]
Mar, 23

An unsupervised parallel genetic cluster algorithm for graphics processing units

During times of stock market turbulence, monitoring the intraday clustering behaviour of financial instruments allows one to better understand market characteristics and systemic risks. While genetic algorithms provide a versatile methodology for identifying such clusters, serial implementations are computationally intensive and can take a long time to converge to the global optimum. We implement a […]
Mar, 23

Computer Vision Accelerators for Mobile Systems based on OpenCL GPGPU Co-Processing

In this paper, we present an OpenCL-based heterogeneous implementation of a computer vision algorithm — image inpainting-based object removal algorithm — on mobile devices. To take advantage of the computation power of the mobile processor, the algorithm workflow is partitioned between the CPU and the GPU based on the profiling results on mobile devices, so […]
Mar, 23

Fast GPGPU-Based Elliptic Curve Scalar Multiplication

This paper presents a fast implementation to compute the scalar multiplication of elliptic curve points based on a General-Purpose computing on Graphics Processing Units (GPGPU) approach. A GPU implementation using Dan Bernstein’s Curve25519, an elliptic curve over a 255-bit prime field complying with the new 128-bit security level, computes the scalar multiplication in less than […]
Mar, 22

Accelerating Low-Fidelity Aerodynamic Codes

Low-fidelity aerodynamic codes, including panel, lifting line and vortex lattice methods, are used in preliminary aerodynamic studies in the early stages of aircraft design and constitute an important and compute-intensive part of the aircraft design process. This preliminary design phase is usually very time consuming as it involves parametric studies counting tens of […]
Mar, 22

CUDA Implementation of a Lattice Boltzmann Method and Code Optimization

We study fluid flow in a 2D lid-driven cavity for large Reynolds numbers using the multi-relaxation-time Lattice Boltzmann Method (LBM). LBM is an alternative to conventional CFD methods that solve Navier-Stokes equations to simulate incompressible fluid dynamics. In LBM, one solves the linearized Boltzmann equation on a discrete lattice to study spatio-temporal evolution of […]
Mar, 22

Applying GPU Dynamic Parallelism to High-Performance Normalization of Gene Expressions

High-density oligonucleotide microarrays allow several millions of genetic markers in a single experiment to be observed. Current bioinformatics tools for gene expression quantile data normalization are unable to process such huge data sets. In parallel with this reality, the huge volume of molecular data produced by current high-throughput technologies in modern molecular biology has increased […]
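For readers unfamiliar with the quantile normalization mentioned in this abstract, the following is a minimal serial sketch of the textbook procedure, not the paper's GPU method. The function name `quantile_normalize` and the column-per-sample layout are assumptions made for illustration, and ties are broken by position rather than averaged.

```python
def quantile_normalize(columns):
    """Quantile-normalize equal-length samples (one list per sample).

    Illustrative serial reference only; the name and data layout are
    assumptions, not taken from the paper.
    """
    n = len(columns[0])
    # Sort every sample independently.
    sorted_cols = [sorted(c) for c in columns]
    # Mean of the k-th smallest value across all samples.
    rank_means = [sum(sc[k] for sc in sorted_cols) / len(columns)
                  for k in range(n)]
    result = []
    for c in columns:
        # Indices of this sample ordered by value (ties keep input order).
        order = sorted(range(n), key=lambda i: c[i])
        out = [0.0] * n
        for rank, i in enumerate(order):
            out[i] = rank_means[rank]  # replace value by its rank mean
        result.append(out)
    return result


normalized = quantile_normalize([[5, 2, 3], [4, 1, 6]])
```

After normalization every sample contains the same set of values, which is what makes the samples comparable.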
Mar, 21

GPU Accelerated Process Planning For CNC-Machined Parts: Industrial Components to Bone Implants

To manufacture a part using a conventional 3-axis CNC machining process, one must determine a set of machining orientations. Generally, this process planning task is carried out manually by the machinist, considering decision parameters such as part visibility, machinability, machining depths, tool geometry, etc. In this work, we modelled this as a linear optimization problem; the […]
Mar, 21

Concurrent learning of a Probabilistic Graphical Model on the GPU

We introduce an algorithm for determining optimal transition paths between given configurations. The solution is obtained by solving variational equations for Freidlin–Wentzell action functionals. One of the applications of the method presented is a system controlling motion and redeployment between unit’s formations. The efficiency of the algorithm has been evaluated in a simple sandbox environment […]
Mar, 21

Implementation of Parallel Fast Hartley Transform (FHT) Using Cuda

This paper presents a parallel implementation of the Fast Hartley Transform on a Graphics Processing Unit using CUDA technology. Calculating the FHT in parallel, using multiple threads, gives a large improvement in calculation speed. A CUDA-based parallel algorithm was developed, and its experimental results were compared with those of a CPU-based sequential algorithm. Edge detection algorithms can be […]
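As background for this entry, the sketch below evaluates the discrete Hartley transform directly from its definition, H[k] = sum over n of x[n] * (cos(2*pi*n*k/N) + sin(2*pi*n*k/N)). It is a naive O(N^2) CPU reference, not the fast algorithm or the CUDA code from the paper, and the function names `dht` and `idht` are assumptions made for illustration.

```python
import math


def dht(x):
    """Naive O(N^2) discrete Hartley transform of a real sequence."""
    n = len(x)
    out = []
    for k in range(n):
        acc = 0.0
        for t in range(n):
            angle = 2.0 * math.pi * k * t / n
            # cas(angle) = cos(angle) + sin(angle) is the Hartley kernel.
            acc += x[t] * (math.cos(angle) + math.sin(angle))
        out.append(acc)
    return out


def idht(h):
    """Inverse transform: the DHT is its own inverse up to a factor 1/N."""
    n = len(h)
    return [v / n for v in dht(h)]


signal = [1.0, 2.0, 3.0, 4.0]
spectrum = dht(signal)      # approximately [10, -4, -2, 0]
restored = idht(spectrum)   # approximately the original signal
```

In this direct form each output coefficient depends only on the input, so the coefficients could in principle be computed by independent threads; a fast transform reorganizes the work to avoid the quadratic cost.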
Mar, 20

An Efficient Parallel ISODATA Algorithm Based on Kepler GPUs

ISODATA is a well-known clustering algorithm based on the nearest neighbor rule, which has been widely used in various areas. It employs a heuristic strategy allowing the clusters to split and merge as appropriate. However, since the volume of the data to be clustered in the real world is growing continuously, the efficiency of the […]
Mar, 20

A pattern recognition system for prostate mass spectra discrimination based on the CUDA parallel programming model

The aim of the present study was to implement a pattern recognition system for the discrimination of healthy from malignant prostate tumors from proteomic Mass Spectroscopy (MS) samples and to identify m/z intervals of potential biomarkers associated with prostate cancer. One hundred and six MS spectra were studied in total. Sixty-three spectra corresponded to healthy […]

* * *


Free GPU computing nodes at

Registered users can now run their OpenCL application at We provide 1 minute of computer time per run on two nodes with two AMD and one nVidia graphics processing units, respectively. There are no restrictions on the number of runs.

The platforms are

Node 1
  • GPU device 0: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 1055T @ 2.8GHz
  • RAM: 12GB
  • OS: OpenSUSE 11.4
  • SDK: AMD APP SDK 2.8
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.2
  • SDK: nVidia CUDA Toolkit 5.0.35, AMD APP SDK 2.8

A completed OpenCL project should be uploaded via the User dashboard (see instructions and example there); compilation and execution terminal output logs will be provided to the user.

The information sent to will be treated according to our Privacy Policy

HGPU group © 2010-2014

All rights belong to the respective authors
