Feb, 27

Exploitation of GPUs for the Parallelisation of Probably Parallel Legacy Code

General purpose GPUs provide massive compute power, but are notoriously difficult to program. In this paper we present a complete compilation strategy to exploit GPUs for the parallelisation of sequential legacy code. Using hybrid data dependence analysis combining static and dynamic information, our compiler automatically detects suitable parallelism and generates parallel OpenCL code from sequential […]
Feb, 27

Extending the Generalized Fermat Prime Number Search Beyond One Million Digits Using GPUs

Great strides have been made in recent years in the search for ever larger prime Generalized Fermat Numbers (GFN). We briefly review the history of the GFN prime search, and describe new implementations of the ‘Genefer’ software (now available as open source) using CUDA and optimised CPU assembler which have underpinned this unprecedented progress. The […]
Feb, 26

2014 3rd International Conference on Software and Computer Applications, ICSCA 2014

All papers for the ICSCA 2014 will be published in the Journal of Lecture Notes on Software Engineering (LNSE, ISSN: 2301-3559) as one volume, and will be indexed by DOAJ, Electronic Journals Library, Engineering & Technology Digital Library, EBSCO, Ulrich’s Periodicals Directory, International Computer Science Digital Library (ICSDL), ProQuest and Google Scholar. Software Engineering Artificial […]
Feb, 26

Real-Time GPU Implementation of Transverse Oscillation Vector Velocity Flow Imaging

Rapid estimation of blood velocity and visualization of complex flow patterns are important for clinical use of diagnostic ultrasound. This paper presents real-time processing for two-dimensional (2-D) vector flow imaging which utilizes an off-the-shelf graphics processing unit (GPU). In this work, Open Computing Language (OpenCL) is used to estimate 2-D vector velocity flow in vivo […]
Feb, 26

REMODE: Probabilistic, Monocular Dense Reconstruction in Real Time

In this paper, we solve the problem of estimating dense and accurate depth maps from a single moving camera. A probabilistic depth measurement is carried out in real time on a per-pixel basis and the computed uncertainty is used to reject erroneous estimations and provide live feedback on the reconstruction progress. Our contribution is a […]
Feb, 26

Comparative evaluation of platforms for parallel Ant Colony Optimization

The rapidly growing field of nature-inspired computing concerns the development and application of algorithms and methods based on biological or physical principles. This approach is particularly compelling for practitioners in high-performance computing, as natural algorithms are often inherently parallel in nature (for example, they may be based on a "swarm"-like model that uses a population […]
Feb, 26

Full-Speed Deterministic Bit-Accurate Parallel Floating-Point Summation on Multi- and Many-Core Architectures

On modern multi-core, many-core, and heterogeneous architectures, floating-point computations, especially reductions, may become non-deterministic and thus non-reproducible mainly due to non-associativity of floating-point operations. We introduce a solution to compute deterministic sums of floating-point numbers efficiently and with the best possible accuracy. Our multi-level algorithm consists of two main stages: a filtering stage that uses […]
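The non-associativity mentioned in this abstract can be seen in a few lines of Python. This is only an illustrative sketch, not the multi-level algorithm the paper describes: the helper `naive_sum` and the sample values are made up for the example, and the standard library's `math.fsum` stands in for an order-independent, correctly rounded summation.

```python
import math

def naive_sum(values):
    """Left-to-right accumulation; the result depends on operand order."""
    total = 0.0
    for v in values:
        total += v
    return total

# The same four numbers in two orders, as two different
# parallel reduction schedules might encounter them (hypothetical data).
order_a = [1e16, 1.0, -1e16, 1.0]
order_b = [1e16, -1e16, 1.0, 1.0]

naive_a = naive_sum(order_a)   # 1.0: the first 1.0 is absorbed by 1e16
naive_b = naive_sum(order_b)   # 2.0: the large terms cancel first
exact_a = math.fsum(order_a)   # 2.0 regardless of order
exact_b = math.fsum(order_b)   # 2.0 regardless of order

print(naive_a, naive_b)
print(exact_a, exact_b)
```

The naive sums differ only because of operand order, which is the reproducibility problem the abstract refers to; `math.fsum` returns the same correctly rounded result for both orders.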
Feb, 26

ShearLab 3D: Faithful Digital Shearlet Transforms based on Compactly Supported Shearlets

Wavelets and their associated transforms are highly efficient when approximating and analyzing one-dimensional signals. However, multivariate signals such as images or videos typically exhibit curvilinear singularities, which wavelets provably cannot approximate sparsely, nor analyze in the sense of, for instance, detecting their direction. Shearlets are a directional representation system extending the […]
Feb, 25

An Evaluation of the GAMA/StarPU Frameworks for Heterogeneous Platforms: the Progressive Photon Mapping Algorithm

Recent evolution of high performance computing moved towards heterogeneous platforms: multiple devices with different architectures, characteristics and programming models, share application workloads. To aid the programmer to efficiently explore these heterogeneous platforms several frameworks have been under development. These dynamically manage the available computing resources through workload scheduling and data distribution, dealing with the inherent […]
Feb, 25

Fast Feature Selection in a GPU Cluster Using the Delta Test

Feature or variable selection still remains an unsolved problem, due to the infeasible evaluation of all the solution space. Several algorithms based on heuristics have been proposed so far with successful results. However, these algorithms were not designed for considering very large datasets, making their execution impossible, due to the memory and time limitations. This […]
Feb, 25

Precision-Aware Soft Error Protection for GPUs

With the advent of general-purpose GPU computing, it is becoming increasingly desirable to protect GPUs from soft errors. For high computation throughput, GPUs must store a significant amount of state and have many execution units. The high power and area costs of full protection from soft errors make selective protection techniques attractive. Such approaches provide […]
Feb, 25

A GaBP-GPU Algorithm of Solving Large-Scale Sparse Linear Systems

Based on the GaBP (Gaussian Belief Propagation) algorithm, this article presents a GaBP-GPU algorithm for solving large-scale symmetric diagonally dominant sparse linear systems on GPUs. A storage format (MCSC) is presented for use with the GaBP-GPU algorithm. We extract some diagonally dominant matrices from the University of Florida Sparse Matrix Collection as test examples. The experimental results […]

* * *

Free GPU computing nodes at

Registered users can now run their OpenCL applications on these nodes. Each run gets 1 minute of compute time on two nodes: the first has two AMD graphics processing units, the second has one AMD and one nVidia graphics processing unit. There is no limit on the number of runs.

The platforms are:

Node 1
  • GPU device 0: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 1055T @ 2.8GHz
  • RAM: 12GB
  • OS: OpenSUSE 11.4
  • SDK: AMD APP SDK 2.8
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.2
  • SDK: nVidia CUDA Toolkit 5.0.35, AMD APP SDK 2.8

Completed OpenCL projects should be uploaded via the User dashboard (see the instructions and example there); the terminal output logs from compilation and execution will be provided to the user.

Information you send will be treated according to our Privacy Policy.

HGPU group © 2010-2014

All rights belong to the respective authors

Contact us: