Nithish Divakar
Here we present an implementation of Primal-Dual Affine scaling method to solve linear optimization problem on GPU based systems. Strategies to convert the system generated by complementary slackness theorem into a symmetric system are given. A new CUDA friendly technique to solve the resulting symmetric positive definite subsystem is also developed. Various strategies to reduce […]
View View   Download Download (PDF)   
Jose M. Andion, Manuel Arenaz, Francois Bodin, Gabriel Rodriguez, Juan Tourino
The use of GPUs for general purpose computation has increased dramatically in the past years due to the rising demands of computing power and their tremendous computing capacity at low cost. Hence, new programming models have been developed to integrate these accelerators with high-level programming languages, giving place to heterogeneous computing systems. Unfortunately, this heterogeneity […]
View View   Download Download (PDF)   
M. P. Wachowiak, B. B. Sarlo, A. E. Lambe Foster
Much work has recently been reported in parallel GPU-based particle swarm optimization (PSO). Motivated by the encouraging results of these investigations, while also recognizing the limitations of GPU-based methods for big problems using a large amount of data, this paper explores the efficacy of employing other types of parallel hardware for PSO. Most commodity systems […]
View View   Download Download (PDF)   
Vincent Aranega, A. Wendell O. Rodrigues, Anne Etien, Frederic Guyomarch, Jean-Luc Dekeyser
Scientific computation requires more and more performance in its algorithms. New massively parallel architectures suit well to these algorithms. They are known for offering high performance and power efficiency. Unfortunately, as parallel programming for these architectures requires a complex distribution of tasks and data, developers find difficult to implement their applications effectively. Although approaches based […]
View View   Download Download (PDF)   
Charalampos S. Kouzinopoulos, John-Alexander M. Assael, Themistoklis K. Pyrgiotis, Konstantinos G. Margaritis
Multiple matching algorithms are used to locate the occurrences of patterns from a finite pattern set in a large input string. Aho-Corasick and Wu-Manber, two of the most well known algorithms for multiple matching require an increased computing power, particularly in cases where large-size datasets must be processed, as is common in computational biology applications. […]
Pedro Alonso, Manuel F. Dolz, Francisco D. Igual, Rafael Mayo, Enrique S. Quintana-Orti
The road towards Exascale Computing requires a holistic effort to address three different challenges simultaneously: high performance, energy efficiency, and programmability. The use of runtime task schedulers to orchestrate parallel executions with minimal developer intervention has been introduced in recent years to tackle the programmability issue while maintaining, or even improving, performance. In this paper, […]
View View   Download Download (PDF)   
Ivan Devic
In this thesis we explore how application of graphics processors can accelerate calculations in fluid dynamics. We derive semi-implicit pressure linked equations (SIMPLE) and present SIMPLE method (algorithm) which is used with a great success in calculation of steady flows. Motivation for using graphics processors (GPUs) comes from their ability to significantly shorten execution time […]
View View   Download Download (PDF)   
Michael Benguigui, Francoise Baude
This article presents a multi-GPU adaptation of a specific Monte Carlo and classification based method for pricing American basket options, due to Picazo. The first part relates how to combine fine and coarse-grained parallelization to price American basket options. A dynamic strategy of kernel calibration is proposed. Doing so, our implementation on a reasonable size […]
View View   Download Download (PDF)   
Stefan Breuer, Michel Steuwer, Sergei Gorlatch
The implementation of stencil computations on modern, massively parallel systems with GPUs and other accelerators currently relies on manually-tuned coding using low-level approaches like OpenCL and CUDA, which makes it a complex, time-consuming, and error-prone task. We describe how stencil computations can be programmed in our SkelCL approach that combines high level of programming abstraction […]
View View   Download Download (PDF)   
Gary Macindoe
The use of linear algebra routines is fundamental to many areas of computational science, yet their implementation in software still forms the main computational bottleneck in many widely used algorithms. In machine learning and computational statistics, for example, the use of Gaussian distributions is ubiquitous, and routines for calculating the Cholesky decomposition, matrix inverse and […]
View View   Download Download (PDF)   
Xiao Zhang, Wan Guo, Xiao Qin, Xiaonan Zhao
Molecular dynamics (MD) was widely used in chemistry and bio molecules. Numerous attempts have been made to accelerate MD simulations. CUDA enabled NVIDIA Graphics processing units (GPUs) use as a general purpose parallel computer chips as CPU. But it is not easy to port a program to GPU. We present a highly extensible framework for […]
Michael Benguigui, Francoise Baude
This article presents a multi GPU adaptation of a specific Monte Carlo and classification based method for pricing American basket options, due to Picazo [1]. The first part relates how to combine fine and coarse grained parallelization to price American basket options. In order to benefit from different GPU devices, a dynamic strategy of kernel […]
View View   Download Download (PDF)   
Page 1 of 1412345...10...Last »

* * *

* * *

Follow us on Twitter

HGPU group

1666 peoples are following HGPU @twitter

Like us on Facebook

HGPU group

338 people like HGPU on Facebook

* * *

Free GPU computing nodes at hgpu.org

Registered users can now run their OpenCL application at hgpu.org. We provide 1 minute of computer time per each run on two nodes with two AMD and one nVidia graphics processing units, correspondingly. There are no restrictions on the number of starts.

The platforms are

Node 1
  • GPU device 0: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 @ 2.8GHz 1055T
  • RAM: 12GB
  • OS: OpenSUSE 13.1
  • SDK: nVidia CUDA Toolkit 6.5.14, AMD APP SDK 3.0
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.3
  • SDK: AMD APP SDK 3.0

Completed OpenCL project should be uploaded via User dashboard (see instructions and example there), compilation and execution terminal output logs will be provided to the user.

The information send to hgpu.org will be treated according to our Privacy Policy

HGPU group © 2010-2015 hgpu.org

All rights belong to the respective authors

Contact us: