Yan Zhou
Sequential Monte Carlo is a family of algorithms for sampling from a sequence of distributions. Some of these algorithms, such as particle filters, are widely used in physics and signal processing research. More recent developments have established their application in more general inference problems such as Bayesian modeling. These algorithms have attracted considerable attention […]
Abdullah Cavusoglu, Baha Sen, Caner Ozcan, Salih Gorgunoglu
In this study, surfaces of solid objects are coloured with the Cropped Quad-Tree method, using GPU computing optimization. Numerous methods are used in solid object colouring. Across studies carried out in different fields, the quad-tree method stands out in terms of speed and performance. Cropped […]
Da Li, Kittisak Sajjapongse, Huan Truong, Gavin Conant, Michela Becchi
Several problems in computational biology require the all-against-all pairwise comparisons of tens of thousands of individual biological sequences. Each such comparison can be performed with the well-known Needleman-Wunsch alignment algorithm. However, with the rapid growth of biological databases, performing all possible comparisons with this algorithm in serial becomes extremely time-consuming. The massive computational power of […]
Baha Sen, Caner Ozcan, Nesrin Aydin Atasoy
We propose an implementation for quad-tree based solid object coloring using Compute Unified Device Architecture (CUDA). There are numerous different techniques in use for solid object coloring. One commonly used technique is the quad-tree, which has evolved from work in different fields. A quad-tree is a tree data structure in which each internal node has […]
Kostadin Damevski, Madhan Muralimanohar
Considerable recent attention has been given to the problem of porting existing code to heterogeneous computing architectures, such as GPUs. In this paper, we describe a novel, interactive refactoring tool that allows for quick and easy transformation of affine loops to execute on GPUs. Compared to previous approaches, our refactoring approach interactively combines the user’s […]
Michail Emmanouil Pappas
This work presents the parallelisation of a shallow water simulation model. Two parallel implementations are developed. One is for a multi-core NUMA architecture, developed in OpenMP. The other one is for a many-core GPU-accelerated architecture and is developed in OpenCL. The parallelisation process is based on an iterative approach, starting off from a naive implementation. […]
Joel Sanchez Dominguez, Luiz Fernando de Oliveira, Nilton Alves Junior, Joaquim Teixeira de Assis
The paper presents the implementation of a parallel version of the FDK (Feldkamp, Davis and Kress) algorithm using graphics processing units. Some elements of the computed tomography scan and the FDK algorithm are briefly discussed, and some ideas about GPUs (graphics processing units) and their use in general-purpose computing are presented. The paper shows a computational implementation […]
Alexander Breuer, Michael Bader
We present a software package that supports teaching different parallel programming models in a computational science and engineering context. It implements a Finite Volume solver for the shallow water equations, with application to tsunami simulation in mind. The numerical model is kept simple, using patches of Cartesian grids as computational domain, which can be connected […]
Adam Dabrowski, Pawel Pawlowski, Mateusz Stankiewicz, Filip Misiorek
This paper presents the idea of so-called quasi-maximum accuracy computations for improving the precision of floating-point digital signal processing with graphics processing units (GPUs). In the presented approach, increasing the precision of computations does not require increasing the length of the data words. Special attention has been […]
Jonathan Thompson, Kristofer Schlachter
This paper presents an overview of the OpenCL 1.1 standard [Khronos 2012]. We first motivate the need for GPGPU computing and then discuss the various concepts and technological background necessary to understand the programming model. We use concurrent matrix multiplication as a framework for explaining various performance characteristics of compiling and running OpenCL code, and […]
Ya Shu
Optical techniques are a promising technology to realize high frequency ultrasound arrays. High sensitivity and broad bandwidth have been demonstrated with optoacoustic sensors based on thin film etalons. A thin film etalon consists of a transparent layer (e.g. photoresist or parylene) with gold coatings on a glass substrate. One-dimensional (1-D) data acquisition is realized by […]
Jason Spencer
We examine the problem of optimizing classification tree evaluation for on-line and real-time applications by using GPUs. Looking at trees with continuous attributes often used in image segmentation, we first put the existing algorithms for serial and data-parallel evaluation on a solid footing. We then introduce a speculative parallel algorithm designed for single instruction, multiple data […]

* * *


Free GPU computing nodes at hgpu.org

Registered users can now run their OpenCL applications at hgpu.org. We provide 1 minute of compute time per run on two nodes equipped with AMD and nVidia graphics processing units. There is no limit on the number of runs.

The platforms are:

Node 1
  • GPU device 0: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 1055T @ 2.8GHz
  • RAM: 12GB
  • OS: OpenSUSE 13.1
  • SDK: nVidia CUDA Toolkit 6.5.14, AMD APP SDK 3.0
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.3
  • SDK: AMD APP SDK 3.0

A completed OpenCL project should be uploaded via the User dashboard (see the instructions and example there); the terminal output logs from compilation and execution will be provided to the user.

The information sent to hgpu.org will be treated according to our Privacy Policy.

HGPU group © 2010-2015 hgpu.org

All rights belong to the respective authors
