Alexandru Voicu
In the course of less than a decade, Graphics Processing Units (GPUs) have evolved from narrowly scoped application specific accelerators to general-purpose parallel machines capable of accommodating an ever-growing set of algorithms. At the same time, programming GPUs appears to have become trapped around an attractor characterised by ad-hoc practices, non-portable implementations and inexact, uninformative […]
View View   Download Download (PDF)   
Vincent Boulos, Vincent Fristot, Dominique Houzet, Luc Salvo, Pierre Lhuissier
In this article, we present an optimized GPU implementation of a granulometry algorithm which is used a lot in the study of material domain. The main contribution to this algorithm is the binarization of the input data which increases throughput while reducing data allocated memory space. Also, the optimized GPU implementation brings an order of […]
View View   Download Download (PDF)   
Floris De Smedt, Lars Stuyf, Sander Beckers, Joost Vennekens, Gorik De Samblanx, Toon Goedeme
In this paper we present out experiences with the implementation of an object detector using OpenCL. With this implementation we fullfil the need for fast and robust object detection, necessary in many applications in multiple domains (surveillance, traffic, image retrieval, …). The algorithm lends itself to be implemented in a parallel way. We exploit this […]
View View   Download Download (PDF)   
Rudi Giot, Abilio Rodrigues e Sousa
Image Processing Production lines featuring industrial vision are becoming more and more widespread. That kind of automation needs systems able to capture pictures, analyze and learn from them in order to take appropriate action. These processes are often heavy and applied to high-definition images with important frame rate. Powerful calculators are thus needed to follow […]
View View   Download Download (PDF)   
Wenjing Ma
Modern accelerators and multi-core architectures offer significant computing power at a very modest cost. With this trend, an important research issue at the software end is how to make the best use of these computing devices, and how to enable high performance without the users having to put too much effort into learning the architecture […]
View View   Download Download (PDF)   
Snaider Carrillo, Jakob Siegel, Xiaoming Li
Control statements in a GPU program such as loops and branches pose serious challenges for the efficient usage of GPU resources because those control statements will lead to the serialization of threads and consequently ruin the occupancy of GPU, that is, the number of threads running concurrently. Unlike traditional vector processing units that are inside […]
Shinichi Yamagiwa, Koichi Wada
In the last years, the performance and capabilities of Graphics Processing Units (GPUs) improved drastically, mostly due to the demands of the entertainment market, with consumers and companies alike pushing for improvements in the level of visual fidelity, which is only achieved with high performing GPU solutions. Beside the entertainment market, there is an ongoing […]
Victor W. Lee,Changkyu Kim,Jatin Chhugani,Michael Deisher,Daehyun Kim,Anthony D. Nguyen,Nadathur Satish,Mikhail Smelyanskiy,Srinivas Chennupaty,Per Hammarlund,Ronak Singhal,Pradeep Dubey
Recent advances in computing have led to an explosion in the amount of data being generated. Processing the ever-growing data in a timely manner has made throughput computing an important aspect for emerging applications. Our analysis of a set of important throughput computing kernels shows that there is an ample amount of parallelism in these […]
View View   Download Download (PDF)   

* * *

* * *

Follow us on Twitter

HGPU group

1658 peoples are following HGPU @twitter

Like us on Facebook

HGPU group

335 people like HGPU on Facebook

* * *

Free GPU computing nodes at hgpu.org

Registered users can now run their OpenCL application at hgpu.org. We provide 1 minute of computer time per each run on two nodes with two AMD and one nVidia graphics processing units, correspondingly. There are no restrictions on the number of starts.

The platforms are

Node 1
  • GPU device 0: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 @ 2.8GHz 1055T
  • RAM: 12GB
  • OS: OpenSUSE 13.1
  • SDK: nVidia CUDA Toolkit 6.5.14, AMD APP SDK 3.0
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.3
  • SDK: AMD APP SDK 3.0

Completed OpenCL project should be uploaded via User dashboard (see instructions and example there), compilation and execution terminal output logs will be provided to the user.

The information send to hgpu.org will be treated according to our Privacy Policy

HGPU group © 2010-2015 hgpu.org

All rights belong to the respective authors

Contact us: