Yanping Huang, Sai Zhang
Deep learning methods have shown great promise in many practical applications, ranging from speech recognition and visual object recognition to text processing. However, most current deep learning methods suffer from scalability problems in large-scale applications, forcing researchers and users to focus on small-scale problems with fewer parameters. In this paper, we consider a well-known […]
Tom Runia
In this thesis we design, implement and study a high-speed object detection framework. Our baseline detector uses integral channel features as the object representation and AdaBoost as the supervised learning algorithm. We suggest the implementation of two approximation techniques for speeding up the baseline detector and show their effectiveness by performing experiments on both detection quality and […]
Andre Viebke
Deep learning, a sub-topic of machine learning inspired by biology, has recently attracted wide attention in industry and the research community. State-of-the-art applications in computer vision and speech recognition (among others) are built using deep learning algorithms. In contrast to traditional algorithms, where the developer fully instructs the application what to do, […]
Markus Schlafli
GPU architectures are becoming increasingly important due to their high number of processors. The single instruction, multiple data architecture has proven to work not just for the graphics domain, but also for many other disciplines. This is due to the potential performance that can be achieved by a consumer-level GPU being significantly higher than the […]
Saeed Taheri, Apan Qasem, Martin Burtscher
Future computing systems, from handhelds to supercomputers, will undoubtedly be more parallel and heterogeneous than today’s systems to provide more performance and energy efficiency. Thus, GPUs are increasingly being used to accelerate general-purpose applications, including applications with data-dependent, irregular control flow and memory access patterns. However, the growing complexity, exposed memory hierarchy, incoherence, heterogeneity, and […]
John Clemens
Recent research has repeatedly shown that machine learning techniques can be applied to either whole files or file fragments to classify them for analysis. We build upon these techniques to show that for samples of un-labeled compiled computer object code, one can apply the same type of analysis to classify important aspects of the code, […]
Carlos Alberto Martinez-Angeles, Ines Dutra, Vitor Santos Costa, Jorge Buenabad-Chavez
Graphics Processing Units (GPUs) are being widely used to improve performance of machine learning and logic programming systems. Next, we propose using this technique to improve the performance of Markov logic programs. In this paper we focus on the first step of the inference phase, the grounding of first-order logical formulas composing a Markov network. […]
Gavin Davidson
The self organising map is a machine learning algorithm used to produce low dimensional representations of high dimensional data. While the process is becoming more and more useful with the rise of big data, it is hindered by the sheer amount of time the algorithm takes to run serially. This project produces a parallel version […]
Thomas Chun Pong Chau
This thesis addresses the problem of designing real-time reconfigurable systems. Our first contribution of this thesis is to propose novel data structures and memory architectures for accelerating real-time proximity queries, with potential application to robotic surgery. We optimise performance while maintaining accuracy by several techniques including mixed precision, function transformation and streaming data flow. Significant […]
Clemens-Alexander Brust, Sven Sickert, Marcel Simon, Erik Rodner, Joachim Denzler
In this paper, we present convolutional patch networks, which are convolutional (neural) networks (CNN) learned to distinguish different image patches and which can be used for pixel-wise labeling. We show how to easily learn spatial priors for certain categories jointly with their appearance. Experiments for urban scene understanding demonstrate state-of-the-art results on the LabelMeFacade dataset. […]
Baoguang Shi, Xiang Bai, Cong Yao
Image-based sequence recognition has been a long-standing research topic in computer vision. In this paper, we investigate the problem of scene text recognition, which is among the most important and challenging tasks in image-based sequence recognition. A novel neural network architecture, which integrates feature extraction, sequence modeling and transcription into a unified framework, is proposed. […]
Keiron O'Shea
Greedy Restricted Boltzmann Machines yield a fairly low 0.72% error rate on the famous MNIST database of handwritten digits. All that was required to achieve this result was a high number of hidden layers consisting of many neurons, and a graphics card to greatly speed up the rate of learning.
* * *

Free GPU computing nodes at hgpu.org

Registered users can now run their OpenCL applications at hgpu.org. We provide 1 minute of compute time per run on two nodes equipped with AMD and nVidia graphics processing units. There are no restrictions on the number of runs.

The platforms are:

Node 1
  • GPU device 0: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 1055T @ 2.8GHz
  • RAM: 12GB
  • OS: OpenSUSE 13.1
  • SDK: nVidia CUDA Toolkit 6.5.14, AMD APP SDK 3.0
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.3
  • SDK: AMD APP SDK 3.0

A completed OpenCL project should be uploaded via the User dashboard (see the instructions and example there). The terminal output logs from compilation and execution will be provided to the user.

Information sent to hgpu.org will be treated according to our Privacy Policy.

HGPU group © 2010-2015 hgpu.org

All rights belong to the respective authors
