14493
James Clarkson, Christos Kotselidis, Gavin Brown, Mikel Lujan
Heterogeneous programming has started becoming the norm in order to achieve better performance by running portions of code on the most appropriate hardware resource. Currently, significant engineering efforts are undertaken in order to enable existing programming languages to perform heterogeneous execution mainly on GPUs. In this paper we describe Jacc, an experimental framework which allows […]
View View   Download Download (PDF)   
Juan Jose Fumero, Toomas Remmelg, Michel Steuwer, Christophe Dubach
GPUs (Graphics Processing Unit) and other accelerators are nowadays commonly found in desktop machines, mobile devices and even data centres. While these highly parallel processors offer high raw performance, they also dramatically increase program complexity, requiring extra effort from programmers. This results in difficult-to-maintain and non-portable code due to the low-level nature of the languages […]
View View   Download Download (PDF)   
Spencer Davis, Brandon Jones, Hai Jiang
The recent rise in the popularity of mobile computing has brought the attention of mobile security to the forefront. As users depend more on tablets and smartphones, sensitive data is left to be secured using devices with vastly weaker resources than a typical computer. As mobile technology matures, the industry is starting to provide devices […]
View View   Download Download (PDF)   
Artur Malinowski
Parallel algorithms are popular method of increasing system performance. Apart from showing their properties using asymptotic analysis, proof-of-concept implementation and practical experiments are often required. In order to speed up the development and provide simple and easily accessible testing environment that enables execution of reliable experiments, the paper proposes a platform with multi-core computational accelerator: […]
View View   Download Download (PDF)   
Oren Segal, Philip Colangelo, Nasibeh Nasiri, Zhuo Qian, Martin Margala
We introduce SparkCL, an open source unified programming framework based on Java, OpenCL and the Apache Spark framework. The motivation behind this work is to bring unconventional compute cores such as FPGAs/GPUs/APUs/DSPs and future core types into mainstream programming use. The framework allows equal treatment of different computing devices under the Spark framework and introduces […]
Ken Miura, Tatsuya Harada
Deep learning can achieve outstanding results in various fields. However, it requires so significant computational power that graphics processing units (GPUs) and/or numerous computers are often required for the practical application. We have developed a new distributed calculation framework called "Sashimi" that allows any computer to be used as a distribution node only by accessing […]
Ken Miura, Tetsuaki Mano, Atsushi Kanehira, Yuichiro Tsuchiya, Tatsuya Harada
MILJS is a collection of state-of-the-art, platform-independent, scalable, fast JavaScript libraries for matrix calculation and machine learning. Our core library offering a matrix calculation is called Sushi, which exhibits far better performance than any other leading machine learning libraries written in JavaScript. Especially, our matrix multiplication is 177 times faster than the fastest JavaScript benchmark. […]
Xiangyu Li
MapReduce is a programming model capable of processing massive data in parallel across hundreds of computing nodes in a cluster. It hides many of the complicated details of parallel computing and provides a straightforward interface for programmers to adapt their algorithms to improve productivity. Many MapReduce-based applications have utilized the power of this model, including […]
View View   Download Download (PDF)   
Jelena Tekic, Predrag Tekic, Milos Rackovic
This paper presents performance comparison, of the lid-driven cavity flow simulation, with Lattice Boltzmann method, example, between CUDA and OpenCL parallel programming frameworks. CUDA is parallel programming model developed by NVIDIA for leveraging computing capabilities of their products. OpenCL is an open, royalty free, standard developed by Khronos group for parallel programming of heterogeneous devices […]
View View   Download Download (PDF)   
Edward Meeds, Remco Hendriks, Said al Faraby, Magiel Bruntink, Max Welling
With few exceptions, the field of Machine Learning (ML) research has largely ignored the browser as a computational engine. Beyond an educational resource for ML, the browser has vast potential to not only improve the state-of-the-art in ML research, but also, inexpensively and on a massive scale, to bring sophisticated ML learning and prediction to […]
Jie Zhu, Hai Jiang, Juanjuan Li, Erikson Hardesty, Kuan-Ching Li, Zhongwen Li
As the size of high performance applications increases, four major challenges including heterogeneity, programmability, fault resilience, and energy efficiency have arisen in the underlying distributed systems. To tackle with all of them without sacrificing performance, traditional approaches in resource utilization, task scheduling and programming paradigm should be reconsidered. While Hadoop has handled data-intensive applications well […]
View View   Download Download (PDF)   
Aamir Shafi, Aleem Akhtar, Ansar Javed, Bryan Carpenter
This paper presents an overview of the "Applied Parallel Computing" course taught to final year Software Engineering undergraduate students in Spring 2014 at NUST, Pakistan. The main objective of the course was to introduce practical parallel programming tools and techniques for shared and distributed memory concurrent systems. A unique aspect of the course was that […]
View View   Download Download (PDF)   
Page 1 of 712345...Last »

* * *

* * *

Follow us on Twitter

HGPU group

1543 peoples are following HGPU @twitter

Like us on Facebook

HGPU group

274 people like HGPU on Facebook

* * *

Free GPU computing nodes at hgpu.org

Registered users can now run their OpenCL application at hgpu.org. We provide 1 minute of computer time per each run on two nodes with two AMD and one nVidia graphics processing units, correspondingly. There are no restrictions on the number of starts.

The platforms are

Node 1
  • GPU device 0: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 @ 2.8GHz 1055T
  • RAM: 12GB
  • OS: OpenSUSE 13.1
  • SDK: nVidia CUDA Toolkit 6.5.14, AMD APP SDK 3.0
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.3
  • SDK: AMD APP SDK 3.0

Completed OpenCL project should be uploaded via User dashboard (see instructions and example there), compilation and execution terminal output logs will be provided to the user.

The information send to hgpu.org will be treated according to our Privacy Policy

HGPU group © 2010-2015 hgpu.org

All rights belong to the respective authors

Contact us: