Machine Learning from Streaming Data in Heterogeneous Computing Environments

Yusuf Guven Toprakkiran

Technische Universitat Berlin

Technische Universitat Berlin, 2016

@article{toprakkiran2016machine,

title={Machine Learning from Streaming Data in Heterogeneous Computing Environments},

author={Toprakkiran, Yusuf Guven},

year={2016}

}

With the advent of many-core general-purpose processors (CPUs), using more cores has provided a certain speedup for algorithms that can be parallelized. Today, distributed and parallel data processing platforms such as Apache Flink inherently make use of parallel computing. Graphics processors (GPUs), on the other hand, offer high-performance solutions for certain problems thanks to an architecture suited to massively data-parallel computations. In the last decade, GPU computing has also become popular for general-purpose applications. Although there are drawbacks such as memory transfer latency, GPUs have been shown to provide substantial speedup, especially on computationally intensive problems. There are also heterogeneous computing platforms such as OpenCL, which enable developers to write portable programs that execute in parallel on a range of processors such as CPUs and GPUs, while providing abstractions that simplify parallel programming across different computing devices. Streaming k-means is an unsupervised online learning algorithm adapted from the batch k-means algorithm, which remains one of the most commonly used algorithms due to its simplicity, efficiency and empirical success. In this thesis, we first implement a sliding-window-based streaming k-means algorithm in OpenCL and Apache Flink, and give an overview of the impact of the window size, the tuple size, the number of clusters and the window slide size on system throughput on two CPUs and three GPUs. Our OpenCL application achieves higher throughput than Flink. We also show that GPUs still produce higher throughput than many-core CPUs. However, the performance gap between OpenCL applications whose computationally intensive step runs on CPUs and those where it runs on GPUs is smaller on modern architectures. Furthermore, modern many-core CPUs can occasionally perform competitively with GPUs, in particular when our streaming k-means algorithm is used.
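
To make the four parameters named in the abstract concrete, here is a minimal Python sketch of a sliding-window streaming k-means. It is not the thesis's OpenCL or Flink code. The function names, the naive seeding from the first k tuples, the fixed number of Lloyd iterations per window, and the reading of "tuple size" as the number of values per tuple are all assumptions made for illustration.

```python
from collections import deque


def nearest(point, centroids):
    """Return the index of the centroid closest to point (squared Euclidean)."""
    best, best_dist = 0, float("inf")
    for i, c in enumerate(centroids):
        dist = sum((p - q) ** 2 for p, q in zip(point, c))
        if dist < best_dist:
            best, best_dist = i, dist
    return best


def lloyd(points, centroids, iterations=5):
    """Run a few batch k-means (Lloyd) iterations over one window."""
    for _ in range(iterations):
        sums = [[0.0] * len(c) for c in centroids]
        counts = [0] * len(centroids)
        for p in points:  # assignment step: independent per tuple
            j = nearest(p, centroids)
            counts[j] += 1
            for d, value in enumerate(p):
                sums[j][d] += value
        # Update step: keep the old centroid if a cluster received no tuples.
        centroids = [
            [s / counts[j] for s in sums[j]] if counts[j] else list(centroids[j])
            for j in range(len(centroids))
        ]
    return centroids


def sliding_window_kmeans(stream, k, window_size, slide):
    """Yield k centroids each time the window is full and has slid.

    Assumes k <= window_size. Centroids from the previous window are
    reused as the starting point for the next one (an assumption).
    """
    window = deque(maxlen=window_size)  # oldest tuples fall out automatically
    centroids = None
    since_last = 0
    for item in stream:
        window.append(item)
        since_last += 1
        if len(window) < window_size or since_last < slide:
            continue
        since_last = 0
        if centroids is None:
            centroids = [list(p) for p in list(window)[:k]]  # naive seeding
        centroids = lloyd(window, centroids)
        yield centroids
```

For example, `list(sliding_window_kmeans(stream, k=2, window_size=8, slide=4))` recomputes the centroids over the latest 8 tuples after every 4 new arrivals. The assignment loop in `lloyd` treats each tuple independently, which is the kind of data-parallel step the abstract describes as suitable for GPUs.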

Tags: Algorithms, AMD Radeon R9 Fury, ATI, Computer science, Heterogeneous systems, Machine learning, nVidia, nVidia GeForce GTX 980, OpenCL, Thesis

April 11, 2017 by hgpu
