HadoopCL2: Motivating the design of a distributed, heterogeneous programming system with machine-learning applications
Rice University, Department of Computer Science, 6100 Main St, Houston, TX, USA
IEEE Transactions on Parallel and Distributed Systems, 2016
@article{grossman2016hadoopcl2,
title={{HadoopCL2}: Motivating the design of a distributed, heterogeneous programming system with machine-learning applications},
author={Grossman, Max and Breternitz, Mauricio and Sarkar, Vivek},
journal={IEEE Transactions on Parallel and Distributed Systems},
volume={27},
number={3},
pages={762--775},
year={2016},
publisher={IEEE}
}
Machine learning (ML) algorithms have garnered increased interest as they demonstrate an improved ability to extract meaningful trends from large, diverse, and noisy data sets. While research continues to advance the state of the art in ML algorithms, it remains difficult to drastically improve their real-world performance. Porting new and existing algorithms from single-node systems to multi-node clusters, or from architecturally homogeneous systems to heterogeneous systems, is a promising optimization approach. However, performing optimized ports is challenging for domain experts who may lack experience in distributed and heterogeneous software development. This work explores how the challenges of ML application development on heterogeneous, distributed systems shaped the design of the HadoopCL2 (HCL2) programming system. ML applications guide this work because they exhibit features that make application development difficult: large and diverse datasets, complex algorithms, and the need for domain-specific knowledge. The goal of this work is a general MapReduce programming system that outperforms existing programming systems. This work evaluates the performance and portability of HCL2 against five ML applications from the Mahout ML framework on two hardware platforms. HCL2 demonstrates speedups of greater than 20x relative to Mahout for three computationally heavy algorithms and maintains minor performance improvements for two I/O-bound algorithms.
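For readers unfamiliar with the programming model HCL2 builds on, the sketch below illustrates the map/reduce decomposition in plain Java. This is a hypothetical, dependency-free word-count example of the MapReduce pattern in general, not HCL2's actual API (the paper's system extends Hadoop with OpenCL code generation for heterogeneous devices); the class and method names here are illustrative only.

```java
import java.util.*;
import java.util.stream.*;

// Conceptual sketch of the MapReduce pattern: a map phase emits
// key/value pairs, and a reduce phase combines values per key.
// Systems like Hadoop (and, per the paper, HCL2) distribute these
// two phases across cluster nodes and, in HCL2's case, accelerators.
public class MapReduceSketch {

    // "Map" phase: emit a (word, 1) pair for each word in a line.
    static Stream<Map.Entry<String, Integer>> map(String line) {
        return Arrays.stream(line.toLowerCase().split("\\s+"))
                     .filter(w -> !w.isEmpty())
                     .map(w -> Map.entry(w, 1));
    }

    // "Reduce" phase: sum all emitted counts for each distinct word.
    static Map<String, Integer> reduce(Stream<Map.Entry<String, Integer>> pairs) {
        return pairs.collect(Collectors.toMap(Map.Entry::getKey,
                                              Map.Entry::getValue,
                                              Integer::sum));
    }

    public static void main(String[] args) {
        List<String> input = List.of("k-means assigns points",
                                     "k-means updates centroids");
        Map<String, Integer> counts =
            reduce(input.stream().flatMap(MapReduceSketch::map));
        System.out.println(counts.get("k-means")); // prints 2
    }
}
```

The map phase is embarrassingly parallel, which is what makes computationally heavy ML kernels (such as the three algorithms where the paper reports &gt;20x speedups) good candidates for GPU offload, while I/O-bound jobs gain little.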
November 13, 2016 by hgpu