Parallel tree-ensemble algorithms for GPUs using CUDA
School of Business and IT, University of Borås, 501 90, Borås, Sweden
Sixth Swedish Workshop on Multicore Computing (MCC13), 2013
@article{jansson2013parallel,
title={Parallel tree-ensemble algorithms for GPUs using CUDA},
author={Jansson, Karl and Sundell, H{\aa}kan and Bostr{\"o}m, Henrik},
year={2013}
}
We present two new parallel implementations of the tree-ensemble algorithms Random Forest (RF) and Extremely randomized trees (ERT) for emerging many-core platforms, e.g., contemporary graphics cards suitable for general-purpose computing (GPGPU). Random Forest and Extremely randomized trees are ensemble learners for classification and regression. They operate by constructing a multitude of decision trees at training time and outputting a prediction by combining the outputs of the individual trees. Thanks to the inherent parallelism of the task, an obvious platform for its computation is a contemporary GPU with a large number of processing cores. Previous parallel algorithms for Random Forests in the literature are designed either for traditional multi-core CPU platforms or for early GPUs with simpler hardware architectures and relatively few cores. The new parallel algorithms are designed for contemporary GPUs with a large number of cores and take into account aspects of the newer hardware architectures such as memory hierarchy and thread scheduling. They are implemented in C/C++ with the CUDA interface for the best possible performance on NVIDIA-based GPUs. An experimental study comparing them with the most important previous solutions for CPU and GPU platforms shows significant improvement for the new implementations, often by several orders of magnitude.
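To make the ensemble prediction step described in the abstract concrete, the following is a minimal sequential C++ sketch of how the outputs of individual trees might be aggregated: a majority vote for classification and a mean for regression. It is not the authors' CUDA implementation. The function names `majority_vote` and `mean_prediction`, the integer class-label encoding, and the tie-breaking rule are assumptions made purely for illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not from the paper): majority vote over per-tree
// class predictions. votes[i] is the label predicted by tree i and is
// assumed to lie in the range 0 .. num_classes-1.
int majority_vote(const std::vector<int>& votes, int num_classes) {
    assert(num_classes > 0);
    std::vector<int> counts(num_classes, 0);
    for (int v : votes) {
        // Ignore labels outside the assumed range.
        if (v >= 0 && v < num_classes) {
            ++counts[v];
        }
    }
    int best = 0;
    for (int c = 1; c < num_classes; ++c) {
        // Strict comparison: ties are resolved in favor of the lower label.
        if (counts[c] > counts[best]) {
            best = c;
        }
    }
    return best;
}

// Hypothetical helper (not from the paper): for regression, the ensemble
// output is taken here to be the mean of the per-tree outputs.
double mean_prediction(const std::vector<double>& outputs) {
    if (outputs.empty()) {
        return 0.0;  // Assumed convention for an empty ensemble.
    }
    double sum = 0.0;
    for (double y : outputs) {
        sum += y;
    }
    return sum / static_cast<double>(outputs.size());
}
```

For example, with five trees voting `{0, 1, 1, 2, 1}` over three classes, `majority_vote` returns class 1. Each tree's prediction is independent of the others, which is the kind of inherent parallelism the abstract refers to. How the work is actually mapped onto GPU threads is not specified in the abstract and is not shown in this sketch.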
December 6, 2013 by hgpu