17334

GPU-acceleration for Large-scale Tree Boosting

Huan Zhang, Si Si, Cho-Jui Hsieh
Dept. of Electrical and Computer Engineering, University of California, Davis
arXiv:1706.08359 [stat.ML], (26 Jun 2017)

@article{zhang2017gpuacceleration,

   title={GPU-acceleration for Large-scale Tree Boosting},

   author={Zhang, Huan and Si, Si and Hsieh, Cho-Jui},

   year={2017},

   month={jun},

   archivePrefix={"arXiv"},

   primaryClass={stat.ML}

}

In this paper, we present a novel massively parallel algorithm for accelerating the decision tree building procedure on GPUs (Graphics Processing Units), which is a crucial step in Gradient Boosted Decision Tree (GBDT) and random forests training. Previous GPU based tree building algorithms are based on parallel multi-scan or radix sort to find the exact tree split, and thus suffer from scalability and performance issues. We show that using a histogram based algorithm to approximately find the best split is more efficient and scalable on GPU. By identifying the difference between classical GPU-based image histogram construction and the feature histogram construction in decision tree training, we develop a fast feature histogram building kernel on GPU with carefully designed computational and memory access sequence to reduce atomic update conflict and maximize GPU utilization. Our algorithm can be used as a drop-in replacement for histogram construction in popular tree boosting systems to improve their scalability. As an example, to train GBDT on epsilon dataset, our method using a main-stream GPU is 7-8 times faster than histogram based algorithm on CPU in LightGBM and 25 times faster than the exact-split finding algorithm in XGBoost on a dual-socket 28-core Xeon server, while achieving similar prediction accuracy.
No votes yet.
Please wait...

* * *

* * *

HGPU group © 2010-2017 hgpu.org

All rights belong to the respective authors

Contact us: